Linux, OS, SysAdmin and Cloud Computing
Scooped by Flavio Barros

Dockerizing a Shiny App - Flavio Barros

After a long pause of more than four months, I am finally back to posting here. Unfortunately, many commitments kept me from posting, but on coming back I changed the deployment (this blog now runs entirely within a Docker container, along with some other cool things I intend to write about later) and wrote this post. 1. …
Scooped by Flavio Barros

Announcing Snappy Ubuntu | Cloud | Ubuntu

This page is dedicated to explaining project snappy, which is a new way of packaging and distributing applications in the cloud.
Flavio Barros's insight:

Ubuntu has released a minimalist distribution containing only the core server. It is meant to provide a base for building software containers, such as Docker's.

Scooped by Flavio Barros

Mining Massive Datasets

Mining Massive Datasets is a free online class taught by Jure Leskovec, Anand Rajaraman and Jeff Ullman of Stanford University
Flavio Barros's insight:

A must-take course for anyone interested in Big Data. It is a well-known course offered at Stanford, now in a MOOC version. It starts on September 29. I am already enrolled ;-)

Scooped by Flavio Barros

We're living in a post-open source world

Increasingly, software isn't sold, it's used to power services offered over the Internet. So why contend with the complexities of open source licensing?
Scooped by Flavio Barros

Google apoia o Docker, a próxima grandiosidade em Cloud

Google is investing in an open source technology that is already one of the hottest new things in cloud computing. Google has been investing in …
Flavio Barros's insight:

Nothing to add!

Scooped by Flavio Barros

Synergy could be the coolest Open Source App I have used in a long time | Tech Drive-in

Flavio Barros's insight:

Synergy is an app that lets you share a keyboard and mouse over the network. It is cross-platform and can be useful in many situations.

Scooped by Flavio Barros

Canonical launches Ubuntu-powered cloud cluster in a box | ZDNet

The Orange Box cluster is designed to encourage companies to experiment with Canonical's Ubuntu OS and orchestration tools when building cloud and distributed compute clusters.
Flavio Barros's insight:

Canonical launches a "mini cloud". The box is a set of servers that Canonical has been using in training sessions. However, given the strong demand for the product, the box may end up being sold commercially.

Scooped by Flavio Barros

What is Docker and When To Use It | CenturyLink Labs

Helping Developers do Ops Easier
Flavio Barros's insight:

If you are unsure what Docker is and how it can help you, this article is for you.

Rescooped by Flavio Barros from Big Data, Statistics and Machine Learning

LinuxFoundationX: LFS101x : Introduction to Linux | edX

Flavio Barros's curator insight, March 9, 2014 10:21 PM

For anyone who has not yet had the chance to use Linux, this course is a great opportunity. It is being offered by the Linux Foundation on the edX MOOC platform. The course used to be a paid one, at $2400, directly from the Linux Foundation. Now it is free!

Rescooped by Flavio Barros from Big Data, Statistics and Machine Learning

Python Scripts as a Replacement for Bash Utility Scripts | Linux Journal

For Linux users, the command line is a celebrated part of our entire experience. Unlike other popular operating systems, where the command line is a scary proposition for all but the most experienced veterans, in the Linux community, command-line use is encouraged.
Flavio Barros's curator insight, May 3, 2014 12:28 PM

In Big Data infrastructure, Linux plays a fundamental role. Whether on the nodes or as the base of a complete infrastructure with OpenStack (https://www.openstack.org/), some administrative tasks always need to be automated. Bash scripts can do the job, but as a Linux user for more than 10 years I can say that Bash scripts are tedious to work with. The syntax is not friendly, and once they grow large it is hard to understand later how they work. That is where Python comes in: it works very well as a replacement for Bash for automating tasks. This post is from 2013, but it is one of the best I have found on using Python for administrative tasks.
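
As a rough illustration of the kind of task this note has in mind, here is a minimal Python sketch of a small admin utility: listing the largest files under a directory, something often done in Bash with a find/sort/head pipeline. The function name `largest_files` and its parameters are made up for this example and are not taken from the Linux Journal article.

```python
#!/usr/bin/env python3
"""List the largest regular files under a directory."""
import sys
from pathlib import Path


def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs, biggest first."""
    sizes = []
    for path in Path(root).rglob("*"):
        try:
            # Only count regular files; symlinks are skipped to avoid double counting.
            if path.is_file() and not path.is_symlink():
                sizes.append((path.stat().st_size, str(path)))
        except OSError:
            # Unreadable or vanished entries are skipped instead of aborting the run.
            continue
    sizes.sort(reverse=True)
    return sizes[:limit]


# Run the report only when a directory is given on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    for size, name in largest_files(sys.argv[1]):
        print(f"{size:>12}  {name}")
```

Saved under a hypothetical name such as `largest.py`, it could be called as `python3 largest.py /var/log`. Compared with the shell pipeline, the error handling and the sort order are explicit, which tends to matter once a script grows.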

Scooped by Flavio Barros

WordPress do Digital Ocean - Flavio Barros

IMPORTANT update: for anyone who wants to try DIGITAL OCEAN, I have UPDATED all the links for the "referral" program …
Scooped by Flavio Barros

[Phoronix] Microsoft To Open-Source .NET, Bring It Officially To Linux

Phoronix is the leading technology website for Linux hardware reviews, open-source news, Linux benchmarks, open-source benchmarks, distribution screenshots, interviews, and computer hardware tests.
Flavio Barros's insight:

This news is very interesting: .NET is now open source and cross-platform! Linux support is now official.

Scooped by Flavio Barros

Things to try after useR! – Part 1: Deep Learning with H2O

Annual R User Conference 2014
The useR! 2014 conference was a mind-blowing experience. Hundreds of R enthusiasts and the beautiful UCLA campus, I am really glad that I had the chance to attend! The only problem is that, after a few days of non-stop R talks, I was (and still am) completely overwhelmed with the new cool packages and ideas.
Let me start with H2O - one of the three promising projects that John Chambers highlighted during his keynote (the other two were Rcpp/Rcpp11 and RLLVM/RLLVMCompile).

What's H2O?
"The Open Source In-Memory, Prediction Engine for Big Data Science" - that's what 0xdata, the creator of H2O, said. Joseph Rickert's blog post is a very good introduction to H2O so please read that if you want to find out more. I am going straight into the deep learning part.

Deep Learning in R
Deep learning tools in R are still relatively rare at the moment when compared to other popular algorithms like Random Forest and Support Vector Machines. A nice article about deep learning can be found here. Before the discovery of H2O, my deep learning coding experience was mostly in Matlab with the DeepLearnToolbox. Recently, I have started using 'deepnet', 'darch' as well as my own code for deep learning in R. I have even started developing a new package called 'deepr' to further streamline the procedures. Now that I have discovered the package 'h2o', I may well shift the design focus of 'deepr' to further integration with H2O instead!
But first, let's play with the 'h2o' package and get familiar with it.

The H2O Experiment
The main purpose of this experiment is to get myself familiar with the 'h2o' package. There are quite a few machine learning algorithms that come with H2O (such as Random Forest and GBM). But I am only interested in the Deep Learning part and the H2O cluster configuration right now. So the following experiment was set up to investigate:
1. How to set up and connect to a local H2O cluster from R.
2. How to train a deep neural network model.
3. How to use the model for predictions.
4. Out-of-bag performance of non-regularized and regularized models.
5. How memory usage varies over time.

Experiment 1: 
For the first experiment, I used the Wisconsin Breast Cancer Database. It is a very small dataset (699 samples of 10 features and 1 label) so that I could carry out multiple runs to see the variation in prediction performance. The main purpose is to investigate the impact of model regularization by tuning the 'Dropout' parameter in the h2o.deeplearning(...) function (or basically the objectives 1 to 4 mentioned above).
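
To make the idea of dropout regularization concrete, here is a minimal plain-Python sketch of the common "inverted dropout" formulation. It is only an illustration: the function `dropout`, its arguments and the made-up activations are assumptions for this example, and it makes no claim about how h2o.deeplearning(...) handles its 'Dropout' setting internally.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout: zero each value with probability `rate` and scale
    the survivors by 1 / (1 - rate) so the expected sum stays the same."""
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


rng = random.Random(42)            # fixed seed so the run is repeatable
hidden = [1.0] * 10                # made-up activations of one hidden layer
print(dropout(hidden, 0.5, rng))   # each entry is either 0.0 or 2.0
print(dropout(hidden, 0.0, rng))   # rate 0 leaves the layer untouched
```

Randomly silencing units during training keeps the network from leaning too heavily on any single unit, which is the overfitting effect the experiment looks at.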


Experiment 2: 
The next thing to investigate is the memory usage (objective 5). For this purpose, I chose a bigger (but still small by today's standards) dataset, the MNIST Handwritten Digits Database (LeCun et al.). I would like to find out if the memory usage can be capped at a defined allowance over a long period of model training.
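
One simple way to watch memory usage over time on a Linux machine is to sample /proc/meminfo at fixed intervals. The sketch below is an assumption about how such a measurement could be done, not the method used in the original experiment; `used_memory_mb` and `sample_memory` are hypothetical helper names.

```python
import time


def used_memory_mb(meminfo_text):
    """Return used memory in MB from the text of /proc/meminfo.

    Used = MemTotal - MemAvailable, falling back to MemFree on older
    kernels that do not report MemAvailable. The file lists values in kB.
    """
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[key.strip()] = int(parts[0])
    free = fields.get("MemAvailable", fields.get("MemFree", 0))
    return (fields["MemTotal"] - free) / 1024.0


def sample_memory(samples, interval_s, path="/proc/meminfo"):
    """Read the file `samples` times, `interval_s` seconds apart."""
    readings = []
    for i in range(samples):
        with open(path) as fh:
            readings.append(used_memory_mb(fh.read()))
        if i < samples - 1:
            time.sleep(interval_s)
    return readings
```

Calling `sample_memory(60, 1.0)` while a model trains would give one reading per second for a minute. With the MemFree fallback, page cache counts as used memory, so those readings will look higher than what a system monitor reports.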

Findings
OK, enough for the background and experiment setup. Instead of writing this blog post like a boring lab report, let's go through what I have found out so far. (If you want to find out more, all code is available here so you can modify it and try it out on your clusters.)

Setting Up and Connecting to a H2O Cluster
Smoooooth! - if I have to explain it in one word. 0xdata made this really easy for R users. Below is the code to start a local cluster with 1GB or 2GB memory allowance. However, if you want to start the local cluster from terminal (which is also useful if you want to see the messages during model training), you can do this: java -Xmx1g -jar h2o.jar (see the original H2O documentation here).
By default, H2O starts a cluster using all available threads (8 in my case). The h2o.init(...) function has no argument for limiting the number of threads yet (well, sometimes you do want to leave one thread idle for other important tasks like Facebook). But it is not really a problem.

Loading Data

In order to train models with the H2O engine, I need to link the datasets to the H2O cluster first. There are many ways to do it. In this case, I linked a data frame (Breast Cancer) and imported CSVs (MNIST) using the following code.


Training a Deep Neural Network Model
The syntax is very similar to other machine learning algorithms in R. The key difference is the inputs for x and y, for which you need to use the column numbers as identifiers.

Using the Model for Prediction
Again, the code should look very familiar to R users.
The h2o.predict(...) function will return the predicted label with the probabilities of all possible outcomes (or numeric outputs for regression problems) - very useful if you want to train more models and build an ensemble.

Out-of-Bag Performance (Breast Cancer Dataset)

No surprise here. As I expected, the non-regularized model overfitted the training set and performed poorly on the test set. Also as expected, the regularized models did give consistent out-of-bag performance. Of course, more tests on different datasets are needed. But this is definitely a good start for using deep learning techniques in R!

Memory Usage (MNIST Dataset)

This is awesome and really encouraging! In near idle mode, my laptop uses about 1GB of memory (Ubuntu 14.04). During the MNIST model training, H2O successfully kept the memory usage below the capped 2GB allowance over time with all 8 threads working like a steam train! OK, this is based on just one simple test but I already feel comfortable and confident to move on and use H2O for much bigger datasets.

Conclusions
OK, let's start from the only negative point. The machine learning algorithms are limited to the ones that come with H2O. I cannot leverage the power of other available algorithms in R yet (correct me if I am wrong. I will be very happy to be proven wrong this time. Please leave a comment on this blog so everyone can see it). Therefore, in terms of model choices, it is not as handy as caret and subsemble.
Having said that, the included algorithms (Deep Neural Networks, Random Forest, GBM, K-Means, PCA etc) are solid for most of the common data mining tasks. Discovering and experimenting with the deep learning functions in H2O really made me happy. With the superb memory management and the full integration with multi-node big data platforms, I am sure this H2O engine will become more and more popular among data scientists. I am already thinking about the Parallella project but I will leave it until I finish my thesis.
I can now understand why John Chambers recommended H2O. It has already become one of my essential R tools for data mining. The deep learning algorithm in H2O is very interesting, I will continue to explore and experiment with the rest of the regularization parameters such as 'L1', 'L2' and 'Maxout'.

Code
As usual, code is available at my GitHub repo for this blog.

Personal Highlight of useR! 2014
Just a bit more on useR! During the conference week, I met so many cool R people for the very first time. You can see some of the photos by searching #user2014 and my twitter handle together. Other blog posts about the conference can be found here, here, here, here, here and here. For me, the highlight has to be this text analysis by Ajay:

#User2014 trended thx to: @LouBajuk @guneetc79 @earino @pilatesbuff @matlabulous @timtriche http://t.co/auoFM1xWIw pic.twitter.com/l952WD5ejz— Ajay Gopal (@aj2z) July 7, 2014

... which means I successfully made Matlab trending with R!!! 

During the conference banquet, Jeremy Achin (from DataRobot) suggested that I might as well change my profile photo to a Python logo just to make it even more confusing! It was also very nice to speak to Matt Dowle in person and to learn about his amazing data.table journey from S to R. I have started updating some of my old code to use data.table for the heavy data wrangling tasks.

By the way, Jeremy and the DataRobot team (a dream team of top Kaggle data scientists including Xavier who gave a talk about "10 packages to Win Kaggle Competitions") showed me an amazing demo of their product. Do ask them for a beta account and see for yourself!!!
There are more cool things that I am trying at the moment. I will try to blog about them in the near future. If I have to name a few right now ... that will be:
Embedding Shiny Apps in R Markdown by RStudio
subsemble: Ensemble learning in R with the Subsemble algorithm by Erin LeDell
OpenCPU by Jeroen Ooms
dendextend: an R package for easier manipulation and visualization of dendrograms by Tal Galili
Adaptive Resampling in a Parallel World by Max Kuhn
Packrat - A Dependency Management System for R by J.J. Allaire



(Pheeew! So here is my first blog post related to machine learning - the very purpose of starting this blog. Not bad it finally happened after a whole year!)
Flavio Barros's insight:

H2O is an excellent way to extend R to work with large volumes of data.

Scooped by Flavio Barros

15 fatos sobre programação que você provavelmente não sabia | Blog do Mauro Pichiliani

Flavio Barros's insight:

Interesting, worth a read.

Scooped by Flavio Barros

Debian -- News -- Debian 6 debuts its long term support period

Flavio Barros's insight:

Debian will now have long term support releases, the well-known LTS. Debian used to provide support and security fixes for roughly 3 years, about the time between releases; with the LTS releases, support will be longer, at 5 years. The first LTS release will be Debian 6 Squeeze, supported until February 2016.

Scooped by Flavio Barros

Top 10 SBCs OpenSource com Linux e Android - Embarcados - Sua fonte de informações sobre Sistemas Embarcados

The Linux Foundation and LinuxGizmos.com ran a survey from May 8 to May 18 to find out which of 32 boards are the...
Flavio Barros's insight:

An interesting review of the embedded systems running Linux that are available on the market. There is some really cool stuff there.

Scooped by Flavio Barros

Canonical offers 'Chuck Norris Grade' OpenStack private cloud service | ZDNet

Canonical, best known as the company behind Ubuntu Linux, is entering the private cloud hosting business with an OpenStack-based option for your data center or hosting provider.
Flavio Barros's insight:

Canonical, the company behind the Ubuntu Linux distribution, will start offering cloud computing as a service. Any company, however modest, will be able to have a private cloud fully managed by Canonical, at a price of US$15.00 per node per day.

Scooped by Flavio Barros

A hackable text editor for the 21st Century

At GitHub, we’re building the text editor we’ve always wanted: hackable to the core, but approachable on the first day without ever touching a config file. We can’t wait to see what you build with it.

Flavio Barros's insight:

Atom is a new, modern editor being developed by the GitHub team. It comes with a very interesting proposal and was recently licensed as open source. Let's see if it is worth switching from vim.

Rescooped by Flavio Barros from Big Data, Statistics and Machine Learning

Homepage - Docker: the Linux container engine

Docker: An open source project to pack, ship and run any application as a lightweight container
Flavio Barros's curator insight, April 1, 2014 11:36 PM

Docker is a very interesting project because it lets you create software containers on top of Linux to run applications completely independently. It is as if you could create virtual machines without actually creating virtual machines! Containers can be shipped and redeployed on other platforms with practically no changes. It is a fantastic project! Worth checking out. This video has an introduction by one of the project's main developers: https://www.youtube.com/watch?v=Q5POuMHxW-0

Scooped by Flavio Barros

LVM, Demystified | Linux Journal

I've been a sysadmin for a long time, and part of being a sysadmin is doing more than is humanly possible. Sometimes that means writing wicked cool scripts, sometimes it means working late, and sometimes it means learning to say no. Unfortunately, it also sometimes means cutting corners. I confess, I've been "that guy" more than once.
Flavio Barros's insight:

LVM allows for a clean, customized installation that is easy to resize. In this article the author shows very clearly how LVM works. I use it on my machine and it really is worth it, especially when you need to shrink or grow a partition. If you use Linux, it is worth trying.
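
To give a feel for what resizing looks like in practice, here is a small Python sketch that only builds and prints the LVM commands for growing or shrinking a logical volume, without running them. The helper `lv_resize_command` and the volume path /dev/vg0/home are made up for this example; the sketch assumes the standard `lvextend` and `lvreduce` tools with their `--resizefs` and `--size` options.

```python
import shlex


def lv_resize_command(lv_path, delta_gib):
    """Build the LVM command that grows (positive delta) or shrinks
    (negative delta) a logical volume together with its filesystem."""
    if delta_gib == 0:
        raise ValueError("delta_gib must be non-zero")
    tool = "lvextend" if delta_gib > 0 else "lvreduce"
    sign = "+" if delta_gib > 0 else "-"
    # --resizefs asks LVM to resize the filesystem in the same step.
    return [tool, "--resizefs", "--size", f"{sign}{abs(delta_gib)}G", lv_path]


# /dev/vg0/home is a made-up volume path; substitute your own.
for delta in (5, -2):
    print(shlex.join(lv_resize_command("/dev/vg0/home", delta)))
```

Printing the commands first is a cheap dry run; the real commands need root. Shrinking also depends on the filesystem (ext4 can shrink, XFS cannot), so back up before reducing a volume.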
