Business Analytics & Data Science
 
Scooped by Carlos Lizarraga Celaya
onto Business Analytics & Data Science

DataScience Server setup - Networkx

This post explains how to set up a data science server with the most common software a data scientist needs to do his/her work.
Business Analytics & Data Science
Current topics in Data Analytics, Data Mining, Machine Learning, Visualization, Predictive Analytics & Business Intelligence

Regression Analysis Essentials For Machine Learning - Easy Guides - Wiki - STHDA

Statistical tools for data analysis and visualization.


Regression analysis consists of a set of machine learning methods that allow us to predict a continuous outcome variable (y) based on the value of one or multiple predictor variables (x). Briefly, the goal of a regression model is to build a mathematical equation that defines y as a function of the x variables. This equation can then be used to predict the outcome (y) from new values of the predictor variables (x).
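
The idea above can be sketched in a few lines of Python. This is only a minimal illustration, assuming a single predictor and an ordinary least-squares fit (the excerpt does not say which fitting method the linked guide uses). The function names and the data are made up for the example.

```python
from statistics import mean


def fit_simple_linear_regression(x, y):
    """Return (intercept, slope) of the least-squares line y = b0 + b1 * x."""
    x_bar, y_bar = mean(x), mean(y)
    # Slope = covariance(x, y) / variance(x); the 1/(n-1) factors cancel out.
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept, slope, new_x):
    """Apply the fitted equation to new predictor values."""
    return [intercept + slope * xi for xi in new_x]


# Illustrative data generated from y = 1 + 2x (no noise).
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0, 3.0, 5.0, 7.0, 9.0]

b0, b1 = fit_simple_linear_regression(x, y)
predictions = predict(b0, b1, [5.0, 6.0])
```

The fitted pair (b0, b1) is the "mathematical equation" from the excerpt, and `predict` is the step that applies it to new x values.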


Extracting PDF Text with R and Creating Tidy Data – Datazar Blog

In the digital age of today, data comes in many forms. Many of the more common file types like CSV, XLSX, and plain text (TXT) are easy to access and manage. Yet, sometimes, the data we need is locked away in a file format that is less accessible such as a PDF. If you have ever found yourself in this dilemma, fret not — pdftools has you covered. In this post, you will learn how to: use pdftools to extract text from a PDF, use the stringr package to manipulate strings of text, and create a tidy data set.


Machine Learning with Python scikit-learn; Part 1. Fisseha Berhane, DataScience+

Previously, I wrote a blog post on machine learning with R using the caret package. In this post, I will use the scikit-learn library in Python. As we did in the R post, we will predict power output given a set of environmental readings from various sensors in a natural gas-fired power generation plant.


autoplotly - One Line of R Code to Build Interactive Visualizations for Popular Statistical Results - Yuan Tang. Yuan's Blog

The autoplotly package is an extension built on top of ggplot2, plotly, and ggfortify to provide functionalities to automatically generate interactive visualizations for many popular statistical results supported by ggfortify package in plotly and ggplot2 styles. The generated visualizations can also be easily extended using ggplot2 and plotly syntax while staying interactive.

JupyterLab is Ready for Users – Jupyter Blog

We are proud to announce the beta release series of JupyterLab, the next-generation web-based interface for Project Jupyter. Project Jupyter exists to develop open-source software, open standards, and services for interactive and reproducible computing. 


Since 2011, the Jupyter Notebook has been our flagship project for creating reproducible computational narratives. The Jupyter Notebook enables users to create and share documents that combine live code with narrative text, mathematical equations, visualizations, interactive controls, and other rich output. It also provides building blocks for interactive computing with data: a file browser, terminals, and a text editor.


Time series model of forecasting future power demand. Rajesh Kumar Pandey, DataScience+

I was working on monthly power demand in the Telangana state of India and used the Holt-Winters methodology in R to arrive at forecasts. The data come from the CEA website and start in June 2014, when the state of Telangana was formed, so data are available only from that time.


Time Series Analysis Using ARIMA Model In R. Subhasree Chatterjee. DataScience+

Time series data are data points collected over a period of time at successive time intervals. Time series analysis means analyzing the available data to find the pattern or trend in it and to predict future values, which in turn helps make business decisions more effective and better optimized.


How to implement Random Forests in R. R-posts.com.  Chaitanya Sagar, Prudhvi Potuganti and Saneesh Veetil of Perceptive Analytics.

Imagine you were to buy a car: would you just go to a store and buy the first one that you see? No, right? You usually consult a few people around you, take their opinions, add your own research, and then make the final decision. Let’s take a simpler scenario: whenever you go to a movie, do you ask your friends for reviews of it (unless, of course, it stars one of your favorite actresses)?


How to perform Logistic Regression, LDA, & QDA in R. Prashant Shekhar. DataScience+

A classification algorithm defines a set of rules to identify a category or group for an observation. There are various classification algorithms available, such as Logistic Regression, LDA, QDA, Random Forest, SVM, etc. Here I am going to discuss Logistic Regression, LDA, and QDA. The classification model is evaluated with a confusion matrix, a table that crosses the predicted True/False values with the actual True/False values.
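
To make the confusion matrix concrete, here is a small Python sketch that counts the four cells for a binary classifier. It is an illustration only: the linked post works in R, and the labels, predictions, and function names below are invented for the example.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count the four (actual, predicted) combinations for boolean labels."""
    counts = Counter(zip(actual, predicted))
    return {
        "TP": counts[(True, True)],    # actual True, predicted True
        "FN": counts[(True, False)],   # actual True, predicted False
        "FP": counts[(False, True)],   # actual False, predicted True
        "TN": counts[(False, False)],  # actual False, predicted False
    }


def accuracy(cm):
    """Share of observations on the diagonal of the matrix."""
    return (cm["TP"] + cm["TN"]) / sum(cm.values())


# Hypothetical labels and model predictions.
actual = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]

cm = confusion_matrix(actual, predicted)
acc = accuracy(cm)
```

Here four of the six predictions match the actual labels, so the accuracy is 4/6.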


The Mcomp Package for time series analysis. Giorgio Garziano, DataScience+

Makridakis Competitions (also known as the M Competitions or M-Competitions) are a series of competitions organized by teams led by forecasting researcher Spyros Makridakis and intended to evaluate and compare the accuracy of different forecasting methods. So far three competitions have taken place, named M1 (1982), M2 (1993) and M3 (2000). The fourth competition is going to take place in 2018.


AI and Deep Learning in 2017 – A Year in Review. Denny Britz, WildML.

The year is coming to an end. I did not write nearly as much as I had planned to. But I’m hoping to change that next year, with more tutorials around Reinforcement Learning, Evolution, and Bayesian Methods coming to WildML! And what better way to start than with a summary of all the amazing things that happened in 2017? Looking back through my Twitter history and the WildML newsletter, the following topics repeatedly came up. I’ll inevitably miss some important milestones, so please let me know about it in the comments! 


Combined outlier detection with dplyr and ruler. Evgeni Chasnovski. QuestionFlow.

Overview of simple outlier detection methods with their combination using dplyr and ruler packages. 

During the process of data analysis, one of the most crucial steps is to identify and account for outliers: observations that have an essentially different nature than most other observations. Their presence can lead to untrustworthy conclusions. The most complicated part of this task is to define the notion of an “outlier”. After that, it is straightforward to identify outliers in the given data. Many techniques have been developed for outlier detection, and the majority of them deal with numerical data. This post will describe the most basic ones and their application using the dplyr and ruler packages.
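
As a rough illustration of combining simple outlier rules, the Python sketch below applies two widely used definitions, a z-score cutoff and Tukey's fences, and keeps the values flagged by both. These two rules are an assumption made for the example and may not be exactly the ones the linked post uses (it works in R with dplyr and ruler). The data, the thresholds, and the function names are made up.

```python
from statistics import mean, stdev, quantiles


def z_score_flags(values, threshold=3.0):
    """Flag values whose absolute z-score exceeds the threshold."""
    mu, sigma = mean(values), stdev(values)
    return [abs(v - mu) / sigma > threshold for v in values]


def tukey_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's fences)."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical measurements with one obviously different value.
values = [10, 12, 11, 13, 12, 11, 10, 12, 95]

# With only 9 observations no z-score can reach 3, so a lower cutoff is used.
by_z = z_score_flags(values, threshold=2.0)
by_tukey = tukey_flags(values)

# Combination rule: a value is an outlier only if both methods agree.
outliers = [v for v, z, t in zip(values, by_z, by_tukey) if z and t]
```

Requiring agreement between methods is one possible combination; flagging a value when any method fires is the more cautious alternative.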


How to apply Linear Regression in R. Prashant Shekhar, DataScience+

Machine Learning and Regression

Machine Learning (ML) is a field of study that gives a machine the capability to understand data and to learn from it. ML is not only about analytics modeling; it is end-to-end modeling that broadly involves the following steps:

– Defining problem statement 

– Data collection. 

– Exploring, Cleaning and transforming data. 

– Making the analytics model. 

– Dashboard creation & deployment of the model.


TSrepr use case - Clustering time series representations in R – Peter Laurinec – Time series data mining in R. Bratislava, Slovakia.

Time series data mining in R. 

In this tutorial, I will show you one use case that demonstrates how to use time series representations effectively: clustering time series, specifically the electricity load of consumers. By clustering consumers' electricity load, we can extract typical load profiles, improve the accuracy of subsequent electricity consumption forecasting, detect anomalies, or monitor a whole smart grid (grid of consumers) (Laurinec et al. (2016), Laurinec and Lucká (2016)). I will show you the first of these, the extraction of typical electricity load profiles with the K-medoids clustering method.


Extreme Gradient Boosting with Python. Fisseha Berhane, DataScience+

Extreme Gradient Boosting is among the hottest libraries in supervised machine learning these days. It supports various objective functions, including regression, classification, and ranking. It has gained much popularity and attention recently as it was the algorithm of choice for many winning teams of a number of machine learning competitions. Previously I showed how to do Extreme Gradient Boosting with R and in this post, I will show how to do it with Python.


Compare outlier detection methods with the OutliersO3 package (Revolutions) by Antony Unwin, University of Augsburg, Germany

There are many different methods for identifying outliers and a lot of them are available in R. But are outliers a matter of opinion? Do all methods give the same results? Articles on outlier methods use a mixture of theory and practice. Theory is all very well, but outliers are outliers because they don’t follow theory. Practice involves testing methods on data, sometimes with data simulated based on theory, better with ‘real’ datasets. A method can be considered successful if it finds the outliers we all agree on, but do we all agree on which cases are outliers?

The Romero Team's comment, October 1, 11:36 PM
good article

Principal Component Analysis (PCA) in R. Prashant Shekhar, DataScience+

Principal Component Analysis (PCA) is an unsupervised learning technique used to reduce the dimension of the data with minimum loss of information. PCA is used in applications such as face recognition and image compression.
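
To show what "reducing the dimension with minimum loss of information" means, here is a minimal Python sketch of PCA on two-dimensional data. It assumes the standard covariance-eigenvector formulation and uses the closed-form eigenvalue of a 2x2 symmetric matrix, so it does not generalize beyond two variables. The linked post uses R, and the data and names here are invented.

```python
import math
from statistics import mean


def pca_2d(points):
    """Return (direction, explained_ratio) of the first principal component.

    Assumes at least two points that are not all identical.
    """
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    n = len(points)
    # Entries of the 2x2 sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mx) ** 2 for x in xs) / (n - 1)
    c = sum((y - my) ** 2 for y in ys) / (n - 1)
    b = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Largest eigenvalue of a symmetric 2x2 matrix, in closed form.
    lam1 = (a + c) / 2 + math.sqrt(((a - c) / 2) ** 2 + b ** 2)
    # Matching eigenvector; fall back to an axis if the variables are uncorrelated.
    if b != 0:
        vx, vy = b, lam1 - a
    else:
        vx, vy = (1.0, 0.0) if a >= c else (0.0, 1.0)
    norm = math.hypot(vx, vy)
    # Explained ratio = largest eigenvalue / total variance (the trace).
    return (vx / norm, vy / norm), lam1 / (a + c)


# Hypothetical data lying exactly on the line y = 2x.
points = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
direction, explained = pca_2d(points)

# One-dimensional representation: centered points projected on the component.
cx, cy = mean(p[0] for p in points), mean(p[1] for p in points)
scores = [(x - cx) * direction[0] + (y - cy) * direction[1] for x, y in points]
```

Because the example points lie on a line, a single component keeps all of the variance. With real data the explained ratio tells you how much information the reduction loses.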


Deep Learning from first principles in Python, R and Octave – Part 2 | Tinniam V. Ganesh,  Giga thoughts …

This post is a follow-up to my earlier post Deep Learning from first principles in Python, R and Octave-Part 1. In the first part, I implemented Logistic Regression in vectorized Python, R and Octave, with a wannabe Neural Network (a Neural Network with no hidden layers). In this second part, I implement a regular, but somewhat primitive, Neural Network (a Neural Network with just 1 hidden layer). The second part implements classification of manually created datasets, where the different clusters of the 2 classes are not linearly separable.

Integrating R Notebooks and R shiny with Tableau. Fisseha Berhane, DataScience+

Integrating R Notebooks and R shiny with Tableau enables us to take advantage of the various statistical analysis and machine learning packages in R. In this short blog post, we will see how to integrate Tableau with R through R Notebooks and shiny. This approach lets us include descriptive, inferential and predictive analytics in our Tableau story/dashboard. The data I am using is the reading test from the Program for International Student Assessment (PISA).


A Tour of The Top 10 Algorithms for Machine Learning Newbies. James Le. Towards Data Science.

In machine learning, there’s something called the “No Free Lunch” theorem. In a nutshell, it states that no one algorithm works best for every problem, and it’s especially relevant for supervised learning (i.e. predictive modeling). 

For example, you can’t say that neural networks are always better than decision trees or vice-versa. There are many factors at play, such as the size and structure of your dataset. 

As a result, you should try many different algorithms for your problem, while using a hold-out “test set” of data to evaluate performance and select the winner.
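
The selection procedure described above can be sketched in Python as follows. The two candidate models (a mean baseline and a 1-nearest-neighbor predictor), the synthetic data, and all names are assumptions made for illustration. A fixed every-fourth-point split is used to keep the example deterministic, whereas in practice the hold-out set is usually drawn at random.

```python
def mean_baseline(train):
    """Candidate 1: always predict the average training outcome."""
    avg = sum(y for _, y in train) / len(train)
    return lambda x: avg


def nearest_neighbor(train):
    """Candidate 2: predict the outcome of the closest training point."""
    return lambda x: min(train, key=lambda pair: abs(pair[0] - x))[1]


def mse(model, data):
    """Mean squared error of a fitted model on (x, y) pairs."""
    return sum((model(x) - y) ** 2 for x, y in data) / len(data)


# Synthetic data following y = 2x.
data = [(float(x), 2.0 * x) for x in range(20)]

# Hold out every fourth point as the test set; train on the rest.
test_set = [p for i, p in enumerate(data) if i % 4 == 3]
train_set = [p for i, p in enumerate(data) if i % 4 != 3]

candidates = {"mean_baseline": mean_baseline, "nearest_neighbor": nearest_neighbor}

# Fit each candidate on the training data only, then score it on the hold-out set.
scores = {name: mse(fit(train_set), test_set) for name, fit in candidates.items()}
winner = min(scores, key=scores.get)
```

The key point is that the test set plays no part in fitting: it is used only to compare the candidates and pick the one with the lowest error.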


Building a Daily Bitcoin Price Tracker with Coindeskr and Shiny in R. Abdul Majed Raja, DataScience+

Let’s admit it. The whole world has been going crazy over Bitcoin. Bitcoin (BTC), the first cryptocurrency (in fact, the first digital currency to solve the double-spend problem), introduced by Satoshi Nakamoto, has become bigger than well-established firms (and even a few countries). So a lot of Bitcoin enthusiasts and investors want to keep track of its daily price to better read the market and make moves accordingly. This tutorial helps an R user build his/her own Daily Bitcoin Price Tracker using three packages: Coindeskr, Shiny and Dygraphs.


Neural Networks with Google CoLaboratory | Artificial Intelligence Getting started. Sagar Howal 

Google recently launched its internal tool for collaborating on writing data science code. The project, called Google CoLaboratory (g.co/colab), is based on the Jupyter open source project and is integrated with Google Drive. Colaboratory allows users to work on Jupyter Notebooks as easily as they work on Google Docs or spreadsheets.


Deep Learning from first principles in Python, R and Octave – Part 1. Tinniam V. Ganesh. Giga Thoughts.

“You don’t perceive objects as they are. You perceive them as you are.” “Your interpretation of physical objects has everything to do with the historical trajectory of your brain - and little to do with the objects themselves.” “The brain generates its own reality, even before it receives information coming in from the eyes and the other senses. This is known as the internal model.”


Tiny Art in Less Than 280 Characters. Fronkonstin.

Now that Twitter allows 280 characters, the code for some of the drawings I have made can fit in a tweet. In this post I have compiled a few of them. The first one is a cardioid inspired by string art.
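
A string-art cardioid can be built with a well-known construction: place evenly spaced points on a circle and join point i to point 2i (mod n), so that the envelope of the chords traces a cardioid. The Python sketch below computes those chords. It is not the author's code (the original drawings are written in R and are not reproduced in this excerpt), and the function name and point count are arbitrary choices.

```python
import math


def cardioid_chords(n_points=200, radius=1.0):
    """Return chords joining point i to point 2*i (mod n_points) on a circle."""

    def point(k):
        angle = 2 * math.pi * k / n_points
        return (radius * math.cos(angle), radius * math.sin(angle))

    return [(point(i), point((2 * i) % n_points)) for i in range(n_points)]


# Each chord is a pair of (x, y) endpoints that any plotting tool can draw
# as a straight line segment.
chords = cardioid_chords()
```

Drawing all the segments with a thin, semi-transparent line makes the heart-shaped envelope appear, and more points give a smoother outline.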


Earth to exoplanet: Hunting for planets with machine learning. Chris Shallue and Andrew Vanderburg. 

We used AI to search for planets in NASA Kepler data, and found two new planets, and the first 8-planet solar system outside of our own, in the process. 

For thousands of years, people have looked up at the stars, recorded observations, and noticed patterns. Some of the first objects early astronomers identified were planets, which the Greeks called “planētai,” or “wanderers,” for their seemingly irregular movement through the night sky. Centuries of study helped people understand that the Earth and other planets in our solar system orbit the sun—a star like many others. Today, with the help of technologies like telescope optics, space flight, digital cameras, and computers, it’s possible for us to extend our understanding beyond our own sun and detect planets around other stars. Studying these planets—called exoplanets—helps us explore some of our deepest human inquiries about the universe. What else is out there? Are there other planets and solar systems like our own?
