Data is big

R: The most powerful and most widely used statistical software

In the last ten years, the open source R statistics language has exploded in popularity and functionality, emerging as the data scientist's tool of choice.

 

Data is big
"The future is here. It's just not evenly distributed yet." - William Gibson :::: Follow this topic for fresh resources and ideas related to Data Science, Machine Learning, Algorithms and #bigdata :::: http://www.dataisbig.co/
Curated by ukituki

How to use XGBoost algorithm in R in easy steps

This tutorial explains how to use the xgboost algorithm in R by building a predictive model on a data set.

General Tips for participating Kaggle Competitions

Slides from a talk at the Spark Taiwan User Group, sharing my experience and some general tips for participating in Kaggle competitions.

word2vec, LDA, and introducing a new hybrid algorithm: lda2vec

Available with notes: http://www.slideshare.net/ChristopherMoody3/word2vec-lda-and-introducing-a-new-hybrid-algorithm-lda2vec (Data Day 2016) Standard natural …

Shiny 0.13.0

Shiny 0.13.0 is now available on CRAN! This release has some of the most exciting features we’ve shipped since the first version of Shiny. Highlights include: Shiny Gadgets, HTML templates, Shiny mod...

Making Facebook for Whales

Marine biologists have crowdsourced a facial-recognition algorithm to help them identify the animals on the spot.

Scheduling R Markdown Reports via Email |analytics for fun

How to Schedule R Markdown Reports via Email. An example using Google Analytics Data.

GTAC 2015: Statistical Data Sampling

http://g.co/gtac Slides: https://docs.google.com/presentation/d/1zAgKXFOQn02PVik9b4YkV0ZJ2wJIaGAf5oFY_dUyDD8/pub Celal Ziftci (Google) and Ben Greenberg (MIT...
ukituki's insight:

It is common practice to use a sample of production data in tests. Examples are:

* Sanity Test: Feed a sample of production data into your system to see if anything fails.
* A/B Test: Take a large chunk of production data, run it through the current and new versions of your system, and diff the outputs for inspection.
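
Both kinds of test can be sketched in a few lines of Python. This is only a minimal illustration, not the method from the talk: `production`, `current_version`, `new_version`, and the helper names are hypothetical stand-ins, and a simple uniform random sample is assumed.

```python
import random


def sample_records(records, k, seed=0):
    """Draw a reproducible uniform random sample of up to k records."""
    rng = random.Random(seed)
    return rng.sample(records, min(k, len(records)))


def sanity_test(records, fn):
    """Feed each record to fn and collect the records that raise."""
    failures = []
    for record in records:
        try:
            fn(record)
        except Exception as exc:  # a sanity test only asks: does anything fail?
            failures.append((record, exc))
    return failures


def ab_diff(records, old_fn, new_fn):
    """Run both versions on each record and keep the ones whose outputs differ."""
    diffs = []
    for record in records:
        old_out, new_out = old_fn(record), new_fn(record)
        if old_out != new_out:
            diffs.append((record, old_out, new_out))
    return diffs


# Hypothetical stand-ins for the two versions of the system under test.
def current_version(x):
    return x * 2


def new_version(x):
    # Deliberately behaves differently on multiples of 10.
    return x * 2 if x % 10 else x * 2 + 1


production = list(range(1000))  # placeholder for real production data
sample = sample_records(production, 100)
failures = sanity_test(sample, new_version)
diffs = ab_diff(sample, current_version, new_version)
```

Here `failures` lists the sampled records that crash the new version, and `diffs` holds the records, with both outputs, that are worth inspecting by hand.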


Yahoo Releases the Largest-ever Machine Learning Dataset for Researchers


Today, we are proud to announce the public release of the largest-ever machine learning dataset to the research community. The dataset stands at a massive ~110B events (13.5TB uncompressed) of anonymized user-news item interaction data, collected by recording the user-news item interactions of about 20M users from February 2015 to May 2015.


International Business Communication Standards #visualization #dataviz

The International Business Communication Standards (IBCS or IBCS Standards) are practical proposals for the design of reports and presentations, meaning, in most cases, the proper conceptual, perceptual and semantic design of charts and tables.


Robot Control with Distributed Deep Reinforcement Learning

Demonstration of Distributed Deep Reinforcement Learning in simulated racing car driving and actual robot control.
ukituki's insight:

Cars learn how to drive safely through intersections without traffic lights


Human-level concept learning | Daily Mail Online

NYU Moore-Sloan data science fellow Brenden Lake speaks about human-level concept learning through probabilistic program induction.

Introducing Kaggle Datasets

At Kaggle, we want to help the world learn from data. This sounds bold and grandiose, but the biggest barriers to this are incredibly simple. It’s tough to access data. It’s tough to understand wha…

Simple solutions to make videos with R

I'm talking about streaming data displayed in video rather than chart format, like 200 scatter plots continuously updated, as in my recent video series from ch…

Scraping via APIs

ukituki's insight:

In the epic poem The Rime of the Ancient Mariner, Samuel Taylor Coleridge writes, “Water, water, everywhere, nor any drop to drink.” Indeed, some would say the same about data. Data appear to be everywhere, yet only a fraction are analyzed. There are several arguments as to why, but one that has drawn the concern of the White House is data accessibility. However, this is rapidly changing as growth in technology and resources is quickly opening the doors of many data vaults to the masses. We, the public minions, now have access to a wide range of data: from social, financial, government, and ecommerce data to geospatial, search engine, and even ant data. We just need to know how to get it. Enter APIs.


New Year Resolutions for a Data Scientist

Here are 2016 New Year resolutions for an aspiring data scientist, categorized into beginner, intermediate, and advanced levels, with suggested courses.

New Data Sources for R


By Joseph Rickert. Over the past few months, a number of new CRAN packages have appeared that make it easier for R users to gain access to curated data. Most of these provide interfaces to a RESTful API written by the data publishers, while a few just wrap the data set inside the package.
