Data is big
Scooped by ukituki
onto Data is big

The Gigaom guide to deep learning: Who's doing it, and why it matters

Deep learning is one of the hottest trends in big data right now and currently underpins the cutting edge in areas such as natural language processing and image recognition. Here’s a brief guide to what it is and who’s doing it.

Data is big
"The future is here. It's just not evenly distributed yet." - William Gibson :::: Follow this topic for fresh resources and ideas related to Data Science, Machine Learning, Algorithms and #bigdata :::: http://www.dataisbig.co/
Curated by ukituki

Python for Image Understanding: Deep Learning with Convolutional Neural Nets

Talk given at PyData 2015 London. June 21, 2015. Practical approach to how to train your deep neural net for images.

Finding the K in K-Means Clustering

A couple of weeks ago, here at The Data Science Lab, we showed how Lloyd's algorithm can be used to cluster points using k-means with a simple Python implementation. We also produced interesting vis...

The Rapid Rise of Deep Learning Computer Vision Technology

This article highlights companies using deep learning-based computer vision technology and/or providing platforms featuring automatic photo tagging and classification services.

Opportunities for #AI to enable #DataScience with Knowledge and Explanation: https://t.co/DsQPgMK

DataIsBig.co


Ggplot2 cheat sheet: Search by task

Here's your easy-to-use guide to dozens of useful ggplot2 R data visualization commands in a handy, searchable table. Plus, download code snippets to save yourself a boatload of typing.

Reviving the Statistical Atlas of the United States with New Data

Ever since I found out about the Statistical Atlas of the United States, it annoyed me that there wasn't one in the works for the 2010 Census due to cuts in funding.


Identifying music genres using Clarifai's deep learning API

An experiment using Clarifai's deep learning API and Google Prediction to automatically classify artist music genres based on their album covers.
Rescooped by ukituki from SNA - Social Network Analysis ... and more.

The Rise of Mobile and Analytics

http://www.netbiscuits.com Learn how marketers can harness two of the biggest digital trends to optimize content strategy as content marketing comes of age.

Via João Greno Brogueira

Apache Spark 1.4 adds R language and hardened machine-learning

With support for stats language R, along with a range of new features, the latest update to in-memory data-processing engine Apache Spark is now out.

0.5%: The Margin Between Good and Great, and How to Find It

David Epstein, author
ukituki's insight:

As sports have become high-stakes global competitions, the performance margins that differentiate good, great, and legendary have shrunk dramatically. The importance of finding tiny advantages is greater than ever. Where Moneyball was once a novelty, it is now the norm in every sport, as data is mined to find those vanishingly small advantages. And yet, even as sports are awash in data, much of it is applied to surprisingly little effect. David Epstein will discuss the importance of combining big data with cutting-edge science about expertise to emerge with "small data": the kind that reveals where those tiny advantages are hiding.
Read more at http://library.fora.tv/2015/06/10/05_the_margin_between_good_and_great_and_how_to_find_it#XucuJ30Lg9HhBgVB.99


Obama-RNN — Machine generated political speeches.

Political speeches are among the most powerful tools leaders use to influence entire populations. Throughout history, political speeches…

Appraiser: How Airbnb Generates Complex Models in Spark for Demand Prediction

Spark Summit 2015 talk on Airbnb's price tips algorithms and their implementation details in Spark using Aerosolve.

Running RStudio on Digital Ocean, AWS etc Using Tutum and Docker Containers

Via RBloggers I noticed a tutorial today on Setting Rstudio server using Amazon Web Services (AWS). In the post Getting Started With Personal App Containers in the Cloud I described how I linked my...

Tutorial@SIGMOD'15: Mining and Forecasting of Big Time-series Data

Given a large collection of time series, such as web-click logs, electronic medical records and motion capture sensors, how can we efficiently and effectively find typical patterns? How can we statistically summarize all the sequences, and achieve a meaningful segmentation? What are the major tools for forecasting and outlier detection? Time-series data analysis is becoming increasingly important, thanks to the decreasing cost of hardware and the increasing on-line processing capability.

ukituki's insight:

The objective of this tutorial is to provide a concise and intuitive overview of the most important tools that can help us find patterns in large-scale time-series sequences.

 

We review the state of the art in four related fields: (1) similarity search and pattern discovery, (2) linear modeling and summarization, (3) non-linear modeling and forecasting, and (4) the extension of time-series mining and tensor analysis. The emphasis of the tutorial is to provide the intuition behind these powerful tools, which is usually lost in the technical literature, as well as to introduce case studies that illustrate their practical use. 


Working with Apache SparkR-1.4.0 in RStudio

In this practical implementation, I show how to install the SparkR package in RStudio in a Windows operating system environment.

Building a food recommendation engine with Spark / MLlib and Play

Recommendation engines have become very popular in the last decade with the explosion of e-commerce, on demand music and movie services, dating sites, local reviews, news aggregation and advertisin...

The art of structured thinking and analyzing

Structured thinking is a process of putting a framework to an unstructured problem. Having a structure not only helps an analyst understand the problem at a macro level.

Static and dynamic network visualization with R

ukituki's insight:

Fantastic tutorial on network visualization in R by @Ognyanova

#rstats


SparkR: Distributed data frames with Spark and R

(This article was first published on Revolutions, and kindly contributed to R-bloggers)
R is now integrated with Apache Spark, the open-source cluster computing framework.

Comparing machine learning classifiers based on their hyperplanes or decision boundaries - Data Scientist in Ginza, Tokyo

In the Japanese version of this blog, I've written a series of posts about how each kind of machine learning classifier draws various classification hyperplanes or decision boundaries. So in this post I want to show you a summary of the series and how their hyperplanes or decision boundaries vary