Data is big
Scooped by ukituki
onto Data is big

UCI Machine Learning Repository

ukituki's insight:

We currently maintain 253 data sets as a service to the machine learning community. You may view all data sets through our searchable interface. Our old website is still available for those who prefer the old format.

"The future is here. It's just not evenly distributed yet." William Gibson
Curated by ukituki

Understanding Random Forests: From Theory to Practice


15 interviews with 15 data scientists

An interesting PDF document featuring 15 data scientists (mostly co-founders of start-ups or of well-known data science websites), averaging 9 pages per interview.

ukituki's insight:
Parham Aarabi: Visual Image Extraction - CEO of ModiFace & University of Toronto Professor
Pete Warden: Object Recognition - Co-Founder & CTO of Jetpac
Trey Causey: Data Science & Football - Founder of the spread, Data Scientist at zulily
Ravi Parikh: Modernizing Web and iOS Analytics - Co-Founder of Heap Analytics (YC W13)
Ryan Adams: Intelligent Probabilistic Systems - Leader of Harvard Intelligent Probabilistic Systems Group
Kang Zhao: Machine Learning & Online Dating - Assistant Professor, Tippie College of Business, University of Iowa
Dave Sullivan: Future of Neural Networks and MLaaS - Founder and CEO of Blackcloud BSG - company behind Ersatz
Wolfgang van Loeper: Big Data & Agriculture - Founder & CEO of MySmartFarm
Laura Hamilton: Predicting Hospital Readmissions - Founder & CEO of Additive Analytics
Harlan Harris: Building a Data Science Community - Founder and President of Data Community DC
Abe Gong: Using Data Science to Solve Human Problems - Data Scientist at Jawbone
K. Hensien & C. Turner: ML => Energy Efficiency - Senior Product Development at Optimum Energy, Data Scientist at The Data Guild
Andrej Karpathy: Training DL Models in a Browser - Machine Learning PhD student at Stanford, Creator of ConvNetJS
George Mohler: Predictive Policing - Chief Scientist at PredPol, Asst. Professor Mathematics & CS, Santa Clara University
Carl Anderson: Data Science & Online Retail - Director of Data Science at Warby Parker
DataScienceWeekly.org

Data Analytics and Visualisation

Data visualisation examples from data-analytics.github.io. Selected Tools is a collection of tools that people use on a daily basis and warmly recommend.

Caffe - deep learning framework developed with cleanliness, readability, and speed in mind.

ukituki's insight:

Caffe already powers academic research projects, startup prototypes, and even large-scale industrial applications in vision, speech, and multimedia. There is an active discussion and support community on GitHub.


Understanding Convolutions - colah's blog


Data Structure Visualization


10 R Packages to Win Kaggle Competitions

by Xavier Conort

Definitive guide to prepare for an analytics interview

This article lays out everything you wanted to know about analytics interviews: best practices to follow and things to avoid during an interview.

Predicting CTR with online machine learning | MLWave


One Hundred Million Creative Commons Flickr Images for Research

by David A. Shamma


Today the photograph has transformed again. From the old world of unprocessed rolls of C-41 sitting in a fridge 20 years ago to sharing photos on the 1.5” screen of a point and shoot camera 10 years back.

Cohort analysis in R - "layer-cake graph"

Cohort analysis is one of the most powerful and in-demand techniques available to marketers for assessing long-term trends in customer retention and calculating lifetime value. If you studied custor...
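To make the idea of a cohort concrete, here is a minimal Python sketch (the linked post works in R; this is not its code). It assumes purchases arrive as hypothetical `(customer_id, month_index)` pairs, assigns each customer to the cohort of their first purchase month, and counts how many customers of each cohort are active in each later month. The function name `cohort_retention` and the toy data are illustrative assumptions.

```python
from collections import defaultdict

def cohort_retention(purchases):
    """purchases: list of (customer_id, month_index) pairs.
    Returns {cohort_month: [active customers at offset 0, 1, 2, ...]}."""
    # A customer's cohort is the month of their first purchase.
    first_month = {}
    for customer, month in purchases:
        if customer not in first_month or month < first_month[customer]:
            first_month[customer] = month

    # Collect the distinct customers active at each offset from their cohort month.
    active = defaultdict(set)
    for customer, month in purchases:
        cohort = first_month[customer]
        active[(cohort, month - cohort)].add(customer)

    # Build one row per cohort, running up to the last observed month.
    last_month = max(month for _, month in purchases)
    table = {}
    for cohort in sorted(set(first_month.values())):
        table[cohort] = [len(active[(cohort, offset)])
                         for offset in range(last_month - cohort + 1)]
    return table

# Hypothetical toy data: (customer, month) pairs, months numbered from 0.
purchases = [("a", 0), ("b", 0), ("a", 1), ("c", 1), ("a", 2), ("c", 2), ("b", 2)]
print(cohort_retention(purchases))
```

Each row of the resulting table is one cohort. Stacking the rows by calendar month is one way to arrive at a layered chart of the kind the title refers to.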

Course: Big Data 2014


Circular Migration Flow Plots in R

An article of mine was published in Science today. It introduces estimates for bilateral global migration flows between all countries. The underlying methodology is based on the conditional maximisa...

Shiny - Save your app as a function


Extreme Learning Machine: Learning Without Iterative Tuning

Neural networks (NN) and support vector machines (SVM) play key roles in machine learning and data analysis. However, it is known that there exist some challenging issues with them such as...

The infinite MNIST dataset

ukituki's insight:

This code produces an infinite supply of digit images derived from the well-known MNIST dataset using pseudo-random deformations and translations. This is a streamlined version of the code used for the experiments reported in (Loosli, Canu, Bottou, 2007). A subset of the examples generated by this code is known as MNIST8M.
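The general idea of an endless, pseudo-random stream of shifted images can be sketched in a few lines of Python. This is not the infinite MNIST code: it covers translations only (no elastic deformations), represents images as plain nested lists, and the names `translate` and `endless_variants` and the 3x3 stand-in image are assumptions made for illustration.

```python
import random
from itertools import islice

def translate(image, dx, dy, fill=0):
    """Shift a 2-D list image by dx columns and dy rows, padding with fill."""
    height, width = len(image), len(image[0])
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            # Pixels pushed outside the frame are dropped.
            if 0 <= ny < height and 0 <= nx < width:
                shifted[ny][nx] = image[y][x]
    return shifted

def endless_variants(images, max_shift=1, seed=0):
    """Yield (index, shifted_image) pairs forever, using a seeded generator."""
    rng = random.Random(seed)
    while True:
        i = rng.randrange(len(images))
        dx = rng.randint(-max_shift, max_shift)
        dy = rng.randint(-max_shift, max_shift)
        yield i, translate(images[i], dx, dy)

# Hypothetical 3x3 "digit" standing in for a 28x28 MNIST image.
base = [[0, 1, 0],
        [0, 1, 0],
        [0, 1, 0]]
samples = list(islice(endless_variants([base]), 5))
```

Because the generator is seeded, the same seed reproduces the same stream, so a fixed-size set can be cut from the endless supply by taking its first N items.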


How to implement an algorithm from a scientific paper | Code Capsule

This article is a short guide to implementing an algorithm from a scientific paper. I have implemented many complex algorithms from books and scientific

Quoc Le’s Lectures on Deep Learning | Gaurav Trivedi


Radim Řehůřek : Word2vec Tutorial

I never got round to writing a tutorial on how to use word2vec in gensim. It's simple enough and the API docs are straightforward, but I know some people prefer more verbose formats. Let this post be a tutorial and a reference example.

Tutorials from useR! 2014 Los Angeles conference

The annual useR! international R User conference is the main meeting of the R user and developer community. In 2014, the conference will be held at the campus of the University of California in Los Angeles (UCLA).

Predicting repeat buyers using purchase history

Another Kaggle contest means another chance to try out Vowpal Wabbit. This time on a data set of nearly 350 million rows.

Getting and Cleaning Data in R

ukituki's insight:

1. Intro.

2. Downloading and Reading Data

3. Reading From MySQL

4. Data in HDF5 Format

5. Reading Data from Web (webscraping)

6. Getting Data from APIs

7. Reading from Other Sources


Shiny - The Shiny Cheat sheet

ukituki's insight:

The Shiny cheat sheet is a quick reference guide for building Shiny apps.

 

Cohort analysis in R - Retention charts

When we spend more money on attracting new customers than they bring us with their first purchase, but usually recover it with their next purchases, we appeal to customer lifetime value (CLV). We expect that customers wi...

Material for the Deep Learning Course

ukituki's insight:

Full NYU course on #deeplearning by @ylecun #datasci  
