Data is big
Scooped by ukituki
onto Data is big

20 white papers and power point presentations

Posted on AnalyticBridge over the last few years, related to analytics, data science, big data, statistics, visualization. Feel free to add yours.

Introductio…


Data is big
"The future is here. It's just not evenly distributed yet." - William Gibson :::: Follow this topic for fresh resources and ideas related to Data Science, Machine Learning, Algorithms and #bigdata :::: http://www.dataisbig.co/
Curated by ukituki

Strategies To Speed Up R Code

The for-loop in R can be very slow in its raw, un-optimised form, especially when dealing with larger data sets. There are a number of ways to make your logic run faster, and you may be surprised how fast you can actually go.
This chapter shows a number of approaches including simple

Cross correlation for NASDAQ 100 stocks with #quantmod R package

Heaven, I'm in heaven, and my heart beats so that I can hardly speak, and I seem to find the happiness I seek, when we're out together dancing cheek to cheek (Cheek To Cheek, Irving Berlin)  

The Data Science Ecosystem in One Tidy Infographic

It probably comes as no surprise, but we talk to a lot of data scientists at CrowdFlower. We like learning the tools they use, the programs that make their liv…

Flowbox.io & Luna - amazing new programming language

Woyciech Danilo from flowbox.io talks about their new programming language, Luna. Flowbox develops professional video compositing software, which is powered by a new programming language...

Retail Forecasting: Step 1 of 6, data preprocessing

Accurate and timely forecasting drives success in the retail business. It is an essential enabler of supply and inventory planning, product pricing, promotion, and placement. As part of its Azure ML offering, Microsoft provides a template that lets data scientists easily build and deploy a retail forecasting solution.

Data Science Specialization Course Notes by sux13

I have compiled notes for all 9 courses of the Johns Hopkins University/Coursera Data Science Specialization. The notes are all written in R Markdown format and encompass all concepts covered in class, as well as additional examples I have compiled from lecture, my own exploration, StackOverflow, and Khan Academy. 

ukituki's insight:

These documents are intended to be comprehensive sources of reference for future use and they have served me wonderfully in completing the assignments for each course. So I hope you will find them helpful as well.


Machine Learning -on demand version - Stanford University (Coursera)

Machine Learning from Stanford University. Machine learning is the science of getting computers to act without being explicitly programmed. In the past decade, machine learning has given us self-driving cars, practical speech recognition, effective web search, and a vastly improved understanding of the human genome. Machine learning is so pervasive today that you probably use it dozens of times a day without knowing it.  
ukituki's insight:

The on demand format allows you to work through the materials at your own pace. All materials are available at any time, and there are no deadlines for exercises or assignments.

 

If you joined Machine Learning in a previous session but didn’t quite complete the coursework, I hope you’ll consider revisiting the materials in the on demand format. You can now take as much time as you need to fully understand each lesson and complete each assignment successfully.


Guide to Data Science Cheat Sheets

ukituki's insight:

Selection of the most useful Data Science cheat sheets, covering SQL, Python (including NumPy, SciPy and Pandas), R (including Regression, Time Series, Data Mining), MATLAB, and more.

Rescooped by ukituki from Big Data Analysis in the Clouds

Power to the new people analytics | McKinsey & Company

Techniques used to mine consumer and industry data may also let HR tackle employee retention and dissatisfaction. A McKinsey Quarterly article.

Via Ángel Yustas Domínguez, Klaus Meschede, Pierre Levy

CDO = IS + IG + IR + IE | Blog post

ukituki's insight:

Capgemini’s 2015 survey of 1,000 senior decision-makers across nine industries and 10 countries revealed that some 43% of organizations are restructuring to exploit data opportunities. Encouragingly, 33% of the surveyed companies have appointed a Chief Data Officer (CDO) or a similar C-level role to lead and exploit such data opportunities, with another 19% planning to do so over the next 12 months.


Text Understanding from Scratch

For the Torch implementation, see https://github.com/zhangxiangxiao/Crepe

ukituki's insight:

This article demonstrates that we can apply deep learning to text understanding from character-level inputs all the way up to abstract text concepts, using temporal convolutional networks (ConvNets). We apply ConvNets to various large-scale datasets, including ontology classification, sentiment analysis, and text categorization. We show that temporal ConvNets can achieve astonishing performance without knowledge of words, phrases, sentences, or any other syntactic or semantic structures of a human language. Evidence shows that our models can work for both English and Chinese.
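
To make "character-level inputs" concrete, here is a minimal Python sketch of one common way to turn raw text into something a temporal ConvNet could consume: one-hot encoding each character over a fixed alphabet and a fixed number of positions. The summary above does not say how the characters are encoded, so the alphabet, the frame length, and the names `ALPHABET`, `CHAR_TO_INDEX`, and `quantize` are all assumptions made for illustration, not the article's implementation.

```python
# Hypothetical alphabet and frame length, chosen only for illustration.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789 .,;!?'"
CHAR_TO_INDEX = {ch: i for i, ch in enumerate(ALPHABET)}


def quantize(text, frame_length=16):
    """One-hot encode text as a frame_length x len(ALPHABET) matrix.

    Text is lowercased and truncated to frame_length characters.
    Characters outside the alphabet, and positions past the end of
    the text, are left as all-zero rows.
    """
    frames = [[0] * len(ALPHABET) for _ in range(frame_length)]
    for pos, ch in enumerate(text.lower()[:frame_length]):
        idx = CHAR_TO_INDEX.get(ch)
        if idx is not None:
            frames[pos][idx] = 1
    return frames


# Usage: each row is one character position, each column one symbol.
frames = quantize("Big data!", frame_length=12)
print(len(frames), len(frames[0]))
```

No tokenizer or vocabulary of words is involved: the network only ever sees which symbol occupies which position, which is the sense in which such a model works "without knowledge of words".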


Another great R cheatsheet from RStudio - Data #Visualization #rstats


We've added a new cheatsheet to our collection. Data Visualization with ggplot2 describes how to build a plot with ggplot2 and the grammar of graphics.

ukituki's insight:

You will find helpful reminders of how to use:

geoms, stats, scales, coordinate systems, facets, position adjustments, legends, and themes

The cheatsheet also documents tips on zooming.


Replay: Reproducible data analysis with the checkpoint package

Thanks to all who attended my webinar earlier this week, Reproducibility with Revolution R Open and the Checkpoint Package. If you missed the live session, you can catch up with the slides and video replay which I've embedded below. If you just want to check out the demo of the checkpoint package, it starts at 18:30 in the video below. If you want to follow along at home, you can download the demo script here. Revolution Analytics webinars: Reproducibility with Revolution R Open and the Checkpoint Package

Deep Learning, NLP, and Representations - colah's blog Powered by RebelMouse


This post reviews some extremely remarkable results in applying deep neural networks to natural language processing (NLP). In doing so, I hope to make accessible one promising answer as to why deep neural networks work. I think it's a very elegant perspective. http://colah.github.io/posts/2014-07-NLP...

ukituki's insight:

http://colah.github.io/posts/2014-07-NLP-RNNs-Representations/


Reducing your R memory footprint by 7000x

R is notoriously a memory heavy language. I don't necessarily think this is a
bad thing--R wasn't built to be super performant, it was built for analyzing
data! That said, there are times when there are some implementation patterns
that are quite...redundant. As an example, I'm going to show you how you can
prune a 330 MB glm to 45 KB without losing significant functionality.


Let's trim the R fat
Le Model
Our model is going ...

Deep Image: Scaling up Image Recognition. Slides of Ren Wu, Scientist @BaiduResearch, #deeplearning


Data is big. 


Pinnability: Machine learning in the home feed

Our unique data set contains abundant human-curated content, so that Pin, board and user dynamics provide informative features for accurate Pinnability prediction. These features fall into three general categories: Pin features, Pinner features and interaction features:

Pin features capture the intrinsic quality of a Pin, such as historical popularity, Pin freshness and likelihood of spam. Visual features from Convolutional Neural Networks (CNN) are also included.
Pinner features are about the particulars of a user, such as how active the Pinner is, gender and board status.
Interaction features represent the Pinner’s past interaction with Pins of a similar type.
Rescooped by ukituki from Data Science

Disruptive Tools In The Data Science Toolkit (Dr. Gurjeet Singh) - Exponential Finance 2014

Dr. Gurjeet Singh of Ayasdi, named Fast Company's 2014 Most Innovative Company in Big Data, addresses the cutting edge of big data and how machine learning/b...

Via Karlo Jara

Trick to enhance power of Regression model

We, as analysts, specialize in optimizing already optimized processes. As the optimization gets finer, the opportunity to make the process better gets thinner. One of the most frequently used predictive modeling techniques is regression (linear or logistic). Another competing technique, typically considered a challenger, is the decision tree.