Code: Big Data
Research and Practice in Big Data Analytics
Curated by Jose Menes
Scooped by Jose Menes

Why Zipf's law explains so many big data and physics phenomenons

Zipf's law states that in many settings (that we are going to explore), the volume or size of entities is inversely proportional to a power s (s > 0…
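
The blurb above is cut off, so the following is only a minimal sketch of the usual textbook form of Zipf's law, in which the size of an entity falls off as its rank raised to the power s. The function name `zipf_sizes` and the parameters `n`, `s` and `largest` are hypothetical, chosen for illustration, and are not taken from the linked article.

```python
def zipf_sizes(n, s=1.0, largest=1.0):
    """Return idealized sizes for ranks 1..n, where size(rank) = largest / rank**s.

    Assumption: this is the standard rank-size form of Zipf's law,
    not necessarily the exact formulation used in the linked article.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    if s <= 0:
        raise ValueError("s must be positive")
    return [largest / rank ** s for rank in range(1, n + 1)]


if __name__ == "__main__":
    # With s = 1, the entity at rank 2 is half the size of the largest,
    # the entity at rank 3 is a third of it, and so on.
    for rank, size in enumerate(zipf_sizes(5, s=1.0, largest=1000.0), start=1):
        print(rank, round(size, 1))
```

Larger values of s make the sizes fall off more steeply with rank; s = 1 gives the classic "second is half the first" pattern.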

Leaving Data on the Table: Data Scientists Reveal Obstacles to Big Data - insideBIGDATA


The huge volume of Big Data produced by sensors, genomic sequencers, electronic exchanges, and connected devices continues to generate headlines but it’s the diverse types of data, not the volume, that’s a bigger challenge to data scientists and is causing them to “leave data on the table.”


Building a Better Brain: Saffron Cognitive Computing Platform Replicates How We Associate Facts - insideBIGDATA


70+ websites to get large data repositories for free « Big Data Made Simple

Do you require GBs of data to check the performance of your app? The easiest way is to download samples of data from free data repositories available on the Web. But the main disadvantage of this approach is that the data will have very little unique content and it may not…

Mode raises $2M and opens 'GitHub for data' to the public

Mode is trying to do for data scientists and analysts what GitHub did for developers by giving them a place where they can find, collaborate and work on data. Formation8 led the new round, which also included Reddit’s Alexis Ohanian.

A Tour of Machine Learning Algorithms

Originally published by Jasonb on MachineLearningMastery.com.
From the Ensemble Methods section
Learning Style
There are different ways an algorithm can model…

Big Data is the New Engine of the Internet

Mobile devices and expanding networks of sensors are generating huge amounts of data that are increasingly searchable and can be shared to discern patterns

Is Python Becoming the King of the Data Science Forest? - Experfy Insights

Python is quickly gaining ground over R for data science. We consider the reasons for this shift and whether R has a future within the big data ecosystem.

Twitter to Release All Tweets to Scientists: A Trove of Billions of Tweets Will Be a Research Boon and An Ethical Dilemma


Big Data Lessons From Netflix | Innovation Insights | WIRED


In a data-driven environment like Netflix, data visualization plays a key role.


NSA plans to FREE YOUR DATA with range of cloud services, analytics


'Why pay providers to store or analyse your stuff? We've already done it'

Jose Menes's insight:

Keep in mind this was posted on April 1st. That is all. :)


17 short tutorials all data scientists should read (and practice)

I hope I find the time to write a one-page survival guide for UNIX, Python and Perl. Here's one for R. The links to core data science concepts are below - I ne…

One Page R: A Survival Guide to Data Science with R

From Togaware.

Many of the documents have been developed and tested whilst visiting the Shenzhen Institutes of Technology as an International Visiting Profess…

Fast clustering algorithms for massive datasets

Here we discuss two potential algorithms that can perform clustering extremely fast, on big data sets, as well as the graphical representation of such complex…

A Large set of Machine Learning Resources for Beginners to Mavens

Note: I regularly update this list.

Machine Learning 101:

I. Introduction to Machine Learning
http://homepages.inf.ed.ac.uk/rbf/IAPR/researchers/MLPAGES/mltut.htm
http://jeremykun.com/2012/08/04/machine-learning-introduction/
http://www.omidrouhani.com/research/machinelearning/html/machinelearning.htm
http://www.youtube.com/playlist?list=PLD63A284B7615313A (Caltech class)

II. Linear Regression
http://en.wikipedia.org/wiki/Linear_regression
http://www.youtube.com/watch?v=ExVhaN36jBs
http://en.wikipedia.org/wiki/Simple_linear_regression
http://www.youtube.com/watch?v=ocGEhiLwDVc

III. Linear Algebra
http://ocw.mit.edu/courses/mathematics/18-06sc-linear-algebra-fall-2011/Syllabus/
https://www.khanacademy.org/math/linear-algebra
Online text: http://joshua.smcvt.edu/linearalgebra/book.pdf (see http://joshua.smcvt.edu/linearalgebra/ for usage rights)

V. Linear Regression with Multiple Variables - Gradient Descent
http://en.wikipedia.org/wiki/Gradient_descent
http://www.youtube.com/watch?v=umAeJ7LMCfU (discusses above wiki …

Views from the front lines of the data-analytics revolution | McKinsey & Company

At a unique gathering of data-analytics leaders, new solutions began emerging to vexing privacy, talent, organizational, and frontline-adoption challenges. A McKinsey Quarterly article.

Conjecture: Scalable Machine Learning in Hadoop with Scalding « Code as Craft


Predictive machine learning models are an important tool for many aspects of e-commerce.  At Etsy, we use machine learning as a component in a diverse set of critical tasks. For instance, we use predictive machine learning models to estimate click rates of items so that we can present high quality and relevant items to potential buyers on the site.


The Philosophy of Data

Our ability to gather and process huge amounts of data does many things, including correcting intuitive biases and illuminating patterns of behavior.

Skymind launches with open-source, plug-and-play deep learning features for your app

In Silicon Valley, deep learning ranks as one of the hottest technologies. Now, this startup sees a chance to let lots of developers incorporate deep learning into their apps. Deep learning essenti...

10 Big Data Pros To Follow On Twitter - InformationWeek

Looking for big data expertise on Twitter? Start by following these 10 industry players.

100+ Interesting Data Sets for Statistics

Looking for interesting data sets? Here's a list of more than 100 of the best stuff, from dolphin relationships to political campaign donations to death row prisoners.

Harvest machine data using Hadoop and Hive

Machine data can come in many different formats and quantities. Weather sensors, fitness trackers, and even air-conditioning units produce massive amounts of data, which begs for a big data solution. But how do you decide what data is important, and how do you determine what proportion of that information is valid, worth including in reports, or valuable in detecting alert situations? This article covers some of the challenges and solutions for supporting the consumption of massive machine data sets that use big data technology and Hadoop.

Big Data has big problems

Writing in the Financial Times, Tim Harford (The Undercover Economist Strikes Back, Adapt, etc) offers a nuanced, but ultimately damning critique of Big Data and its promises.

21 Thought-Leader Professors in Data Science

The field of data science continues to grow, and with it come thought leaders who contribute to the industry through outreach and education. Many of the data s…

Lies, Damned Lies...: Building your own web analytics system using Big Data tools

It’s been a busy couple of years here at Microsoft. For the dwindling few of you who are keeping track, at the beginning of 2012 I took a new job, running our “Big Data” platform for Microsoft’s Online Services Division (OSD) – the division that owns the Bing search engine and MSN, as well as our global advertising business. As you might expect, Bing and MSN throw off quite a lot of data – around 70 terabytes a day (that’s over 25 petabytes a year, to save you the trouble of calculating it yourself). To process, store and analyze this data, we rely on a distributed data infrastructure spread across tens of thousands of servers. It’s a pretty serious undertaking; but at its heart, the work we do is just a very large-scale version of what I’ve been doing for the past thirteen years: web analytics. One of the things that makes my job so interesting, however, is that although many of the data problems we have to solve are familiar – defining events, providing a stable ID, sessionization, enabling analysis of non-additive measures, for example – the scale of our data (and the demands of our internal users) has meant...