Big Data Analytic...
The latest in what you need to know to handle and explore big data.
Curated by Dahl Winters
Scooped by Dahl Winters

eBay’s new Pulsar framework will analyze your data in real time

eBay’s new Pulsar framework will analyze your data in real time | Big Data Analytics and Science | Scoop.it
eBay has a new open-source, real-time analytics and stream-processing framework called Pulsar that the company claims is in production and is available for others to download, according to an eBay blog post on Monday. The online auction site is now using Pulsar to gather and process all the data pertaining to user interactions and their…

BlazeGraph - Open-Source Scalable Graph Database

SYSTAP is very pleased to launch its new graph database platform Blazegraph™. It is built on the same open source GPLv2 platform and maintains 100% binary and API compatibility with Bigdata®. Blazegraph™ will take over as SYSTAP’s flagship graph database. It is specifically designed to support big graphs, offering both Semantic Web (RDF/SPARQL) and Graph Database (TinkerPop, Blueprints, vertex-centric) APIs. It features robust, scalable, fault-tolerant, enterprise-class storage and query, plus high availability with online backup, failover, and self-healing. It is in production use with enterprises such as Autodesk, EMC, Yahoo7!, and many others. Blazegraph™ provides both embedded and standalone modes of operation.

Blazegraph has a High Availability and Scale Out architecture. It provides robust support for Semantic Web (RDF/SPARQL) and Property Graph (TinkerPop) APIs. Blazegraph is highly scalable and can handle 50 billion edges on a single node.

Data Science with Apache Hadoop: Predicting Airline Delays

In this multi-part blog post, we will demonstrate Machine Learning techniques using existing modeling tools on Apache Hadoop. Part 1 uses Pig and Python.

Inside the Apache Software Foundation's newest Top-Level Project: Apache Flink

Flink contributors talk Big Data processing, open-source community and the future of the newly minted TLP

 

Flink is an open-source Big Data system that fuses processing and analysis of both batch and streaming data. The data-processing engine, which offers APIs in Java and Scala as well as specialized APIs for graph processing, is presented as an alternative to Hadoop’s MapReduce component with its own runtime. Yet the system still provides access to Hadoop’s distributed file system and YARN resource manager.

FAIR open sources deep-learning modules for Torch

The modules are significantly faster than the default ones in Torch and have accelerated research projects by allowing users to train larger neural nets in less time.

Big Data vs. Cancer: Algorithm Identifies Genetic Changes across Cancers

Using a computer algorithm that can sift through mounds of genetic data, researchers from Brown University have identified several networks of genes that, when hit by a mutation, could play a role in the development of multiple types of cancer.

IBM detects skin cancer more quickly with visual machine learning - Computerworld

Skin cancer can be detected more quickly and accurately by using cognitive computing-based visual analytics, researchers at IBM Research have found, in collaboration with New York's Memorial Sloan Kettering Cancer Center.
Rescooped by Dahl Winters from Docker

A Docker Image for Graph Analytics on Neo4j with Apache Spark GraphX

This docker image is a great addition to Neo4j if you're looking to do easy PageRank or community detection on your graph data. Additionally, the results of the graph analysis are applied back to Neo4j.

Via Docker
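
For readers unfamiliar with the PageRank analysis mentioned in this item, here is a minimal sketch of the idea. It is plain Python power iteration over a tiny made-up edge list, not the Docker image's Spark GraphX job and not Neo4j's API; the function name `pagerank`, the damping factor of 0.85, and the sample edges are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (source, target) edges."""
    nodes = sorted({n for edge in edges for n in edge})
    out_links = {n: [] for n in nodes}
    for src, dst in edges:
        out_links[src].append(dst)

    # Start from a uniform distribution over all nodes.
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing links is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out_links[n])
        base = (1.0 - damping + damping * dangling) / len(nodes)
        new_rank = {n: base for n in nodes}
        # Each node passes a damped, equal share of its rank to its targets.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += damping * rank[src] / len(out_links[src])
        rank = new_rank
    return rank


# Hypothetical sample graph: a -> b -> c -> a, plus d -> a.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("d", "a")]
ranks = pagerank(edges)
for node, score in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

In a setup like the one the item describes, scores of this kind would be computed by the analytics engine and written back to the graph database as node properties; the sketch only shows the ranking step.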
Rescooped by Dahl Winters from Amazing Science

Discovering the Undiscovered - DOE Joint Genome Institute

Advancing New Tools to Fill in the Microbial Tree of Life

To paraphrase a famous passage from Coleridge’s “The Rime of the Ancient Mariner”: microbes, microbes everywhere, though most we do not know. This is changing, though.

Via Dr. Stefan Gruenwald
Dahl Winters's insight:

A big use of big data - exploring the genomes of life on Earth.  One of the biggest data sets in the world is the one we carry around with us and on us every day.

National Data Science Bowl

The National Data Science Bowl is a first-of-its-kind competition asking data scientists to use their skills and big data for social good. Learn more: datasciencebowl.com.

You Need an Algorithm, Not a Data Scientist - blogs.hbr.org (blog)

Manually crunching numbers isn’t scalable.
Dahl Winters's insight:

It might actually take a data scientist to help inform the algorithms, but once a few algorithms are developed they could do a world of good.  It's not a matter of human vs. machine; it's a matter of using one to complement the other.

Facebook shows off its deep learning skills with DeepFace

A Facebook research paper details a new method for recognizing the people in images by combining deep learning techniques with a method for recomposing angled images as straight-on ones.

How Deep Learning Analytics Mimic the Mind

Many advances in analytics and machine learning have been based on our understanding of how the brain works.   Deep learning is no exception — it takes its inspiration from our understanding of the cortex in the brain.  

Docker and Mesos: Like peanut butter and jelly

You want to run Docker containers, but how do you do so at hyper scale? Apache Mesos may be the answer. Matt Asay explains.

Great list of resources: data science, visualization, machine learning, big data

Fantastic resource created by Andrea Motosi. I've only included the 5 categories that are the most relevant to our audience, though it has 31 categories total,…

Lockheed Martin Releases Open-Source GUI for Real-Time Apache Storm Data Processing

StreamFlow™ is a stream processing tool designed to rapidly build and monitor processing workflows. The ultimate goal of StreamFlow is to make working with stream processing frameworks such as Apache Storm easier and faster, with enterprise-like management functionality.

StreamFlow provides a graphical user interface for non-developers such as data scientists, analysts, or operational users to rapidly build scalable data flows and analytics.

Image Classification with Convolutional Neural Networks – my attempt at the NDSB Kaggle Competition

On December 15th, Kaggle started the National Data Science Bowl competition (which runs until the end of March 2015). The competition consists of classifying images of ocean plankton into 121 different classes, with a supplied training set of around 30,000 labeled images and a test set of 130,000 images for which you have to provide the classification. The images are black and white and come in different sizes and shapes, with widths and heights ranging roughly from 30 pixels to over 200 pixels. This is a real-world problem to tackle, and the leaderboard lets you track your progress and see how you compare to others.

Dahl Winters's insight:

A good overview of getting started fast with deep learning on a real-world problem.

The Dark Corners of Our DNA Hold Clues about Disease

A “deep-learning” algorithm shines a light on mutations in once obscure areas of the genome

How Big Data, Business Intelligence and Analytics Are Fueling Mobile Application Development

You can’t separate successful mobile application development from either data or analytics.
Rescooped by Dahl Winters from Frontiers of Journalism

Legislative Explorer - data patterns of lawmaking

Interactive visualization that allows anyone to explore actual patterns of lawmaking in Congress

Via M. Edward (Ed) Borasky

Big data scours public records to predict crime

New software allowing for predictive policing may be coming to a police department near you. Beware, made by telecommunications company Intrado, searches billions of records to find and predict...
Dahl Winters's insight:

Only 6 minutes about an innovative use of big data.

Big Data Technologies - deep learning and more

New technologies in big data - deep learning, image processing, machine learning, and more.

 

The picture describes an open-sourced deep learning algorithm (on GitHub) that translates images into logical, descriptive sentences.  For more on deep learning and other cutting-edge big data topics, come visit this related Scoop.it site.

Hadoop successor sparks a data analysis evolution

If 2014 was the year that Apache Hadoop sparked the big data revolution, 2015 may be the year that Apache Spark supplants Hadoop with its superior capabilities for richer and more timely analysis.

Can MapReduce solve planning problems?

Some solvers for planning or optimization problems tend to scale out poorly: as the problem gains more variables and more constraints, they use a lot more RAM and CPU power. They can hit hardware memory limits at a few thousand variables and a few million constraint matches. One way their users typically work around such hardware limits is to use MapReduce.
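
The MapReduce workaround described in this item can be sketched roughly as follows: split the problem into partitions small enough to fit in memory, solve each one independently (the map step), then merge the partial plans (the reduce step). This is a toy illustration in plain Python, not the linked article's solver or Hadoop's API. The job list, the two-machines-per-region setup, the greedy longest-job-first heuristic, and the names `partition`, `solve_partition`, and `merge` are all assumptions.

```python
from collections import defaultdict
from functools import reduce


def partition(jobs, key):
    """Split the full problem into independent sub-problems by a partition key."""
    groups = defaultdict(list)
    for job in jobs:
        groups[key(job)].append(job)
    return groups


def solve_partition(item):
    """Map step: greedily schedule one region's jobs on two machines, longest job first."""
    region, jobs = item
    loads = [0, 0]
    plan = {}
    for name, _, duration in sorted(jobs, key=lambda j: -j[2]):
        machine = loads.index(min(loads))  # pick the least-loaded machine
        plan[name] = (region, machine)
        loads[machine] += duration
    return plan, max(loads)


def merge(left, right):
    """Reduce step: combine two partial plans; the makespan is the worst partition's."""
    return {**left[0], **right[0]}, max(left[1], right[1])


# Hypothetical jobs as (name, region, duration) tuples.
jobs = [
    ("j1", "east", 4),
    ("j2", "east", 3),
    ("j3", "west", 5),
    ("j4", "west", 2),
    ("j5", "east", 2),
]
partial_solutions = map(solve_partition, partition(jobs, key=lambda j: j[1]).items())
plan, makespan = reduce(merge, partial_solutions)
print(plan, makespan)
```

Each partition is solved without any view of the others, which is what keeps memory use bounded. The cost is that the merged plan cannot exploit trade-offs across partitions, so it may be worse than what a single solver would find on the whole problem.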

How Big Data Analytics is Aiding Search for Flight 370

As the hours and days go by following the sudden and mysterious disappearance of Malaysia Airlines Flight 370 somewhere in Southeast Asia, more people and organizations are joining the search party. And they are using every tool at their disposal -- not the least of which is big data analytics -- to try and locate the Boeing 777, which carried 239 people and has thousands of family members and friends heartsick.

M. Edward (Ed) Borasky's comment, March 16, 2014 11:04 PM
How Vendors Are Newsjacking an Act of Piracy ;-)
Ken Byrne's curator insight, March 17, 2014 12:42 PM

Can data analytics help in the search--try Tomnod and see