BigStuff
Scooped by Erwin J. van Eijk

Apache Spark Machine Learning Example code review

Spark Machine Learning code review from our Apache Spark with Scala training course available at ...

Big Data Processing with Apache Spark - Part 4: Spark Machine Learning

In this fourth installment of the Apache Spark article series, author Srini Penchikala discusses machine learning concepts and the Spark MLlib library for running predictive analytics, using a sample application.

Spark 2.0 prepares to catch fire

Apache Spark 2.0 is almost upon us. If you have an account on Databricks’ cloud offering, you can get access to a technical preview today; for the rest of us, it may be a week or two, but by Spark Summit next month, I expect Apache Spark 2.0 to be...

Apache Spark: 5 Pitfalls You MUST Solve Before Changing Your Architecture - DZone Big Data

The top 5 things you need to know before moving to Apache Spark.

Leveraging Big Data for Security Analytics

Advanced Cyber Security Project Metron Approved As Apache Incubator

Apache Spark -- Basic Cluster Configuration and Concepts

I am just barely setting up this Plus and Youtube account and would greatly appreciate if you would go Subscribe to my brand-new Youtube Channel at https://g...

Self-Learn Yourself Apache Spark in 21 Blogs - #1

We have received many requests from friends who are constantly reading our blogs to provide them a complete guide to sparkle in Apache Spark. So here we have c…

How Spotify Scales Apache Storm

Spotify has built several real-time pipelines using Apache Storm for use cases
Erwin J. van Eijk's insight:

Interesting.

RapidMiner Rides Apache Storm to Deliver Predictive Analytics - IT Business Edge (blog)

With more data moving into Apache Storm clusters, it makes sense to shift the locus of where predictive analytics is taking place.

Why we pulled Apache Storm from Production in Pursuit of Performance

Log management at scale isn’t easy. Here's why we pulled Apache Storm from Production months after launching in pursuit of even higher performance.

MapR declines Open Data Platform invitation - SD Times

Hortonworks and MapR spar over relevance of Hadoop initiatives and the role of the Open Data Platform in Big Data’s future
Erwin J. van Eijk's insight:

Right or wrong, I can't say, but it looks like the knives are out.

Announcing Apache Hadoop 2.6.0

It gives me great pleasure to announce that the Apache Hadoop community has released Apache Hadoop 2.6.0 !

Apache Kafka + Spark + Database = Real-Time Trinity - The New Stack

As technology fits into our lives and onto our wrists, demands increase for intelligent and real-time mobile applications. These applications need to deliver information and services that are relevant and immediate.

Could Concord topple Apache Spark from its big data throne?

A new stream-based data processing project threatens to upend Spark's reign before it begins in earnest.

Spark-TS: A New Library for Analyzing Time-Series Data with Apache Spark - Cloudera Engineering Blog

Time-series analysis is becoming mainstream across multiple data-rich industries.

How Spark fits into YARN framework

A detailed visualization of how Apache Spark's components fit into YARN's components.

Business Information Systems

This book contains the refereed proceedings of the 18th International Conference on Business Information Systems, BIS 2015, held in Poznań, Poland, in June 2015. The BIS conference series follows trends in academic and business research; thus, the theme of the BIS 2015 conference was “Making Big Data Smarter.” Big data is now a fairly mature concept, recognized and widely used by professionals in both research and industry. Together, they work on developing more adequate and efficient tools for data processing and analyzing, thus turning "big data" into "smart data." The 26 revised full papers were carefully reviewed and selected from 70 submissions. In addition, two invited papers are included in this book. They are grouped into sections on big and smart data, semantic technologies, content retrieval and filtering, business process management and mining, collaboration, enterprise architecture and business-IT alignment, specific BIS applications, and open data for BIS.
Erwin J. van Eijk's insight:

This book contains research from the University of Leipzig that compares Apache Flink and Apache Spark.

TL;DR: Flink is better for relational workloads, Spark for batch processing.

Health Informatics and Survival Prediction of Cancer with Apache Spark Machine Learning Library

In this article, the author treats survival prediction for colorectal cancer as a multi-class classification problem and shows how to solve it using Apache Spark's MLlib Java API.

Is Apache Spark Enterprise Ready? - Enterprise Apps Today

While Apache Spark could supplant Hadoop's MapReduce engine, it is not yet enterprise ready, some experts say.

Hadoop Security: Is it a Different Paradigm?

This guest blog post is from Srikanth Venkat, director of product management at Dataguise, a Hortonworks security partner.

World record set for 100 TB sort by open source and public cloud team - opensource.com

How Databricks set a new world record for sorting 100 terabytes (TB) of data, or 1 trillion 100-byte records, in 23 minutes with the open source software Apache Spark and public cloud infrastructure on EC2.

Hadoop successor sparks a data analysis evolution - Computerworld Australia

If 2014 was the year that Apache Hadoop sparked the big data revolution, 2015 may be the year that Apache Spark supplants Hadoop with its superior capabilities for richer and more timely analysis.

Spark standalone cluster tutorial by mbonaci

Install Apache Spark in Debian/Ubuntu with Maven. Tutorial. http://t.co/GPtXT0mRJ5