Large-scale Incremental Processing
Scooped by Jaeboo Jeong

The Question of Docker, The Future of OS Virtualization

In this article I'm going to take a look at Docker and OS Virtualization independently of each other. There's a reason, which will unfold as I dig through some data and provide this look into what i...

Best Practices for YARN Resource Management | MapR

In this blog post, I will discuss best practices for YARN resource management. The fundamental idea of MRv2 (YARN) is to split the two major functionalities, resource management and job scheduling/monitoring, into separate daemons. The idea is to have a global ResourceManager (RM) and a per-application ApplicationMaster (AM).

Colossus

Colossus IO Framework: Built at Tumblr

The inherent complexity of stream processing

Talk given to Storm NYC meetup group on 3/18/2015

XLDB2015: Accelerating Deep Learning at Facebook

Speaker: Keith Adams / Facebook

XLDB-2015 website: http://xldb.org/2015

Copyright 2015 Stanford University
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivs 3.0 Unported License. http://creativecommons.org/licenses/by-nc-nd/3.0/

Real-time analytics within the transaction - O'Reilly Radar

Data generation is growing exponentially, as is the demand for real-time analytics over fast input data. Traditional approaches to analyzing data in batch mode overcome the computational problems...

Large-scale Data Science and Machine Learning with Spark

[Full disclosure: I’m an advisor to Databricks.] At last year's Spark Summit in SF, Ali Ghodsi gave the first public demo of Databricks Cloud and Workspace. As I noted at the time, it was a showsto...

Graphs in the world: Modeling systems as networks - O'Reilly Radar

Get notified when our free report, “Mapping Big Data: A Data Driven Market Report” is available for download. Networks of all kinds drive the modern world. You can...

Why We Love Presto

Concurrent with acquiring Hadoop companies Hadapt and Revelytix last year, Teradata opened the Teradata Center for Hadoop in Boston. Teradata recently announced that a major new initiative of this Had

The Art of Incremental Stream Processing

Purely functional, elegant, correct, incremental and composable stream processing that is CPU and memory efficient. This is our (worthy) goal, but where do we start?

This problem space is being extensively explored across a variety of languages and libraries, each with subtly different trade-offs and not-so-subtly different APIs and terminology. However, these libraries share common goals, and most share common ancestry in Oleg Kiselyov's original Iteratee work or its free-monad-based derivatives.

This talk aims to build an intuition for stream processing in general by first establishing the core concepts and language of stream processing, and then grounding them by carefully examining the trade-offs and internals of several productionised implementations. Of particular interest are the pipes and conduit libraries from the Haskell community, and scalaz-stream from the Scala community.

Here’s the solution to the Uber and Airbnb problems — and no one will like it | Nick Grossman's Slow Hunch


HDFS vs. MapR FS – 3 Numbers for a Superior Architecture – Whiteboard Walkthrough | MapR

In this week's Whiteboard Walkthrough, Ted Dunning, Chief Application Architect at MapR, talks about the architectural differences between HDFS and MapR-FS that boil down to three numbers.

Configuring and Deploying Apache Spark

I gave this talk at the inaugural SF Spark and Friends Meetup group in San Francisco during the week of the Spark Summit this year. While researching this talk, I realized there is very little material out there giving an overview of the many rich options for deploying and configuring Apache Spark. There are some specific articles by vendors targeting YARN, DSE, etc., but I think what developers really want is a broad overview. So, this post will give you that, but you will have to look through the slides here to get to the meat of it.

What’s Next for Impala: More Reliability, Usability, and Performance at Even Greater Scale


Introduction to the Zeta Architecture #WhiteboardWalkthrough - YouTube

http://bit.ly/1GZQlPh – In this week's Whiteboard Walkthrough, Jim Scott, Director of Enterprise Strategy and Architecture at MapR, gives you an introduction...

Using Apache Spark DataFrames for Processing of Tabular Data | MapR

This post will help you get started using Apache Spark DataFrames with Scala on the MapR Sandbox.

How To Scan Salted Tables With Region Specific Key Ranges In MapReduce – FINRA Technology

From creating the Nasdaq – the world’s first electronic stock market – to revolutionizing big data solutions in the cloud, FINRA’s effectiveness as a regulator depends heavily on its technology.

Docker, and Spark, and Hadoop. Oh My. | BlueData

Seeing the power of Docker containers, we’ve doubled down on our vision of running Big Data in a flexible, automated, and elastic virtual environment. With BlueData, Hadoop and Spark can now run on Docker containers.