hi bigdata
big data project
Curated by JerryJung
Rescooped by JerryJung from Everything is related to everything else

MLeap: Providing (Near) Real-time Data Science with Apache Spark

How MLeap allowed us to scale our existing predictive platform from our local machines to Apache Spark in the cloud with zero loss of functionality and sub-second response times.
At Red Ventures, we partner with the nation’s top brands to seamlessly connect customers with the products and services they need most using our advanced digital marketing and sales capabilities. Along a customer’s journey, each interaction with Red Ventures presents an opportunity to make an influential decision: from the website creative they see to the time they spend waiting in a queue to speak to an agent. However, those decisions aren’t meaningful if they can’t use data and make a recommendation in real time. To that end, we have developed a machine learning platform that is constantly making decisions and is constantly learning to account for new data and trends.

Via Fernando Gil
Rescooped by JerryJung from Large-scale Incremental Processing

Introducing Chaperone: How Uber Engineering Audits Kafka End-to-End - Uber Engineering Blog

Uber Engineering explains why and how we built Chaperone, our in-house auditing system for monitoring Kafka pipeline health.

Via Jaeboo Jeong
Rescooped by JerryJung from Large-scale Incremental Processing

The current state of machine intelligence 3.0

Watching the appeal and applications of machine intelligence expand.
Via Jaeboo Jeong
Rescooped by JerryJung from Large-scale Incremental Processing

Why your Spark job is failing

Via Jaeboo Jeong
Rescooped by JerryJung from Cloud & Bigdata Watching

Something about Kafka - Why Kafka is so fast

This slide deck briefly explains why Kafka is so fast.

Via Wonil Lee Ph.D.
Rescooped by JerryJung from Cloud & Bigdata Watching

Data-focused Docker clustering - ClusterHQ

This post is intended to start a conversation about how Docker should handle data volumes for distributed applications. Today we publicly launched Flocker 0.1, an open-source volume and container manager for Docker. Flocker 0.1 is an early model of how we believe storage and networking could be handled in a distributed system. Support for portable …

Via Wonil Lee Ph.D.
Rescooped by JerryJung from Large-scale Incremental Processing

Databricks Spark Reference Applications

Reference Applications demonstrating Apache Spark - brought to you by Databricks.

Via Jaeboo Jeong
Rescooped by JerryJung from Cloud & Bigdata Watching

Hadoop Summit 2014: Building a Self-Service Hadoop Platform at Linked…

Hadoop comprises the core of LinkedIn’s data analytics infrastructure and runs a vast array of our data products, including People You May Know, Endorsements, …

Via Wonil Lee Ph.D.
Rescooped by JerryJung from Cloud & Bigdata Watching

Interactive Analytics in Human Time

Interactive Analytics in Human Time. Supreeth Rao, Sunil Gupta | June 4, 2014 | 2014 Hadoop Summit, San Jose, Cali…

Via Wonil Lee Ph.D.
Scooped by JerryJung

Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)

At Spotify we collect huge volumes of data for many purposes. Reporting to labels, powering our product features, and analyzing user growth are some of our mos…
Rescooped by JerryJung from All about Software Technology

Grafana - The Graphite dashboard frontend, editor and graph composer

A Graphite dashboard that aims to be a general-purpose dashboard that looks nice and makes it easy to construct and edit dashboards through the UI. It also contains an advanced graph editor and a Graphite target expression / function editor. Other notable features are fast client-side rendering, select-to-zoom, multiple y-axes, and graph templating.

Via Steve Hyounggi Min
Rescooped by JerryJung from Deep_In_Depth: Deep Learning, ML & DS

Data Cleansing with Apache Spark and Optimus

Outdated, inaccurate, or duplicated data won’t drive optimal data-driven solutions. When data is inaccurate, leads are harder to track and nurture, and insights may be flawed. The data on which you base your big data strategy must be accurate, up-to-date, as complete as possible, and free of duplicate entries. Clean data results in better decisions.

Via Eric Feuilleaubois
Rescooped by JerryJung from Large-scale Incremental Processing

Edge Intelligence for IoT with Apache MiNiFi - Hortonworks

MiNiFi is a subproject of NiFi designed to solve the difficulties of managing and transmitting data feeds to and from the source of origin, often the first

Via Jaeboo Jeong
Scooped by JerryJung

Data Lake vs. Data Warehouse: Is the warehouse going under the lake?

Understand the differences between a data lake and a data warehouse, and find out whether data lakes will replace data warehouses or the two will coexist.
Scooped by JerryJung

Kafka in Action: 7 Steps to Real-Time Streaming From RDBMS to Hadoop - DZone Big Data

Here is an in-depth example of using Flume with Kafka to stream real-time RDBMS data into a Hive table on HDFS.
Rescooped by JerryJung from Large-scale Incremental Processing

DataTorrent - Hadoop's Most Powerful Platform for Real-Time Stream Analytics

DataTorrent is Hadoop's Most Powerful Platform for Real-Time Stream Analytics

Via Jaeboo Jeong
Rescooped by JerryJung from JavaScript for Line of Business Applications

Data Visualization with JavaScript


If you’re developing web sites or web applications today, there’s a good chance you have data to communicate, and that data may be begging for a good visualization. But how do you know what kind of visualization is appropriate? And, even more importantly, how do you actually create one? Answers to those very questions are the core of this book. In the chapters that follow, we explore dozens of different visualizations and visualization techniques and tool kits. Each example discusses the appropriateness of the visualization (and suggests possible alternatives) and provides step-by-step instructions for including the visualization in your own web pages.


Contents:

Introduction
Implementation vs Design
Code vs. Styling
Simple vs. Complex
Reality vs. an Ideal World
Source Code for Examples
Acknowledgements
Graphing Data
Making Charts Interactive
Integrating Charts in a Page
Creating Specialized Graphs
Showing Timelines
Visualizing Geographic Data
Custom Visualizations with D3.js
Building Data-Driven Web Applications
Managing Data in the Browser



Via Jan Hesse
Rescooped by JerryJung from Cloud & Big Data Platform

Haeinsa Overview (HBase Transaction Library)

Haeinsa is a linearly scalable multi-row, multi-table transaction library for HBase. Haeinsa uses two-phase locking and optimistic concurrency control to implement transactions. The isolation leve...

Via Steve Hyounggi Min
Rescooped by JerryJung from Cloud & Bigdata Watching

Improving Hadoop Performance via Linux

Administering a Hadoop cluster isn't easy. Many Hadoop clusters suffer from Linux configuration problems that can negatively impact performance. With vast and …

Via Wonil Lee Ph.D.
Rescooped by JerryJung from Cloud & Bigdata Watching

Google adds a big data service and lots of monitoring to its cloud

Google rolled out a slew of new cloud services at I/O, including one called Dataflow that’s meant to put standard MapReduce to shame. It’s advertised as a much simpler way to build data pipelines that can handle both batch processing and streaming data.

Via Wonil Lee Ph.D.
Rescooped by JerryJung from Large-scale Incremental Processing

Classifying documents using Naive Bayes on Apache Spark / MLlib

In recent years, Apache Spark has gained in popularity as a faster alternative to Hadoop and it reached a major milestone last month by releasing the production ready version 1.0.0. It claims to be...

Via Jaeboo Jeong
Rescooped by JerryJung from Cloud & Bigdata Watching

Introducing R for Big Data with PivotalR | Pivotal P.O.V.

Pivotal releases PivotalR to run R in-database & in-Hadoop - currently supported by PostgreSQL, Greenplum, PivotalHD http://t.co/tFm7Ja22EK

Via Wonil Lee Ph.D.