Enjoy IT - BigData, Fast Data and the fun of IT
Everything interesting about BigData and FastData
Curated by Guido Schmutz
Rescooped by Guido Schmutz from Social Network Analysis #sna

WEF16 Davos Twitter Performance Analysis

During the World Economic Forum in Davos 2016 (WEF16) we collected over 480 thousand Tweets with content relating to #wef


Via ukituki
Rescooped by Guido Schmutz from Big Data and NoSQL Daily

Replephant: Analyzing Hadoop Cluster Usage with Clojure - Michael G. Noll

Introducing Replephant: a Clojure library to perform interactive analysis of Hadoop cluster usage via REPL and to generate usage reports.

Via Sergeyan
Sergeyan's curator insight, September 25, 2013 4:19 AM

Analyzing Hadoop cluster usage helps you manage and operate the cluster properly. This article introduces Replephant, a tool for analyzing Hadoop cluster usage.

Rescooped by Guido Schmutz from Cloud & Big Data Platform

In-Stream Big Data Processing

The shortcomings and drawbacks of batch-oriented data processing were widely recognized by the Big Data community quite a long time ago. It became clear that real-time query processing and in-strea...


Via Steve Hyounggi Min
Scooped by Guido Schmutz

Hunk: Raw data to analytics in < 60 minutes

Summary of what we’ll do

1. Set up the environment
2. Configure Hunk
3. Analyze some data

Scooped by Guido Schmutz

Tutorial: Machine Learning on Big Data (SIGMOD 2013)

Below the fold, there is an embedded version of our slides. You can also download them as a PowerPoint or PDF file. If you have any questions or comments, please feel free to leave a comment below.
Rescooped by Guido Schmutz from pdg-technologies.com

Using Avro in MapReduce jobs with Hadoop, Pig, Hive - Michael G. Noll

Example MapReduce jobs in Java, Hadoop Streaming, Pig and Hive that read and/or write data in Avro format.

Via Kun Le
Scooped by Guido Schmutz

Two forthcoming R books

The first is Applied Predictive Modeling by Max Kuhn and Kjell Johnson. Max Kuhn is the author of the caret package, an extremely useful and powerful R package for fitting and optimizing all kinds of predictive models in R. It's available now on Amazon Kindle and will be published in hardcover by Springer in July.

The second is Dynamic Documents with R and knitr by Yihui Xie, the author of the knitr package. With knitr you can easily create beautiful documents and reports, with text, tables and figures all dynamically generated by R. It will also be available in July.

Scooped by Guido Schmutz

Top 3 R resources for beginners

The community team at Revolution Analytics has just updated this list of resources to learn about R on the Web. Included is this list of the top 3 resources for absolute beginners getting started with R:

 

Scooped by Guido Schmutz

List of Machine Learning APIs

Below is a compilation of APIs that have benefited from Machine Learning in one way or another. We truly are living in the future, so strap into your rocketship and prepare for blastoff.

Rescooped by Guido Schmutz from Cloud & Bigdata Watching

Hadoop 2.0 & Beyond: Reinventing Hadoop - Christian Prokopp

Last week Hortonworks announced the Stinger Initiative, which claims to make Hive 100 times faster by using and introducing new core Hadoop technologies. It could result in near-BigQuery-style performance for a wide range of Hadoop users. That would enable ad hoc and interactive querying of large datasets, something that requires sophisticated, large, traditional data warehouse setups. Hadoop clusters would instantly solve a highly significant use case in many companies and potentially become dual use.


Via Ian Sykes, Wonil Lee Ph.D.
Rescooped by Guido Schmutz from Scala & Cloud Playing

An example “lambda architecture” for real-time analysis of hashtags using Trident, Hadoop and Splout SQL

#Lambda #architecture for real-time analysis of hashtags using #Trident, #Hadoop and #Splout #SQL http://t.co/pjiJMyDj via @datasalt

Via Wonil Lee Ph.D.
Scooped by Guido Schmutz

Rise of the Mobile Machines - Oracle Reveals Strategy on Internet of Things (Oracle SOA Suite - Team Blog)

Blogs.Oracle.Com - Oracle SOA Suite - Team Blog (Oracle Device to Data Center #strategy revealed #oep #Java #fastdata #IoE #bigdata http://t.co/1dkglAWm5X)...
Rescooped by Guido Schmutz from Large-scale Incremental Processing

Stream Processing and Probabilistic Methods: Data at Scale

Stream processing and related abstractions have become all the rage following the rise of systems like Apache Kafka, Samza, and the Lambda architecture. Applying the idea of immutable, append-only ...

Via Jaeboo Jeong
Rescooped by Guido Schmutz from API Magazine

API Design: Do You Swagger, Blueprint or RAML?

I’m spending the next couple weeks going through each of the leading API design approaches: API Blueprint, RAML and Swagger....


Via Manfred Bortenschlager
Manfred Bortenschlager's curator insight, January 17, 2014 5:50 PM

The API Evangelist Kin Lane is doing some research into the differences between API description and documentation approaches: Swagger, Blueprint, and RAML.

Rescooped by Guido Schmutz from Cloud & Big Data Platform

C* Summit 2013: CMB: An Open Message Bus for the Cloud by Boris Wolf

The Comcast Silicon Valley Innovation Center has developed a general purpose message bus for the cloud. The service is API compatible with Amazon's SQS/SNS and

Via Steve Hyounggi Min
Rescooped by Guido Schmutz from Cloud & Big Data Platform

Analyzing Twitter Data with Apache Hadoop

Cloudera offers enterprises a powerful new data platform built on the popular Apache Hadoop open-source software package.

Via Steve Hyounggi Min
Rescooped by Guido Schmutz from Large-scale Incremental Processing

Summingbird: Streaming MapReduce at Twitter // Speaker Deck

Via Jaeboo Jeong
Andreas Petter's curator insight, January 20, 2014 2:38 PM

This is a tutorial-style slide set on Summingbird with some coding examples. Because the slides come without sound, familiarity with the basic projects and architecture of Summingbird is a great advantage.

Scooped by Guido Schmutz

Data Scientists vs. Data Engineers

More and more frequently we see organizations make the mistake of mixing and confusing team roles on a data science or "big data" project, resulting in an over-allocation of responsibilities to data scientists. For example, data scientists are often tasked with the role of data engineer, leading to a misallocation of human capital. Here the data scientist wastes precious time and energy finding, organizing, cleaning, sorting and moving data. The solution is adding data engineers, among others, to the data science team.

Scooped by Guido Schmutz

The arteries of the world, in Tweets

What happens when you plot billions of geotagged Tweets on a map? You can see the arteries of the world.
Scooped by Guido Schmutz

Free Datascience books

I've been impressed in recent months by the number and quality of free datascience/machine learning books available online. I don't mean free as in some guy paid for a PDF version of an O'Reilly book and then posted it online for others to use/steal, but I mean genuine published books with a free online version sanctioned by the publisher. That is, "the publisher has graciously agreed to allow a full, free version of my book to be available on this site."

Scooped by Guido Schmutz

London to create airport of the future with 'Internet of Things'

London City Airport wants to eradicate the many nuisances associated with flying. (This is indeed #bigdata @ibmbigdata http://t.co/rb8ydH6QeV)
Rescooped by Guido Schmutz from Cloud & Bigdata Watching

Apache Hadoop Now Next and Beyond

With the rise of Apache Hadoop, a next-generation enterprise data architecture is emerging that connects the systems powering business transactions and business

Via Wonil Lee Ph.D.
Scooped by Guido Schmutz

Society's next big challenge: infinite data

Over the past 10 years or so, many organizations have recognized the conceptual value of data and have started recording and retaining more and more of it. But after doing this for a while, they’re...
Scooped by Guido Schmutz

(Nathan Marz) Big Data Lambda Architecture

In order to meet the challenges of Big Data, you must rethink data systems from the ground up. You will discover that some of the most basic ways people manage data in traditional systems like the relational database management system (RDBMS) are too complex for Big Data systems. The simpler, alternative approach is a new paradigm for Big Data. In this article based on chapter 1, author Nathan Marz shows you this approach, which he has dubbed the “lambda architecture.”
