Big Data & Digital Marketing
Data analytics as the key to know your customers and offer them what they really want.
Curated by Luca Naso
Scooped by Luca Naso

5 things to know about Hadoop v. Apache Spark

Hadoop and Apache Spark are both big-data frameworks, but they don't really serve the same purposes.
Luca Naso's insight:

In my opinion, Spark should NOT be compared with Hadoop but with MapReduce. However, people usually compare Hadoop and Spark (probably because they are buzzwords).

 

5 things to keep in mind:

 

1. They do different things -

Hadoop is a distributed data infrastructure (HDFS),

Spark is a data-processing tool.

 

2. Hadoop is more complete -

Hadoop also includes a data-processing tool (MapReduce),

Spark does not have its own filesystem and needs to be integrated with one (HDFS, for example).

 

3. Spark is (much) faster -

MapReduce operates in steps, writing results back to the cluster between them;

Spark keeps the data in memory where it can and completes the analysis in essentially one pass.

 

4. Speed is not always what you need -

For batch processing you do not need Spark's high velocity;

Common applications for Spark are those requiring real-time analysis.

 

5. Failure recovery -

Both Hadoop and Spark are resilient to failures.
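
To make point 3 more concrete, here is a minimal sketch in plain Python (not real Hadoop or Spark code). The function names, the sample lines and the intermediate file `mapped.json` are all made up for illustration. The first function persists its intermediate result to disk between steps, the way a MapReduce job does in spirit; the second keeps everything in memory.

```python
import json
import os
import tempfile
from collections import Counter

# Hypothetical sample input, not taken from the article.
LINES = ["spark hadoop spark", "hadoop hdfs", "spark"]


def staged_word_count(lines, workdir):
    """MapReduce-like: each step writes to disk, the next step reads it back."""
    mapped_path = os.path.join(workdir, "mapped.json")
    # Step 1 (map): emit (word, 1) pairs and persist them.
    with open(mapped_path, "w") as f:
        json.dump([[word, 1] for line in lines for word in line.split()], f)
    # Step 2 (reduce): reload the intermediate file and sum per word.
    with open(mapped_path) as f:
        pairs = json.load(f)
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def in_memory_word_count(lines):
    """Spark-like in spirit: intermediate pairs never leave memory."""
    pairs = ((word, 1) for line in lines for word in line.split())
    counts = Counter()
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


with tempfile.TemporaryDirectory() as workdir:
    staged = staged_word_count(LINES, workdir)
in_memory = in_memory_word_count(LINES)
print(staged == in_memory)
```

Both functions return the same counts; the only difference is the round trip to disk, which is the cost that grows with every extra step in a real multi-stage job.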

Filipa Alves Fonseca's curator insight, April 18, 2016 3:23 PM
understanding big data
neutronupdate's comment, December 19, 2016 1:26 AM
Nice
Scooped by Luca Naso

Hadoop and the Internet of Things: Better together


 

The Internet of Things continues to grow more popular, and the network of devices connected to it gets bigger every day. Gartner has estimated that there will be 26 billion devices connected to the IoT within the next six years.

 
Luca Naso's insight:

Up to now the Internet of Things has mainly focused on the data generation part (sensors and devices).

 

It is now time for the Analytics side to take over. Here is where Hadoop can make the difference.

 

However, I expect IoT to boom with real-time analytics, and Hadoop may be of little use in that scenario.

Scooped by Luca Naso

Learn About Microsoft's Hadoop Implementation - Free eBook

A new, free eBook, Introducing Microsoft Azure HDInsight, covers Microsoft's implementation of Big Data through Hadoop compliance.
Luca Naso's insight:

120 pages for professional programmers on how Microsoft is using Hadoop to dive into Big Data:

1. Intro to Big Data

2. Intro to HDInsight

3. Programming with HDInsight

4. HDInsight and Data

5. Customisation

Scooped by Luca Naso

Hadoop's rise: Why you don't need petabytes for a big data opening


So there's not a huge percentage of enterprises in production yet but now the momentum is building, and a huge production wave is coming for Hadoop.

Luca Naso's insight:

At the moment, companies are nowhere near reaching their individual data frontiers. They only use 12% of the data they already have, because data are siloed and they have a portfolio of hundreds of applications!

 

Data science is very different from traditional analytics.

Traditional analytics is based on managers' theories and is a human-driven approach. On the data science side, it's very different. We don't need a big meeting. We don't need your hypotheses. We don't need your ideas. What we need is all the data you've got.

Scooped by Luca Naso

Microsoft's big data service available, after a year in preview

Windows Azure HDInsight Service lets customers spin up Hadoop clusters in the cloud
Luca Naso's insight:

Good news for all users of Microsoft software:


Standard Apache Hadoop is available as a service in Microsoft's Azure cloud, allowing users to deploy and shut down Hadoop clusters easily.


Integration with the Microsoft data platform means that one can access and analyze data with PowerPivot, Power View and other Microsoft BI tools, like Microsoft SQL Server Analysis Services (SSAS).

Scooped by Luca Naso

Big Data Analytics with Apache Hadoop: Advantages and Disadvantages


Apache Hadoop Big Data Analytics Tool and Technology: Definition, Advantages, Disadvantages

Luca Naso's insight:

Pros and cons of Apache Hadoop, one of the best-known and most widely used platforms for working with Big Data.

 

Pros:

1. Cheap

2. Fast

3. Scales to large amounts of big data storage

4. Scales to large amounts of big data computation

5. Flexible with types of big data

6. Flexible with programming languages


Cons:

1. Hard to set up

2. Hard to manage

3. Hard to keep alive

4. Hard to use

5. Is not secure

6. Is not optimized for your hardware

Scooped by Luca Naso

Big Data, Big Insights for Social Media with IBM | Business 2 Community

Big data was the buzzword of the day today during our second day of IBM training on analytics and, for my social media marketers, today had a BIG
Luca Naso's insight:

Think about all those Tweets and Facebook Status Updates about your brand. Think about all those consumers talking about unmet needs, dissatisfaction with their current products, and new features they’d love to see in the products they own.


How much do you hear of what your customers are saying about your products?


Start developing a plan on how to collect, store and analyse those data.

And then give them to your marketing experts. They will be very grateful to you ;)

Scooped by Luca Naso

The Perfect Big Data Combination: Hadoop and SAP HANA

With Big Data solutions spawning the requirement for applications focusing on Data Analytics, HANA’s capabilities are serving as the perfect partner for Hadoop. Read this article to know more about this perfect combination
Luca Naso's insight:

SAP HANA and Hadoop are very different, which is why they can complement each other well.

For example, SAP HANA is in-memory and uses a predefined schema, while Hadoop is on disk and imposes no schema on the data it stores.

 

A scalable column-oriented database management system for real-time analytics (SAP HANA) meets a platform that accepts any kind of data and can analyze massive amounts of it (Hadoop).
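
The schema difference can be illustrated with a small Python sketch. This is only an analogy under stated assumptions: SQLite stands in for a store with a predefined schema (it is not SAP HANA), a plain list of JSON strings stands in for raw files on Hadoop, and the `sales` table and sample records are invented for the example.

```python
import json
import sqlite3

# Predefined schema: the table layout is fixed before any data arrives.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sales (product TEXT NOT NULL, amount REAL NOT NULL)")
db.execute("INSERT INTO sales VALUES (?, ?)", ("tea", 4.5))
try:
    db.execute("INSERT INTO sales VALUES (?, ?)", ("coffee", None))
    rejected = False
except sqlite3.IntegrityError:
    rejected = True  # the record does not fit the schema, so it is refused

# No schema at storage time: raw records are kept as they are,
# and structure is applied only when the data is read.
raw_records = [
    '{"product": "tea", "amount": 4.5}',
    '{"product": "coffee", "note": "price missing"}',
    '{"tweet": "love this tea"}',
]
amounts = [rec["amount"] for rec in map(json.loads, raw_records) if "amount" in rec]

stored_rows = db.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(rejected, stored_rows, amounts)
```

The first store refuses data that does not match its layout but can answer structured queries immediately; the second accepts anything and leaves the interpretation to whoever reads it.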

Scooped by Luca Naso

Cloud computing is going to absorb your big data workloads, too

There has been a spate of product announcements and integrations over the past few weeks signaling that many big data workloads — including, and especially, Hadoop — will soon be ready to run reliably in the cloud.
Luca Naso's insight:

I cannot think of any other place for Big Data but the cloud!

Amazon and Microsoft are clear leaders at the moment, but there is a lot of movement and Oracle might well catch up.

My Hadoop clusters have always been running on the cloud and will always be (I am currently using HDInsight on Azure).

Scooped by Luca Naso

What is Hadoop? – Simplified!

Hadoop is a savior of this big data world. This article gives an introduction to this tool.
Luca Naso's insight:

This is an extremely basic description of Hadoop, and yet it introduces relevant concepts: parallelism and MapReduce.
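
For readers who want to see the two concepts side by side, here is a minimal sketch of the map, shuffle and reduce phases in plain Python. It is not Hadoop code: the event data, the function names and the use of a thread pool to stand in for cluster nodes are all assumptions made for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

# Hypothetical input, already split into chunks as a distributed filesystem would do.
CHUNKS = [
    ["click", "view", "click"],
    ["view", "purchase"],
    ["click", "view"],
]


def map_chunk(chunk):
    """Map: turn every record of one chunk into a (key, 1) pair."""
    return [(event, 1) for event in chunk]


def shuffle(mapped_chunks):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_group(item):
    """Reduce: collapse the values of one key into a single result."""
    key, values = item
    return key, sum(values)


with ThreadPoolExecutor(max_workers=3) as pool:
    # Parallelism: chunks are mapped independently, then keys are reduced independently.
    mapped = list(pool.map(map_chunk, CHUNKS))
    totals = dict(pool.map(reduce_group, shuffle(mapped).items()))

print(totals)
```

Because no chunk depends on another during the map phase, and no key depends on another during the reduce phase, the same work can be spread over as many machines as there are chunks or keys.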

Scooped by Luca Naso

Hadoop Innovation Summit


The Hadoop Innovation Summit features two days of engaging content from the most hands-on engineers & architects working with Hadoop. Check back regularly on the evolving and growing schedule here.

Luca Naso's insight:

By 2015, 65 percent of applications with advanced analytics will come embedded with Hadoop. There's never been a better time to unlock the power of your Big Data. 

The Hadoop Innovation Summit returns to San Diego at the Marriott Marquis & Marina, on February 19 & 20, 2014. 

Scooped by Luca Naso

Hadoop Alternatives: When Your Data Isn't as Big as You Thought

This post from Chris Stucchio's blog takes a critical look at the use of Hadoop and Big Data as buzzwords by asking an interesting question: What if your data isn't as big as you think?

Luca Naso's insight:

Hadoop works well with Big Data, but only when the data is really big.

Before diving into Hadoop, check whether you really need it: maybe your data isn't that big.

 

Here you can find solutions for datasets of a variety of sizes:

1. Hundreds of megabytes

2. Ten-ish gigabytes

3. A couple of terabytes

4. Five terabytes and larger

Scooped by Luca Naso

Big Data Field Report: Hadoop Summit 2013

Data Scientist Daniel D. Gutierrez's highlights from the 6th Hadoop Summit.
Luca Naso's insight:

If there was any doubt that the Apache Hadoop platform has captured the hearts and minds of big data believers everywhere, the recent Hadoop Summit in San Jose on June 26 and 27, 2013, may have settled the question once and for all.
