Much of what we believe about Big Data is wrong, as 2014 will demonstrate.
For many enterprises, Big Data remains a nebulous goal rather than a current reality. Yet it's a goal that more and more enterprises are pushing to the top of their priority list. As Gartner surveys have shown, everyone is keen to board the Big Data bandwagon, yet comparatively few really understand why. And as Gartner analyst Svetlana Sicular points out, the myths that hold back Big Data adoption vary depending on where along the adoption curve an enterprise happens to be. In 2014, many of the sillier Big Data myths will crumble, to be replaced by increased experience with data-driven applications...
As big data offers unprecedented awareness of phenomena — particularly of consumers' actions and attitudes — will we see much improvement on the predictions of previous-generation methods? Let's look at the evidence so far, in three areas where better prediction of consumer behavior would clearly be valuable.
As its name implies, MemSQL achieves its fast performance in part by keeping data in memory, but it doesn’t use memcached like Facebook does to keep its massive MySQL deployment up to speed. Rather, MemSQL takes a lesson learned from HipHop — Facebook’s tool for converting PHP code into faster C++ — and converts SQL to C++.
“This is like HipHop for SQL, essentially,” Frenkiel told me during a recent call. All told, he claims MemSQL performs up to 30 times faster than disk-based databases. “If you make money off your data and you actually measure time in microsecond or milliseconds,” he said, “then using a lightning fast DB like ours makes a lot of sense.”
For the company that invented MapReduce, Google didn’t have much of a presence in the commercial Big Data market until just last month (with the public release of BigQuery). While Yahoo! engineers took Google’s concept and spearheaded the open source Hadoop movement, Google was happy to quietly develop its own Big Data platform for its own internal use.
With the evolution from “Web Analytics” to “Digital Intelligence”, there is no doubt digital analysts should gradually shift from website-centricity and channel-specific tactics – expert as we are – to a more strategic, business-oriented and...
Richard Daley, one of the founders and chief strategy officer of analytics and business intelligence specialist Pentaho, believes that such a stack will begin to come together this year as consensus begins to develop around certain big data reference architectures--though the upper layers of the stack may have more proprietary elements than LAMP does.
In the ever-evolving world of enterprise IT, choice is generally considered a good thing – although having too many choices can create confusion and uncertainty. For those application owners, database administrators and IT directors who pine for the good old days when one could count the number of enterprise-class databases (DBs) on one or two hands, the relational-database-solves-all-our-data-management-requirements days are long gone.
CMSWire: State of Big Data 2013 + MapR, Wandisco, DataTorrent Help Hadoop Mature ... He confirmed what we already knew, that Hadoop has come a long way, but that it still has a ways to go. It's not pervasive in enterprises quite yet.
Google has launched ‘Free Zone’, a new mobile web service aimed at giving millions of people in the developing world access to the Internet (and Google's ads) via basic mobile phones without data charges.
Thanks to Jed for spotting this. Barry Devlin's blog offers a good insight into the emerging discussions around unified data management platforms that can scale from small to big data architectures. Barry's recent white paper on Data Zoos (linked from the article) expands on the points made. Both are worth a read.
When weighing the value of the data a company collects against the traditional value of the product it produces, collecting and analyzing broad categories of customer and product data is becoming as valuable as the product itself, if not more so.
Spark is an open source cluster computing framework that can outperform Hadoop by 30x by storing datasets in memory across jobs. Shark is a port of Apache Hive onto Spark, which provides a similar speedup for SQL queries, allowing interactive exploration of data in existing Hive warehouses.
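To make the "storing datasets in memory across jobs" idea concrete, here is a minimal plain-Python sketch. It is not Spark's API: the `Dataset` class, its `cache()` and `run_job()` methods, and the `load_from_disk` loader are all hypothetical names invented for illustration. It only shows why keeping a loaded dataset in memory saves repeated reads when several jobs touch the same data.

```python
from typing import Callable, Iterable, List, Optional


class Dataset:
    """Toy dataset that can optionally keep its records in memory between jobs.

    Hypothetical illustration only; this is not how Spark is implemented.
    """

    def __init__(self, loader: Callable[[], Iterable[int]]) -> None:
        self._loader = loader                      # stands in for an expensive disk read
        self._cached: Optional[List[int]] = None   # in-memory copy, if any
        self._use_cache = False
        self.loads = 0                             # how many times the loader actually ran

    def cache(self) -> "Dataset":
        """Ask for the records to be kept in memory after their first use."""
        self._use_cache = True
        return self

    def _records(self) -> List[int]:
        if self._cached is not None:
            return self._cached                    # served from memory, no reload
        self.loads += 1
        records = list(self._loader())
        if self._use_cache:
            self._cached = records
        return records

    def run_job(self, job: Callable[[List[int]], int]) -> int:
        """Run one job (any function over the full record list)."""
        return job(self._records())


def load_from_disk() -> Iterable[int]:
    # Stand-in for parsing a large file; the data here is made up.
    return range(1, 6)


uncached = Dataset(load_from_disk)
cached = Dataset(load_from_disk).cache()

# Run the same three jobs against each dataset.
for ds in (uncached, cached):
    ds.run_job(sum)
    ds.run_job(max)
    ds.run_job(len)

print(uncached.loads, cached.loads)  # prints "3 1"
```

The uncached dataset reloads its input for every job, while the cached one loads once and reuses the in-memory copy. Any real speedup depends on how expensive the load is relative to the job itself, so this sketch says nothing about the 30x figure quoted above.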