To many, the term “Big Data” refers to algorithms and software programs that help companies or researchers make discoveries and unearth trends by allowing them to visualize and analyze information better.
But it has another meaning too.
Big Data literally means big data: dizzying amounts of customer records, sound recordings, images, text messages, Facebook comments and technical information that have to be stored, retrieved and understood in their proper context to be any good to anyone.
You can’t have the first Big Data without the second. Information retrieval has, in fact, become one of the biggest challenges – and consequently one of the largest opportunities – in high tech.
“The problems we’re looking at aren’t computationally driven per se, but more information management problems,” Mark Dean, an IBM fellow and director of the Almaden Research Center, said back in 2008. “Computation is not the hard part anymore.”
Although carbon copies probably seem as ancient as clay tablets and scribes to many of you, the dominance of digital data is a fairly recent phenomenon. In 1993, only 3 percent of the world’s information was stored on digital devices like hard drives or optical disks, according to an article last year by Martin Hilbert and Priscila Lopez in Science. Audiocassettes and vinyl LPs actually played a larger role (6 percent) in archiving information that year.
In 2000, amid the first Internet boom, digital storage still accounted for only 25 percent of the world’s total information storage capacity. The year 2002 was the first in which the amount of data stored digitally surpassed the amount stored on paper, old-fashioned videotapes and other analog storage devices, Hilbert and Lopez noted.
But by 2007, DVDs, CDs, memory cards and other digital devices accounted for 94 percent of the information storage capacity around the globe. Hard drives alone accounted for 52 percent of the total, up from just 5 percent seven years earlier. In all, the world had 295 optimally compressed exabytes of storage space in 2007. Think of it: every e-mail or text message creates data files on multiple computers.
And that was five years ago. The total amount of digital information in the world will come to 2.7 zettabytes – that’s 2.7 sextillion bytes, or 27 followed by 20 zeros – in 2012, according to IDC, a 48 percent increase from 2011. Ninety percent of it will be unstructured data, such as digital video, sound files and images, that is challenging to search and retrieve.
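For readers who want to check the scale of those numbers, here is a rough back-of-the-envelope sketch in Python. It assumes decimal (SI) prefixes, so one zettabyte is 10^21 bytes. The variable names are illustrative, and the 2011 total and the unstructured share are derived from the quoted percentages, not reported directly by IDC.

```python
# Decimal (SI) prefix: 1 zettabyte = 10**21 bytes (assumed convention).
ZETTABYTE = 10**21

# IDC's 2012 figure quoted above: 2.7 ZB, kept as an exact integer.
total_2012_bytes = 27 * ZETTABYTE // 10

# Derived values (illustrative arithmetic only, not IDC-reported numbers).
# A 48 percent increase means 2012 = 1.48 x 2011.
implied_2011_bytes = total_2012_bytes * 100 // 148
# 90 percent of the 2012 total is described as unstructured.
unstructured_2012_bytes = total_2012_bytes * 90 // 100

print(f"2012 total:        {total_2012_bytes:,} bytes")
print(f"Implied 2011:      {implied_2011_bytes / ZETTABYTE:.2f} ZB")
print(f"Unstructured 2012: {unstructured_2012_bytes / ZETTABYTE:.2f} ZB")
```

Under these assumptions, the implied 2011 total is roughly 1.8 zettabytes, and the unstructured portion of the 2012 total is a little over 2.4 zettabytes.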
Via Chaturika Jayadewa