Big Data Blog: Using Lean Agile Methodologies for Planning & Implementing a Big Data Project @ "Data Informed Live!" on Dec. 10

In information technology, big data is a collection of data sets so large and complex that it becomes difficult to process them using on-hand database management tools. The difficulties include capture, curation, storage, search, sharing, analysis, and visualization. The trend toward larger data sets stems from the additional information derivable from analysis of a single large set of related data, compared with separate smaller sets of the same total volume, allowing correlations to be found that "spot business trends, determine quality of research, prevent diseases, link legal citations, combat crime, and determine real-time roadway traffic conditions."
Though a moving target, as of 2008 these limits were on the order of petabytes to exabytes of data. Scientists regularly encounter such limitations because large data sets arise in many areas, including meteorology, genomics, connectomics, complex physics simulations, and biological and environmental research. The limitations also affect Internet search, finance, and business informatics. Data sets grow in size partly because they are increasingly gathered by ubiquitous information-sensing mobile phones, aerial sensory technologies (remote sensing), software logs, cameras, microphones, radio-frequency identification readers, and wireless sensor networks. The world's technological per-capita capacity to store information has roughly doubled every 40 months since the 1980s; as of 2012, 2.5 quintillion (2.5×10^18) bytes of data were created every day.
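The "doubling every 40 months" figure implies steep compound growth. A quick back-of-the-envelope check (the start and end years below are illustrative assumptions, not stated in the article):

```python
# Back-of-the-envelope check of "per-capita storage capacity
# doubles every 40 months". Years are assumptions for illustration.
start_year, end_year = 1986, 2012
months = (end_year - start_year) * 12   # 312 months elapsed
doublings = months / 40                 # ~7.8 doublings
growth = 2 ** doublings                 # total growth factor
print(round(growth))                    # roughly a 200x-plus increase
```

At that rate, capacity grows by more than two orders of magnitude over a quarter century, which is why tooling that was adequate in one decade is overwhelmed in the next.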
Big data is difficult to work with using relational databases and desktop statistics and visualization packages, requiring instead "massively parallel software running on tens, hundreds, or even thousands of servers". What counts as big data varies with the capabilities of the organization handling the set. "For some organizations, facing hundreds of gigabytes of data for the first time may trigger a need to reconsider data management options. For others, it may take tens or hundreds of terabytes before data size becomes a significant consideration."
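The "massively parallel" approach the quote refers to typically follows a map/reduce pattern: partition the data, process each partition independently, then merge the partial results. A minimal single-machine sketch using Python's standard library (the sample chunks are hypothetical; real systems such as Hadoop or Spark apply the same idea across hundreds or thousands of servers):

```python
# Minimal sketch of the map/reduce pattern behind parallel big-data
# processing, scaled down to one machine with the stdlib only.
from collections import Counter
from multiprocessing import Pool

def map_count(chunk: str) -> Counter:
    """Map step: count words within one partition of the data."""
    return Counter(chunk.split())

def reduce_counts(counters) -> Counter:
    """Reduce step: merge per-partition counts into a global total."""
    total = Counter()
    for c in counters:
        total.update(c)
    return total

if __name__ == "__main__":
    # Stand-in for data partitioned across many machines.
    chunks = ["big data big", "data sets grow", "big sets"]
    with Pool(processes=2) as pool:
        partial = pool.map(map_count, chunks)   # map in parallel
    totals = reduce_counts(partial)             # merge results
    print(totals["big"])                        # 3
```

Because each map step touches only its own partition, adding servers (or processes) scales the map phase almost linearly, which is exactly what desktop tools and single-node relational databases cannot do.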