Today, many scientific fields can be described as data-intensive disciplines, which turn raw data into information and then into knowledge. If this sounds familiar, it is because it reflects the vision of a fourth research paradigm put forward by the late, influential computer scientist Jim Gray. Gray divided the evolution of science into four periods, or paradigms: a thousand years ago, science was experimental in nature; a few hundred years ago, it became theoretical; a few decades ago, it became computational; and today it is data driven. To cope with the resulting data deluge, about 1.2 zettabytes each year, researchers rely on e-science tools that enable collaboration, federation, analysis, and exploration. If 11 ounces of coffee equaled one gigabyte, a zettabyte would have the same volume as the Great Wall of China.
The volume of data is so large that journals such as The Journal of Neuroscience have stopped accepting supplementary files along with research manuscripts in order to keep the peer review process manageable. In response, some are building infrastructure and software set to radically transform scientific publishing, a practice that has changed little for centuries.