Big data changes the way we do science and replicate it
Physicist Alessandro Vespignani is one of the leading experts in networks and in statistical and numerical simulations. An Italian scientist, he currently works at the College of Computer and Information Sciences at Northeastern University, Boston, Massachusetts, USA. In this EuroScientist podcast interview, he shares his views on the need to rethink the concepts of replicability and reproducibility. (...) - Euroscientist, by Luca Tancredi Barone, 29 April 2015
Via Tree of Science
While the #bigdata revolution is ongoing, there are new challenges in data reproducibility #openscience #openresearch #opendata
One of the key aspects of science is reproducibility. For me that means: if a claim presented as science cannot be confirmed, through an independent study, by another team or person, then no scientific process is taking place; without reproducibility there is no science. In order to allow for reproducibility, some conditions are necessary: the full process must be described, and the data must be accessible, either by independent production where possible, or by giving full access to the data that was originally analysed. If any software is used, it must be fully accessible too, because it contains the details of the process. And we all know that the devil hides in the details.
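To make the "data must be accessible" condition concrete, here is a minimal sketch in Python of what publishing verifiable data alongside a paper can look like: a manifest recording a checksum for each data file plus the runtime environment, so an independent team can confirm it is analysing the same inputs. The file name `dataset.csv` is a hypothetical placeholder, not anything from the interview.

```python
import hashlib
import json
import platform
import sys
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(8192), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(data_files, out_path="manifest.json"):
    """Record data checksums and the runtime environment next to the data."""
    manifest = {
        "python_version": sys.version,
        "platform": platform.platform(),
        "data": {str(p): sha256_of(Path(p)) for p in data_files},
    }
    Path(out_path).write_text(json.dumps(manifest, indent=2))
    return manifest


if __name__ == "__main__":
    # "dataset.csv" is a hypothetical placeholder for the data actually analysed;
    # only files that exist are hashed, so the sketch runs as-is.
    files = [p for p in ["dataset.csv"] if Path(p).exists()]
    print(json.dumps(write_manifest(files), indent=2))
```

This is only one piece of the puzzle, of course: a checksum proves the data are the same, while the described process and the released software are what let another team rerun the analysis.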
Alessandro Vespignani rightly says: "the (scientific) validation happens when you have different teams that work at the same time on the same set of data to recover results." Hence the data must often be open, if not libre (in the sense of libre access), at least when their production or storage is difficult or expensive.
In short, in many (most?) cases, in 2015, in order to do science, the description of the results (i.e. the paper), the data, the software, and the comments by the community must all be open and shareable without barriers. Cost is one such barrier.