Computers Pose Barriers to Scientific Reproducibility

December 9, 2015

These days, it is hard to imagine performing scientific research without the help of computers. The thorough article “How Computers Broke Science—And What We Can Do to Fix It” details the problem that poses. Many of us learned in school that reliable scientific conclusions rest on a foundation of reproducibility. That is, if an experiment’s results can be reproduced by other scientists following the same steps, the results can be trusted. However, many of those steps are now hidden within researchers’ hard drives, making the test of reproducibility difficult or impossible to apply. Writer Ben Marwick points out:

“Stanford statisticians Jonathan Buckheit and David Donoho described this issue as early as 1995, when the personal computer was still a fairly new idea.

‘An article about computational science in a scientific publication is not the scholarship itself, it is merely advertising of the scholarship. The actual scholarship is the complete software development environment and the complete set of instructions which generated the figures.’

“They make a radical claim. It means all those private files on our personal computers, and the private analysis tasks we do as we work toward preparing for publication should be made public along with the journal article.

“This would be a huge change in the way scientists work. We’d need to prepare from the start for everything we do on the computer to eventually be made available for others to see. For many researchers, that’s an overwhelming thought. Victoria Stodden has found the biggest objection to sharing files is the time it takes to prepare them by writing documentation and cleaning them up. The second biggest concern is the risk of not receiving credit for the files if someone else uses them.”

So, do we give up on the test of reproducibility, or do we find a way to address those concerns? Well, this is the scientific community we’re talking about. There are already many researchers in several fields devising solutions. Poetically, those solutions tend to be software-based. For example, some are turning to executable scripts instead of the harder-to-record series of mouse clicks. There are also suggestions for standardized file formats and organizational structures. See the article for more details on these efforts.

A final caveat: Marwick notes that computers are not the only problem with reproducibility today. He also cites “poor experimental design, inappropriate statistical methods, a highly competitive research environment and the high value placed on novelty and publication in high-profile journals” as contributing factors. Now we know at least one issue is being addressed.

Cynthia Murrell, December 9, 2015

