Simulation Reproducibility with Python and Pweave

As the amount and complexity of model source code, configuration files, and resulting data for simulative experiments are ever increasing, it becomes a real challenge to reliably and efficiently reproduce simulation data and their analysis results publish



Kyeong Soo (Joseph) Kim

K. S. (Joseph) Kim, Department of Electrical and Electronic Engineering, Xi’an Jiaotong-Liverpool University, Suzhou, Jiangsu, People’s Republic of China

© Springer Nature Switzerland AG 2019. A. Virdis, M. Kirsche (eds.), Recent Advances in Network Simulation, EAI/Springer Innovations in Communication and Computing, https://doi.org/10.1007/978-3-030-12842-5_8

8.1 Introduction

Reproducible research is key to the scientific method [14]: it ensures that an experiment and the analysis of its results can be repeated with a high degree of agreement among researchers. In a practical sense, we can say that a study is reproducible when it satisfies the following minimum criteria [7]:

• All methods are fully reported.
• All data and files used for the analysis are (publicly) available.
• The process of analyzing raw data is well reported and preserved.

Reproducible research therefore ensures that, given the same data and analysis scripts, one can generate the same results and thereby reach the same conclusions. When the results of a study are not reproducible, however, its claims, whatever they are, cannot be accepted or used as a basis for further research.

Consider in this regard the Schön scandal [2], a notable example of unreproducible research involving data fraud. In 2001, Jan Hendrik Schön produced a series of high-profile research papers at the unusual pace of one paper every 8 days. In one of the papers published in Nature, which was later withdrawn, he claimed that he had produced a transistor on the molecular scale, i.e., a single-molecule transistor, which was regarded by many in the field as the holy grail of molecular computing. Soon after Schön published the work, however, several physicists alleged that there were anomalies (e.g., two experiments carried out at very different temperatures had identical noise) and duplicates in his data, which triggered a formal investigation by a committee set up by Bell Labs in 2002. The committee found evidence of Schön’s scientific misconduct in at least 16 of the 24 allegations considered [13]. The problem was that Schön had kept neither laboratory notebooks nor data for his groundbreaking experiments, and he was unable to reproduce the claimed results. This scandal clearly shows the importance of preserving experimental data and records even after publication, and the need for reproducibility in any scientific research.

The detection of cosmic gravitational waves reported by the Laser Interferometer Gravitational-Wave Observatory (LIGO) team, on the other hand, provides a good example of reproducible research [1]. Together with the paper, the team also published online an IPython [10] notebook and datasets so that readers can better understand the work, reproduce the results of the analysis, and draw the same inferences from them [15].

In many fields of science and technology, computer s