Notes for chapter 6: "Scientific knowledge in the digital age"

Scientific communication

The opportunities for improving scientific communication by integrating algorithms and data into our publications are illustrated by the examples collected on Explorable Explanations.

In his talk "Media for Thinking the Unthinkable", Bret Victor explains why this approach is powerful, and shows some additional examples.

An impressive example of explaining maths using advanced visualization techniques that rely on real-time computation is given in a presentation by Steven Wittens.

The Oriole Online Tutorials add a video narrative to a computational protocol, an idea that has a lot of potential for scientific communication.

For an example of computation-oriented publishing from data-intensive scientific research, see the discussion of the signal processing techniques used in the LIGO experiment to detect gravitational waves. It has been published as a Jupyter notebook and made executable online using the Binder tool. Clicking this link should take you to a Web page where you can explore the data interactively.

An overview of the technological ecosystem around Jupyter notebooks provides a snapshot of the tools available today.

The references mainly address the user interface aspects of publishing computational science. Most of the examples shown are relatively simple. Complex algorithms and models for complex systems pose the additional problem of representing complex digital knowledge in a form that is both precise enough for computers and manageable for human readers. My article "Scientific notations for the digital era" explains the need for a new approach to integrating digital knowledge into scientific communication and suggests strategies for implementing this approach.

Computerized proofs in mathematics

Mathematicians use computers to generate and verify proofs that are much too long for any human to verify. This raises issues very similar to those resulting from complex computations in science. Can we trust these proofs? What is the status of this kind of knowledge?

A famous example of such a proof, for an equally famous theorem, is discussed in the article "Formal proof: The four-color theorem". A more detailed description is available from the author's Web site, as is the source code of the proof in the Coq language.

A few articles on the subject illustrate the debate that is still going on:


Outside of scientific research, reproducibility is becoming a goal in software engineering for a different reason: ensuring that software can be trusted, i.e. that it is free of viruses and that it is not secretly spying on its users. This can only be ensured if the software's source code is publicly inspectable, and if the path from source code to executables can be verified. An initiative towards this goal deals with problems very similar to those encountered in reproducible computational research.
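
One common way to check the path from source code to executables is to build the same source twice, independently, and compare the results bit for bit. The following is a minimal sketch of that comparison step only, not a description of how any particular project does it. The function names and the file paths in the usage note are hypothetical and chosen for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large executables need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def builds_match(first: Path, second: Path) -> bool:
    """True if two independently built artifacts are bit-for-bit identical."""
    return sha256_of_file(first) == sha256_of_file(second)
```

With two builds of the same source stored at hypothetical paths such as build_a/tool and build_b/tool, calling builds_match(Path("build_a/tool"), Path("build_b/tool")) returns True only if the files are identical. A mismatch does not by itself indicate tampering: timestamps or file ordering embedded by the build process can also cause it, which is why making builds reproducible takes deliberate effort.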

Constructing and maintaining complete and reproducible system software installations is the goal of several Linux-related projects: NixOS, Guix, Reproducible Debian.