I wrote a two-page white paper for a DOE workshop on software productivity for extreme-scale science. In this paper, I coin a new term (at least I think it is new!): dark software. I explain this concept below:
Scientific discovery is the result not of individual simulations but of complex end-to-end research processes. These processes frequently involve, for example, the ingest and analysis of simulation, experimental, and observational data; the invocation of simulations within larger design optimization and uncertainty quantification activities; validation through comparison of experimental and simulation data; and the dissemination of output data to communities for analysis. The software created and used by extreme-scale scientists must address all such tasks, and the productivity of those scientists will be determined by the sum of the times taken for all tasks.
But while the software used to perform simulations on extreme-scale computers is often carefully engineered, the software used for other tasks in the end-to-end workflow is typically not. Indeed, it often involves ad hoc scripts, one-off programs, and other non-scalable, non-shareable, and error-prone components. Scientists may not even think of this code as software, even though it consumes much time and energy. Thus, by analogy with dark matter in physics (the stuff that, while invisible, is hypothesized to account for a large part of the total mass in the universe), I term this code dark software. I believe that dark software accounts for a substantial fraction of the total “mass” of an extreme-scale project, as measured both in lines of code developed by individual scientists and in the time spent with that code during a project’s lifetime. I suggest that a program to improve software productivity for extreme-scale science must address dark software if it is to be impactful. I discuss where dark software arises in research and propose a research program to address associated challenges.
Let me know what you think.
Details on the publication:
Foster, Ian (2014): Dark software: Addressing the productivity challenges of extreme-scale science on-ramps and off-ramps. figshare.