New book: Cloud Computing for Science and Engineering

I am excited to announce the availability of Cloud Computing for Science and Engineering, a new book written by Dennis Gannon and me and published by MIT Press. The full text is also available online, along with associated Jupyter notebooks and other supporting material.

Clouds operated by Amazon, Microsoft, Google, and others provide convenient on-demand access to storage and computing. They also provide powerful services for organizing data, processing data streams, machine learning, and many other tasks. Every scientist and engineer needs to understand what these services can and cannot do,

Read more ›

A new research network in New Zealand

I was fortunate enough to spend three days recently with a tremendously smart and vital group of New Zealand scientists at the launch of Te Punaha Matatini, a new “Center for Research Excellence” funded by the NZ government.

Te Punaha Matatini is Maori for “the meeting place of many faces”, and what a tremendous variety of faces I met. A few I already knew, like director Shaun Hendy, nanoscientist, econophysicist, and co-author of the wonderful Get off the Grass; climate economist Suzi Kerr, co-founder of Motu;

Read more ›

What is data worth? Let’s put a price on it

We see much discussion of late about the importance of not only preserving scientific data but also making it freely accessible. It is surely the case that much valuable data, produced at great cost, is subsequently discarded. But making data accessible incurs a cost (in some cases, a substantial one). Thus, any systematic program aimed at preserving more data requires some systematic process for deciding what data is worth keeping.

I suggest that we can usefully approach the question of data preservation from an economic perspective. Data costs money to collect or create, a cost that may depend on both the data d involved and the time t at which it is created: thus,
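To make the setup concrete, here is a minimal notational sketch (the symbols and functional forms are illustrative, not necessarily those developed in the full post): if dataset $d$ is created at time $t$, write its creation cost as $c(d, t)$. If $p(d)$ denotes the additional (possibly substantial) cost of preserving $d$ and making it accessible, then the total cost of retaining a set $D$ of datasets is

\[
\sum_{d \in D} \bigl( c(d, t_d) + p(d) \bigr),
\]

and the economic question becomes which choice of $D$ maximizes the value of the preserved data net of these costs.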

Read more ›

Economics of platforms: Implications for cyberinfrastructure

I recommend an interesting paper by Glen Weyl and Alexander White, “Let the Best ‘One’ Win: Policy Lessons from the New Economics of Platforms.”  The abstract summarizes the message:

The primary policy problem in platform markets is usually considered to be excessive lock-in to a potentially inefficient dominant platform. We argue that, once one accounts for sophisticated platform pricing strategies, such concerns are overblown. Instead the greater market failure is excessive fragmentation and insufficient participation. These problems, in turn, call for a very different policy response: aiding winners in taking all,

Read more ›

Parallel algorithms for the spectral transform method


I sometimes wonder about the relationship between my affection for a paper that I have written, the effort required to write it, and how much recognition the paper obtained. I think that the papers that took the most time are still the ones I like the best, even if sometimes they did not generate so much buzz in the community.

One paper that took a lot of time and of which I remain proud is Parallel Algorithms for the Spectral Transform Method, which Pat Worley and I published in the SIAM Journal on Scientific Computing in 1997.

Read more ›

Why not store personal health information in the cloud?

At the recent American Association for the Advancement of Science (AAAS) meeting in Chicago, my colleague Bob Grossman organized what was by all accounts a fascinating session on How Big Data Supports Biomedical Discovery. It being a Saturday, I had family duties. But I read with interest a synopsis of remarks made by speaker Lincoln Stein: “Legal and ethical issues with using commercial cloud vendors for cancer data. If Comcast buys Amazon, who owns data?” (As you can tell by the abbreviated style, this remark was communicated by Twitter.)

Lincoln published in 2010 The Case for Moving Genome Informatics to the Cloud,

Read more ›

Thoughts on dark software

I wrote a two-page white paper for a DOE workshop on software productivity for extreme-scale science. In this paper, I coin a new term (at least I think it is new!): dark software. I explain this concept below:

Scientific discovery is the result not of individual simulations but of complex end-to-end research processes. These processes frequently involve, for example, the ingest and analysis of simulation, experimental, and observational data; the invocation of simulations within larger design optimization and uncertainty quantification activities; validation through comparison of experimental and simulation data;

Read more ›

Micrometrics as a solution to software invisibility

Software is central to modern science. But software is also largely invisible, and in consequence it is undervalued, poorly understood, and subject to what appears to be underinvestment and to policy decisions that are not driven by data. We must do better if we want science to address the challenges faced by humankind in a time of massive scientific opportunity but limited resources. I argue here that micrometrics can help us do better.

Read more ›

The History of the Grid: Comments invited

Two years ago, Carl Kesselman and I published a rather lengthy paper that purports to recount the “history of the grid.” (I. Foster and C. Kesselman, The History of the Grid (PDF), in Cloud Computing and Big Data, IOS Press, Amsterdam, 2013; 37 pages, 176 references.)

We believe that this paper includes useful material. We also know that it can be much improved, and to that end we plan a second edition.

Read more ›