Even in the case of climate research, the main problems with climate science are almost all problems repeated across whole swaths of science.
There's an article at ZDNet by Andrew Jones describing some of the problems in "HPC", i.e., "high-performance computing", a.k.a. "big iron". It's fairly lightweight; it won't have anything new for people familiar with the field. But it may offer food for thought for outsiders.
In short:
The trick then must be to ensure the scientist code developer understands the methods of numerical software engineering, as well as its issues. Software engineers on the team must equally understand that the code is just part of the science, and not usually a goal in its own right.

(Ooh, I just noticed. He said "trick"!)
Some unusual features of climate codes:
- there are no obvious whole system tests
- the constituent equations are not known at the spatial scale of interest
- the long-term ensemble average is of more interest than the particular solution
So while the problems are the same across the sciences at the broad-brush level, at the implementation level completely different thinking needs to go into the code; the architecture needs to reflect the testing strategy and the testing strategy needs to reflect the science.
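To make that concrete: since only the long-run ensemble statistics are meaningful, a whole-system test has to compare climatological statistics against a reference envelope rather than demanding bitwise-identical output. Here is a minimal sketch, in Python, with a made-up diagnostic series standing in for real model output:

```python
# Minimal sketch of a statistical regression test for a climate-like code:
# instead of demanding bitwise-identical output, require that the long-run
# mean of a diagnostic stays within an accepted envelope of a reference run.
# The "global mean surface temperature" series below is a made-up placeholder.
import numpy as np

def climatology_ok(candidate_series, reference_series, n_spinup=120,
                   tolerance_sigmas=3.0):
    """Discard spin-up, then pass if the candidate's long-term mean lies
    within tolerance_sigmas standard errors of the reference mean."""
    cand = np.asarray(candidate_series)[n_spinup:]
    ref = np.asarray(reference_series)[n_spinup:]
    # Standard error of the reference mean (crudely assumes independent
    # samples; a real test would correct for autocorrelation).
    ref_sem = ref.std(ddof=1) / np.sqrt(len(ref))
    return abs(cand.mean() - ref.mean()) <= tolerance_sigmas * ref_sem

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    reference = 288.0 + 0.3 * rng.standard_normal(600)        # fake monthly GMST (K)
    candidate = reference + 0.005 * rng.standard_normal(600)  # e.g. a ported build
    print("climatology test passed:", climatology_ok(candidate, reference))
```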
In principle it's a beautiful, interdisciplinary problem. In practice, the relevant disciplines (computational science, software engineering, meteorology, oceanography and statistics) share little in the way of language or methods; even getting the closest pair (meteorology and oceanography) communicating was difficult.
Perhaps similar things can be said about other disciplines. I don't doubt that biological and biomedical applications have some uniqueness at the level of implementation detail such that a specialized branch of software engineering is required there too.
The existing pattern, in which software people are considered secondary, has two flaws: it is wrong in principle, and in practice it doesn't attract the best software people. Throw in the truly dreadful legacy of hundreds of subtly incompatible versions of Fortran and rapidly shifting platforms, and you have, well, a mess.
This doesn't mean that the work to date is anything short of a remarkable achievement. It may mean that we are reaching diminishing returns with existing practice.
Andrew Jones again:
However, we must also beware of the temptation to drive towards heavily engineered code throughout. Otherwise we run the risk that each piece of code gains a perceived value from historic investment that is hard to discard. And perhaps in some cases, what we need as much as renovation is to discard and restart.
13 comments:
I think the first two of your 'particulars' (no obvious test & unknown constituent eqns) are coupled. The obvious test the rest of us use is grid convergence; that's hard to do for climate models because of choices made in how some of the parameterizations were designed/implemented (discontinuities / look-up tables being difficulties that come immediately to mind). You're right, you've got to start design with the testing in mind.
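For what it's worth, the observed order of accuracy behind that kind of grid-convergence check is easy to compute; here's a minimal sketch in which a known error law stands in for an actual solver, so all the numbers are placeholders:

```python
# Minimal sketch of a grid-convergence (observed order of accuracy) check.
# A real check would run the solver at several grid spacings and measure the
# error against an exact or manufactured solution; here the errors follow a
# made-up second-order law so the script is self-contained.
import numpy as np

def observed_order(h, errors):
    """Fit log(error) = p*log(h) + c; the slope p is the observed order."""
    p, _ = np.polyfit(np.log(h), np.log(errors), 1)
    return p

if __name__ == "__main__":
    h = np.array([0.1, 0.05, 0.025])      # three grid spacings
    errors = 2.0 * h**2 + 1e-6            # fake errors from a second-order scheme
    p = observed_order(h, errors)
    print(f"observed order of accuracy: {p:.2f} (expect ~2 for a 2nd-order scheme)")
    assert abs(p - 2.0) < 0.2, "convergence rate is off; check the discretization"
```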
Agree. We need to write codes that are more mathematics-friendly. The disconnect between climate modeling labs and computational scientists bites hard on this point.
"Amateurish" may be a little harsh. You could have written "scienceish", unless you are aware of other scientific disciplines that do a much better job (with similar tasks, ie one-off codes under constant development specially designed for supercomputers, etc etc).
Not that I'm disagreeing with your suggestion that it may be possible to do a better job...in fact it seems like the people here have even abandoned any semblance of version control for their model, which has splintered into at least 3 versions used by competing factions...
"unknown constituent equations"
There is a literature on constitutive laws that provides scale-up from known continuum mechanics writ small. There is also an emerging literature on multi-scale modeling, with important successes in materials science & engineering and maybe elsewhere.
I don't see this as a software development problem at all. Just properly encapsulate so that alternate modules can be slipped in when the time comes.
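A minimal sketch of what that encapsulation might look like: a parameterization hidden behind a stable interface so an alternate (say, a scale-aware) scheme can be slipped in later. The class names and the toy physics here are hypothetical placeholders:

```python
# Minimal sketch of encapsulating a parameterization behind a stable interface,
# so an alternate scheme can be slipped in without touching the dynamical core.
# All class names and the toy physics are hypothetical placeholders.
from abc import ABC, abstractmethod

class ConvectionScheme(ABC):
    """Interface every convection parameterization must satisfy."""
    @abstractmethod
    def tendencies(self, temperature, humidity):
        """Return (dT/dt, dq/dt) for one column."""

class SimpleAdjustment(ConvectionScheme):
    def tendencies(self, temperature, humidity):
        # toy placeholder physics: relax humidity toward a fixed fraction
        return 0.0, -0.1 * (humidity - 0.8)

class MultiScaleScheme(ConvectionScheme):
    def tendencies(self, temperature, humidity):
        # stand-in for a future scale-aware scheme
        return 0.0, -0.05 * (humidity - 0.8)

def step_column(scheme: ConvectionScheme, T, q, dt):
    # The driver only knows the interface; swapping schemes is a one-line change.
    dT, dq = scheme.tendencies(T, q)
    return T + dt * dT, q + dt * dq

print(step_column(SimpleAdjustment(), 288.0, 0.9, 1800.0))
print(step_column(MultiScaleScheme(), 288.0, 0.9, 1800.0))
```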
I suggest a quick read of Languages, Levels, Libraries and Longevity, especially the first section on software 5,000 years from now. Looking backwards there, see the comments on McIlroy or Wheeler.
People have been complaining about this for the mere 40+ years I've been involved in computing, and I'm sure they were complaining about it before that.
We cannot turn all scientists into numerical analysts, statisticians and good software engineers. At Bell Labs, we had relatively vast resources (25,000 people in R&D), in all phases of R&D (R2-D2 @ Dot Earth).
*We* didn't try. We certainly expected different behavior from researchers than from switching-system software people in 200-person projects.
Some of what we did do is the sort of thing mentioned in the first URL. We always had some specialist numerical software folks writing general-use routines for our computer centers.
Too many scientists and engineers were writing their own statistics codes.
Bad idea.
Maybe we should have insisted on training everybody.
But, better idea:
John Chambers did S, ancestor of R.
But the *real* problem, at heart, is that of too many people *writing* too much code, rather than finding good code, reusing it, and sometimes contributing to the pool. The toolset for this is a *lot* better than it used to be, but for history, see Small is Beautiful (1977).
I posted this on Stoat but it probably belongs here. I think you're being a bit tough - yeah, it is scientists who don't care about elegance when writing code, but the stuff seems to work. Unfortunately I doubt I'll ever get a bloated Fortran climate model running in my new area of interest, i.e., GPU parallel programming via CUDA etc.
I'm not sure how much we're allowed to talk about the proprietary MO & Hadley models -- but in my experience (porting various Hadley models to Linux & Mac & Windoze) I'd have to agree with Stoat -- numerically sound but "engineeringly" a bit frightening. Just navigating the 500 huge files (the first 3000 lines being comments & variable declarations, followed by a few more thousand lines of code) was pretty horrifying. Having to dive into the D1 superarrays etc. to get diagnostics was frightening and often ulcer-inducing. And discovering all the "interesting" climates you could get just by compiler options, bit options etc. I basically did a Monte Carlo experiment on a standard model with all sorts of potential compiler options and could get widely different "climates" out of the same params.
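For the curious, the sweep was nothing fancier than something like this (a sketch only: the Makefile, the ./model executable, the flag sets, and the diagnostic file are all hypothetical placeholders, so by default it just prints the commands):

```python
# Sketch of a compiler-option sensitivity sweep: rebuild the same model under
# different flag sets, run it with identical inputs, and compare a summary
# diagnostic. Everything external (Makefile, ./model, the output file) is a
# hypothetical placeholder, so by default this only prints the commands.
import subprocess

FLAG_SETS = ["-O0", "-O2", "-O3 -ffast-math", "-O2 -fno-unsafe-math-optimizations"]
DRY_RUN = True  # set False only if the hypothetical build/run targets exist

def run(cmd):
    print("  $", " ".join(cmd))
    if not DRY_RUN:
        subprocess.run(cmd, check=True)

def build_and_run(flags):
    run(["make", "clean"])
    run(["make", f"FFLAGS={flags}"])
    run(["./model", "--namelist", "control.nml"])
    if DRY_RUN:
        return float("nan")
    with open("global_mean_temperature.txt") as f:   # hypothetical diagnostic
        return float(f.read())

if __name__ == "__main__":
    for flags in FLAG_SETS:
        print(f"flag set: {flags}")
        gmt = build_and_run(flags)
        print(f"  -> global mean temperature: {gmt}")
```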
I've been out of the loop on things for about a year (since last I looked at HadCM3 & HadGEM1) but the new Subversion-ish system the MO is now using seems to be a big improvement on the bizarrely antiquated "mods" from before. In an ideal world I would have loved to have been able to put or merge back the many mods I made in C & Fortran for cross-platform portability, but there was neither the time nor the money for that.
Oh yeah, the worst I found in the Had stuff was all the various ancillary scripts & executables you needed to run just to get things off the ground & running, i.e., set 100 environment variables, process ancil files, etc. It seemed that could be streamlined a bit. I haven't found anything analogous in my 20 years of commercial IT work; maybe if I had started on legacy systems from 1972 or something! ;-)
I'm afraid I haven't looked at any other GCMs, apart from quick looks at CCSM (US) & ECHAM5 (Germany), so I can't really compare it to anything else.
Supercomputers today mean massively parallel machines. Problems which lend themselves to such computing spend 99%+ of the time executing 1% of the code (Eli is allowed a very slight exaggeration), and it is THAT 1%, usually carefully engineered equation solvers and the like, which needs to be optimized. The rest is merely input and output handling, well within the reach of your average scientific programmer, whose real job is to
a. Formulate the problem
b. Put it into a form that makes best use of the parallel subroutines.
This was obvious even back in the late 80s when parallel computing first made its way into scientific computing (John will tell Eli that this happened in the 70s:)
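A toy illustration of that 99%/1% division of labor, with numpy's LAPACK-backed solver standing in for the carefully engineered kernel; the problem size and numbers are arbitrary:

```python
# Toy illustration of the 99%/1% point: the scientist-written part formulates
# the problem and handles output; the time goes into a tuned library solver
# (numpy/LAPACK standing in for the "carefully engineered" kernel).
import time
import numpy as np

def formulate_problem(n, seed=0):
    """Scientist's job (a): build a well-posed system. Sizes are arbitrary."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((n, n)) + n * np.eye(n)   # diagonally dominant
    b = rng.standard_normal(n)
    return A, b

def solve(A, b):
    """Scientist's job (b): hand the hot spot to an optimized solver."""
    return np.linalg.solve(A, b)                      # calls tuned LAPACK

if __name__ == "__main__":
    A, b = formulate_problem(2000)
    t0 = time.perf_counter()
    x = solve(A, b)
    print(f"solve took {time.perf_counter() - t0:.3f} s; "
          f"residual {np.linalg.norm(A @ x - b):.2e}")
```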
BTW, go read Steve Easterbrook's blog Serendipity for an expert view of what the situation really is.
Steve Easterbrook and I communicate frequently, and it is in his honor that I am trying to revive the conversation here.
I know another person who has applied formal software metrics to climate software and has come to the opposite conclusion.
Also, Stoat has now got experience on both sides of the fence and is starting to sound like me.
The key trouble is that the system being modeled is intrinsically tightly coupled. It is very difficult to identify appropriate integration tests when we don't have clear ideas of how the system OUGHT to behave.
If anybody is arguing that scientific coders aren't smart, it ain't me. But whole systems matter.
And huge swaths of computational science are being ignored, too...
Massively parallel computing goes back to the ILLIAC IV, mid-1970s or thereabouts.
However, the real expansion of parallel computing, in terms of more widespread usage (as compared to tiny handfuls of rare, expensive machines) indeed came in the late 1980s. The most successful example was the SGI Power Series, which had 2-8 MIPS R3000s, and (more importantly) parallelization directives and compilers that got to be good at parallelizing code, so that regular engineers and scientists could use moderate-cost machines and get decent speedups.
Carl C --- There is quite a bit of code with "ancillary scripts & executables" for startup and even shutdown. I'm not ready yet to try to explain why this always happens with some classes of (big) software.
This really is a topic for computing/coding types to hash out. Sorry but I want to share this:
I thought I would learn something useful last week going to a talk regarding prediction of Fraser River temperatures in 2080-2100 versus the last 20 years. The Fraser River basin occupies only three 3x4 degree grid cells in a particular GCM. The salmon in it are quite sensitive to temperatures above given thresholds. The idea was to predict, on average, how many days per migration season temperatures would exceed those thresholds.
The beauty of using the GCM to do this, rather than some specific hydrological model accounting for local effects, is apparently that having the GCM do all the work ensures conservation of the mass of water. They're trying to predict temperature change and daily variation based on air temperatures and water volumes. The main problems with this approach arise because there's almost no local information in the model. Under A1B, they predicted a 1-2 Celsius (closer to 1) change over the 100 years (even though there has been an increase of ~1.6 C in just the last 50 years).
I was greatly surprised to learn that there's no topography in the model. If it were just a matter of no distinction between south-facing and north-facing slopes, I probably wouldn't be surprised -- but no topography at all? The result is that all the water tended to come out too early (the hydrograph didn't reflect any snow remaining at high elevation into summer -- summer flows were driven by precipitation). To me this suggests that there is also no accounting for the melting alpine glaciers that ameliorate exceptionally warm river temperatures. So I was left with the impression that the model should overestimate temperature change in the summertime Fraser River. And yet, looking at the temperature change predictions, they look like underestimates.
I don't know how to fit this into a context of allowing more scientific experimentation etc. But it seems to me that local/regional climate predictions require local/regional context. Is this relevant to "is climate modeling still stuck?"
The most successful thing Eli has seen for local modeling is neural nets. Unfortunately they do not answer the question of why, merely what.
Michael Tobis said:
The key trouble is that the system being modeled is intrinsically tightly coupled. It is very difficult to identify appropriate integration tests when we don't have clear ideas of how the system OUGHT to behave.
If anybody is arguing that scientific coders aren't smart, it ain't me. But whole systems matter.
A snippet from a tangentially related field you might find interesting (emphasis added):
Unfortunately, as discussed in Ref. [32], when solving complex PDEs, a computational scientist finds it virtually impossible to decouple the distinct problems of mathematical correctness, algorithm correctness, and software-implementation correctness. For instance, algorithms often represent nonrigorous mappings of the mathematical model to the underlying discrete equations. Two examples of such mappings are (1) approximate factorization of difference operators, and (2) algorithms that are derived assuming high levels of smoothness of the dependent variables in the PDEs, when in reality the algorithms are applied to problems with little or no continuity of the derivatives of the variables. Whether such algorithms produce correct solutions to the PDEs cannot be assessed without executing the code on specific problems; the execution of the code is, in turn, coupled to the software implementation. One consequence of these couplings among mathematics, algorithms, and the software implementation is that the source of a numerical inaccuracy cannot be easily identified. These couplings also suggest that there is a greater overlap between PDE complexities, discrete mathematics, and SQE than some practitioners might prefer.
Verification and Validation Benchmarks
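One standard response to that entanglement in the V&V literature is code verification by the method of manufactured solutions: pick an exact solution, derive the source term that forces the PDE to satisfy it, and confirm the discretization converges at the expected rate. Here's a minimal sketch for a toy 1-D diffusion solver (not any particular climate code):

```python
# Minimal method-of-manufactured-solutions (MMS) sketch for a toy 1-D
# diffusion equation u_t = K*u_xx + s(x,t): manufacture an exact solution,
# derive the matching source term, and check the observed convergence rate.
import numpy as np

K = 0.5  # diffusivity for the toy problem

def exact(x, t):
    return np.sin(np.pi * x) * np.exp(-t)

def source(x, t):
    # s = u_t - K*u_xx for the manufactured solution above
    return (K * np.pi**2 - 1.0) * np.exp(-t) * np.sin(np.pi * x)

def solve(nx, t_end=0.1):
    x = np.linspace(0.0, 1.0, nx + 1)
    dx = x[1] - x[0]
    dt = 0.25 * dx**2 / K          # stable explicit step
    u = exact(x, 0.0)
    t = 0.0
    while t < t_end - 1e-12:
        step = min(dt, t_end - t)
        lap = np.zeros_like(u)
        lap[1:-1] = (u[2:] - 2.0 * u[1:-1] + u[:-2]) / dx**2
        u = u + step * (K * lap + source(x, t))
        u[0] = u[-1] = 0.0         # Dirichlet boundaries from the exact solution
        t += step
    return x, u, t

if __name__ == "__main__":
    errs = []
    for nx in (20, 40, 80):
        x, u, t = solve(nx)
        errs.append(np.max(np.abs(u - exact(x, t))))
    orders = np.log2(np.array(errs[:-1]) / np.array(errs[1:]))
    print("max errors:", ["%.2e" % e for e in errs])
    print("observed orders (expect ~2):", np.round(orders, 2))
```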