"Our greatest responsibility is to be good ancestors."

-Jonas Salk

Wednesday, May 6, 2015

Why I am not a Paid Scientist

Why I quit being what I called a "senior junior" scientist. Specifically, samples of the sheer awfulness of the job of the staff scientist in a high performance software capacity appear at:


This documents some of my struggles trying to get some results out of climate models. The intellectual aspect is essentially nil - neither the scientist nor the programmer impulse is satisfied in any way by this nonsense. Basically I am dealing with a poorly documented code on a poorly documented platform with obscure error messages. The work amounts to spawning distributed memory jobs on a unique collection of processors, with what amounts to a homebrew filesystem and a homebrew operating system which changes under your feet.

Once I had compiled code and tried to make a small change. It failed. I undid the change. IT STILL FAILED. I had a copy of the old executable and the old source. The old source and current source matched. The old executable and the current executable did not.

All of this is perfectly orthogonal to whether climate models are good models. But (if NCAR is any example) they are lousy software in terms of results per unit human effort.

My belief is that atmosphere models are excellent models for many purposes, ocean models are excellent models for some purposes, and coupled models are perhaps less fully tested than one might like but are absolutely necessary for our current state of knowledge.

But the experience of working with them is hellish, and being a single person trying to get them to do anything outside the environment in which they were developed is a lonely and demoralizing task. Frankly it (along with some other stresses outside the lab to be sure) almost killed me, and I am glad I am out of there and somewhat recovered.

Kate a.k.a. ClimateSight refers to this problem in a recent and interesting posting and has some comparable kvetching here; I promise you getting this stuff running at scale on an unsupported non-NCAR supercomputer facility is much harder than getting it to do a few timesteps on a commercial box.

Being a tenure-track scientist is tough, but settling for a staff scientist career (which I thought would be less stressful) really didn't turn out to be any picnic. There is much to say about this, some of which projects onto climateball obsessions and much of which doesn't. In general, climate folk don't complain much, partly for fear of giving aid and comfort to the enemy, and partly because there's a ludicrous geek cred in solving absurd and arbitrary problems - it doesn't do to whine too much.

But the problem is among those endemic to computational science. Brown, Knepley and Smith have a very cogent article on the subject, "Run-Time Extensibility and Librarization of Simulation Software", whose abstract reads:
Build-time configuration and environment assumptions are hampering progress and usability in scientific software. This situation, which would be utterly unacceptable in nonscientific software, somehow passes for the norm in scientific packages. The scientific software community needs reusable, easy-to-use software packages that are flexible enough to accommodate next-generation simulation and analysis demands.
It is not only an important and excellent piece, but it also has quite an amusing opening section, though perhaps only to those who have suffered through the problem.

And that, friends, is the story of how I pissed away several years of my life.


Anonymous said...

I sympathise with what you're getting at here, and I think I've been very lucky by comparison. Apart from an early stint as an experimentalist/observer, most of my career has also involved a big chunk of high-performance computing. However, whether by luck or design (probably a combination of the two), I've avoided the frustrations of trying to extract information from extremely complex models, mainly by not allowing them to get too complex.

I've been fortunate to have been involved in an area where you can get away with trying to understand things at a reasonably phenomenological level, and in which you don't need to introduce too much complexity to do interesting things. Others have done so, but I've tended to simply move onto something a little different, since there are enough basic problems that still need to be solved that you don't need to simply add more complexity in order to get something worth presenting.

Michael Tobis said...

Hell is other people's code.


Jamie said...

Thank you for sharing your story. I'm sorry that the staff scientist position turned out to be a bust, and I'm glad you were able to find greener pastures.

I'm a climate modeler one deck below the one you used to occupy on the sinking ship that is publicly-funded U.S. science, and will likely jump on a life boat for the private sector once my postdoc is up. Despite the frustrations that accompany climate modeling (as a mere user, most of my irritation stems from porting to new machines and debugging build scripts every time my institution updates software), I've learned a ton about experimental design, parallel computing, processing large amounts of data, and other "techy" things since I took it upon myself to run some simulations with little more than 1 semester of FORTRAN, the model user manual, a basic familiarity with UNIX environment, and a few scripts from my graduate adviser.

I'd always envisioned myself as a research scientist rather than a professor (never was interested in teaching), but it is clear that permanent salaried slots at places like NCAR are being phased out in favor of temporary postdoc/visiting scientist gigs, and soft money research staff positions. As much as I dislike the idea of working for the private sector, I'd rather spend the rest of my life wondering if my job will be axed without the additional burden of fighting over the dwindling supply of federal research dollars.

Kate said...

I found CESM (the NCAR flagship model) to be the most difficult to port. In fact I never got it running. Since then I have successfully ported several other models, and done little bits of development on UVic which was already ported for me. This involved a lot of swearing at compilers but at least no dead ends; I always got things working after a few days.

I find it really fun...but in small doses. Ask me again in a few months (I've just begun merging two different branches of an ocean model) and I'll probably be ready to go back to reading papers all day!

Michael Tobis said...

Thanks, Kate. I had wondered if NCAR stuff was particularly awful in this regard; it was mandated in my case and in this position I never looked at anything outside the NCAR suite and the TACC computers and a couple of compilers, all three parts of which kept changing.

Also as far as I know there is a beautiful interpolation routine hiding under NCAR's Ferret but nobody can get it out of its ferrety cage. I know of no alternative software to regrid atmosphere or ocean data to new axis systems. It is accessed only via a crude special purpose interpreter that is subtly broken. So I lost some time with that bit of backwardness too.

steven said...

"Also as far as I know there is a beautiful interpolation routine hiding under NCAR's Ferret but nobody can get it out of its ferrety cage. I know of no alternative software to regrid atmosphere or ocean data to new axis systems. It accessed only via a crude special purpose interpreter that is subtly broken. So I lost some time with that bit of backwardness too."

Hmm, what is your from and to? I've done some work, know some guys... etc

I started my life in OPC and for the longest time have stayed in OPC. OPC is hell, but I enjoy hell.
after 3 months even my own code seems like someone else wrote it, because most of it is written in a trance of sorts

I do all my work in R, so nothing as big as a GCM anymore, still there is a certain joy in walking through OPC seeing how someone else's mind works and bending your own to that style or approach.

hmm come to think of it, i've worked with mcintyre code, willis code, romanM code, tamino code, stokes code, robert ways code, and a bunch more in my work with R packages.
sometimes I want to scream.. but in the end I treat it as a meditation of sorts, a puzzle..

Still, I can empathize.. hm long time ago there was TTS (text to speech) stuff that I could never unravel, hours and hours with nothing to show for it.. went back to marketing.. haha.

Michael Tobis said...

At a relevant thread at Stoat I wrote

In email, Steve Easterbrook argued that the elephant in the room is a lack of funding and career path for people who are professional software engineers in science.

True indeed.

Leaving aside questions of my own career, some of the tenure/promotion decisions I’ve heard of for the most productive scientist/coders have been inexcusable.

I agree with dhogaza, that this doesn’t necessarily impact the validity of the results, but to say it doesn’t affect their credibility seems an overstatement.

It also affects recruiting and retention. Working in this sludgy software environment appeals neither to the engineer nor the scientist part of the potential contributor’s ambitions.

Engineering has moved on, and despite its importance much scientific software is a computational methods backwater and in some corners even a backwater in ordinary software competence. Climate science is a poster child for this problem.

I’m not surprised to find UAH code is a hideous mess. The community climate models at least have a lot of eyeballs on them, albeit focused on the model fidelity more than its usability.

But it’s the nature of software – one-offs from labs are likely to be broken, as defensive coding practices are absent from the social milieu. The pressure is to publish something credible, not to publish something correct.

The smaller the user base the more likely the code is to be broken. Getting code right is expensive and there’s little institutional motivation for it. Investigators are motivated to minimize/trivialize problems. The fewer the eyes on the code, the more likely errors are to persist. (Corollary to Linus’s Law of Eyeballs: with few enough eyeballs, all bugs are deep.)

That Spencer knows what he is looking for only makes matters worse, of course, but in this he isn’t as much an outlier as we might want to believe. Everyone wants their own intuitions confirmed. Much of scientific method is to avoid fooling oneself. But we haven’t yet applied that in a systematic and serious way to complicated computations.

here http://scienceblogs.com/stoat/2015/04/26/now-we-know-why-uah-v6-is-so-late/#comment-53234

James Annan said...

"And that, friends, is the story of how I pissed away several years of my life."

Reminds me of George Best: "People ask me what I did with my money: I spent quite a lot of it on booze, women and fast cars; the rest I just frittered away." There are probably worse things you could have done!

I ducked out of the HPC thing a few years ago by aiming at the analysis of other people's model output: seems to have been a decent decision though I am no longer a paid scientist either!

So, what are you up to these days anyway?

EliRabett said...

FWIW, computer chemistry code seems, at least to an outsider, to be slightly better than this, if for no other reason than that some groups have commercialized it.

Of course, programming is sausage making.

Michael Tobis said...

"So, what are you up to these days anyway?"

Trying to figure out a better answer to that question than this one.