"Our greatest responsibility is to be good ancestors."

-Jonas Salk

Sunday, July 8, 2007

Mann, Tree Rings, etc.

Thanks to Jim Manzi, who in commentary to my posting about Oreskes' talk "How do we Know We're not Wrong?" asks a couple of provocative questions.

I am not an expert on tree rings and 1000-year scale reconstructions, so I will just say my piece on the subject and leave it at that. I'll have more to say about his modeling questions later, where I'm better armed, and more sure he is on shaky ground with his critique.

Anyway, for what it's worth.

I haven't met any of the tree ring people and am almost a layperson on the subject, except for a single journal club meeting at U of C, led by Rodrigo Caballero, with David Archer and Ray Pierrehumbert in attendance. It was one of the many undeserved privileges I had at U of C.

It was about von Storch's criticisms of Mann's statistics.

Mann, as is well known, produced a time series with considerable uncertainty but very little variation prior to 1900. This especially presented a threat to those who argued for large natural variation. To modelers, it was something of a surprise, because Mann showed our models to be exactly right, in fact, more right than most of us would have expected. (Mann's hockey stick reconstruction, modulo a couple of early bumps, does pretty much what AOGCMs do.) It was also an iconic figure because the difference between preindustrial and postindustrial behavior of the atmosphere was so obvious.

In fact it appears Mann was in some sense "wrong". The way statisticians jump on people who aren't statisticians is, in fact, a bit obnoxious. They tend not to engage in the work of other fields until after it is published. (Fortunately there are exceptions, and our group does have a close collaboration with a statistician.) The fact remains that roughly 5% of the record of anyone else's reconstruction falls outside Mann's confidence bounds, so in a sense he was entirely right.

The other problems Mann had with bitwise replicability are very interesting and revealing of how science is conducted in climatology. Our practice is less formal than you might like in a life science lab and much closer to what happens in a small software shop, where the focus is on the output and not on the process. We think the importance of our work is so great compared to our small resources that we will resist any imposition of process that reduces productivity. I see the other side of that argument but I'm not in the majority on that. In fact, I would be very pleased with a mandate that all public sector computing (except for a very small subset of security related matters) be performed entirely on an open source tool chain. I think the only reason a compelling argument to that effect hasn't been made is that it's a lost cause in the present political and economic context.

In practice, software in the scientific sector gets done by people trained in science and self-trained in software, and the maintenance and documentation issues for small-lab output (this does not include high performance models like GCMs to the same extent) would be considered amateurish and completely unacceptable in even the most casually run commercial software company.

Should somebody producing results you don't like be held to higher standards retroactively? Called on the carpet in front of Congress? Investigated publicly? That's the sort of thing that drives the best people out of science.

Von Storch pointed out that Mann's method systematically eliminated low frequency variability in the record. Subsequent reconstructions did show more secular variability, and since this was the point of greatest interest to the critics, they have declared him "wrong". The conclusion that contemporary temperatures are probably higher than they have been in a millennium, however, stands.
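One way to see how a calibration step can flatten a record is a toy regression on a noisy proxy. The sketch below is only an illustration of that general mechanism, not a reproduction of Mann's method or of von Storch's analysis: the sine-wave "temperature", the white-noise proxy, the record length and the calibration window are all made-up assumptions.

```python
import math
import random
import statistics

random.seed(0)  # fixed seed so the toy run is repeatable

N = 1000           # length of the synthetic record (arbitrary assumption)
CALIBRATION = 500  # last 500 steps stand in for the instrumental period

# Hypothetical "true" temperature: a slow oscillation with a 500-step period.
true_temp = [math.sin(2 * math.pi * t / 500) for t in range(N)]
# Hypothetical proxy: the true signal plus independent white noise.
proxy = [x + random.gauss(0.0, 1.0) for x in true_temp]

def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope

# Calibrate temperature against the proxy over the calibration window only.
intercept, slope = fit_line(proxy[-CALIBRATION:], true_temp[-CALIBRATION:])
# Apply the fitted relation to the whole proxy record.
reconstruction = [intercept + slope * p for p in proxy]

print("regression slope:", round(slope, 2))
print("variance of true signal:", round(statistics.pvariance(true_temp), 2))
print("variance of reconstruction:", round(statistics.pvariance(reconstruction), 2))
```

In this toy setup the fitted slope comes out below one, because the noise inflates the proxy's variance, so the reconstructed swings are a shrunken copy of the real ones. How much of that carries over to any real reconstruction depends on details this sketch ignores.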

So, if the hockey stick is wiggly, what does that mean in practice? Our journal group was left with a damned-if-you-do damned-if-you-don't conclusion. If Mann's reconstruction was right, the detection of the greenhouse signal would be unequivocal. (Remember this was a few years ago when there was still a tiny shred of hope that we were wrong about the basic physics.)

On the other hand, if the stick had more wiggles, if natural variability were higher, this would weaken the detection argument, but would be cause for concern that the climate system is "tippy", with a tendency to wander further from its equilibrium than models show. This means that perturbing the system would have larger century-scale effects, and that models likely exclude phenomena that would cause the prognosis to be worse than expected.

Many arguments about model inadequacy go like this: it's more bad if the models are overoptimistic than it is good if they are overpessimistic. So risk-weighting means the less we trust the models the more we should worry. The pseudo-skeptics invariably get this one wrong, and the real skeptics (Hansen, Broecker, Lovelock) are quite worried as a consequence.
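The risk-weighting argument can be put in numbers. Here is a minimal sketch, assuming a purely illustrative damage function (cost grows as the square of warming) and three equally likely outcomes spread symmetrically around the same central estimate. Neither the function nor the numbers come from any impact study; they are assumptions chosen to show the shape of the argument.

```python
import statistics

def damage(warming_c):
    """Hypothetical convex damage function: cost grows as warming squared."""
    return warming_c ** 2

def expected_damage(central, spread):
    """Mean damage over three equally likely outcomes around the central estimate."""
    outcomes = [central - spread, central, central + spread]
    return statistics.mean(damage(w) for w in outcomes)

central = 3.0  # assumed central warming estimate, in degrees C
for spread in (0.0, 1.0, 2.0):
    cost = expected_damage(central, spread)
    print(f"spread +/-{spread:.1f} C -> expected damage {cost:.2f}")
```

The central estimate never moves, yet the expected damage rises as the spread widens, because the bad tail costs more than the good tail saves. That holds for any damage function that curves upward; with a straight-line damage function the spread would make no difference.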

The bumpiness of the reconstruction also is used backwards in the arguments; the bumpy record does not argue for complacency. Yes, if the record is bumpier, the detection problem becomes harder, but that one is in the bag already. The bumpier the record, the more evidence we have of models missing system modes at time scales that we have to worry about even in conventional policy terms. If Mann is wrong about the bumpiness, we are in bigger trouble, not less!

Regarding Jim's "bow vs stick" question, that's a sort of Rorschach test, isn't it? There is no doubt that from about 4000 BC until 1900 AD there was a gradual cooling trend. Nobody is claiming that present temperatures are the warmest in the postglacial period. Yet.

Regarding the performance of the reconstructions within the 20th C, my understanding is that there are all sorts of confounds introduced by the onset of anthropogenic forcings, not least of which is CO2 fertilization. That said, I wonder why that effect wouldn't artificially steepen the curve rather than flatten it out, if we assume that any individual specimen grows more under conditions of more warmth and more CO2.

You have plumbed the depths of what I know about this. I don't make a big deal out of this particular question and I don't think Oreskes does either. These guys think it is unusually warm, and I tend to believe them because that is what I expect, but the reasons I expect it have little to do with their work. I am sure they share my expectations, and I am not sure how effectively they separate their expectations from their results.

Oreskes is pointing out that their evidence is consistent with other lines of reasoning. I'm more familiar with the other lines of reasoning and am happier defending them, but if I had to bet I'd bet that we are already at the hottest point in the last 1000 years and will probably exceed the hottest point in the last 100,000 years (which happened about 6000 years ago) soon.


Anonymous said...

"Should somebody producing results you don't like should be held to higher standards retroactively? Called on the carpet in front of congress? Investigated publicly? That's the sort of thing that drives the best people out of science."

Well, yes, but the sums, or equivalently, the amount of time, that the public are starting to be asked to spend to ameliorate global warming are simply staggering. For example, using public transit easily turns a one-hour roundtrip into a three-hour ordeal. People are not going to rush to turn their lives upside-down based on software with amateurish quality control, and rightly so.

Michael Tobis said...

Yes, this will not be cheap. I'm not one of the people pretending it will be. On the other hand, we just need to be smart. We do not need to turn our lives upside down; we need to find the right solutions and apply them.

If it were all based on grad-student software, I'd agree with you. It isn't. It's based on science. That is the point. There are multiple coherent streams of evidence. By now we know that the atmosphere-ocean system's equilibrium response to a CO2 doubling is probably between 2.5 C and 3 C, and almost certainly between 1.5 C and 5 C.

That's a mouthful but it's important. It means we are out of the range where we can ignore the problem. You don't need one or another thread of evidence. It's not that kind of reasoning.

In Oreskes' terminology, it is consilient. Many forms of evidence have converged to tell this tale. If we are wrong, we are wrong because we missed something important, and if we missed something important it is at least as likely to make matters worse as better.

That is why we need to respond. There is no such thing as inaction. We are already committed to a certain amount of disruption by our actions, whether we adjust our practices or not. We have to manage under uncertainty, and "I'll do nothing until you prove your case" is an idiotic extreme.

Note that "I will pass no laws" is only one version of "do nothing"; the other is "I will burn nothing". Surely reason lies in between somewhere.

Anonymous said...


As always, you're being very reasonable.

A few quick points:

1. To be clear, I was quoting North on the "Bow vs. Stick" comment. They were not precise (to say the least) about confidence intervals, but seemed quite clear that they considered the proxies to be less-than-reliable prior to about 1600.

2. It seems to me that getting the attribution analysis right is THE output in this case.

3. I'm not sure "greater natural variation" = "more tippy", if all of that variation was within a band that is within a control range that is no big deal to live with.

4. I think that climate sensitivity "almost certainly" being within 1.5C - 5C is dependent on accepting the GCM models, which you know that I consider unvalidated.

5. For what it's worth, I think the open source idea is awesome (and, unfortunately, as politically unlikely as you do).

Best regards,
Jim Manzi

EliRabett said...

One thing you left out: the most executed parts of the software are usually optimized subprograms which were written by professionals, otherwise the damn things would take forever to execute.

The other thing is that professionals tend to focus less on the formal mathematics than on whether the model is physically reasonable. Thus the dichotomy about whether Mann's original method produced hockey stick curves (well yes, but at orders of magnitude less than what was seen in the climate record, and with no statistical power if you really screwed the inputs into a strange form). Wegman clearly needed to talk to someone who did calculations about physical systems.

Michael Tobis said...

Eli, I don't quite follow that bit in parentheses.

It is important to note that Mann made no claims about the low frequency spectrum of his analysis, nor did the WGI report that featured his curve explicitly do so. So it is my impression that he has not been "proven wrong" in any sense.

In retrospect, it is easy to see this as a sort of polemical overreaching, and I think that might arguably have been what happened.

At no time did Mann or WG I point to the curve and say "see, hardly any century scale variability! no Little Ice Age!"

It was only the fact that so much had been made of the "Little Ice Age" and century scale variability that made this such a target. In that sense it's a straw man.

Even if you dismiss this particular piece of evidence altogether, if you want to argue that there is little GHG sensitivity, and yet the predicted highly unlikely accelerating surface warming (not to mention the stratospheric cooling, etc.) is showing up anyway, you have an awful lot of 'splainin' to do.

That said, there is a great deal to learn from this episode and I don't see people learning it. I have heard an argument that the Mann case argues for closed source science, because if you publish your calculations politicians are likely to latch onto some small mistake and make a disproportionate amount of trouble!

The relationship between statistics and science is one very important issue that comes to light. I don't think the statisticians handled this matter appropriately at all.

Anonymous said...

Jim, I won't bother arguing with you about the rest, but I would suggest that a careful examination of long-term paleohistory should lead you to be rather less complacent about your point 3. In particular I would point to the extreme rareness (relative to the Phanerozoic) of Pleistocene climate conditions and that the mid-Pleistocene transition shows the whole thing to be in a rather delicate balance. Add to that the fact that glaciations tend to terminate rather rapidly once the process begins.

Michael Tobis said...

1) Thousand year proxies; I dunno. I've told you what I know. Basically the story is that not much happened that makes the past 1000 years stand out from the past 8000, until about 100 years ago.

You should look at the past 20K and the past 400K records before getting all worked up about 1K. They are dramatically more interesting.

Of course, some of your allies do not believe in durations in excess of 10Ka. I hope you are not going to try to hold us to that here.

2) First of all, attribution is pretty much in the bag anyway. Secondly, statistical significance is an arbitrary threshold that has proven useful in clinical studies. It is not appropriate for what biologists call "single subject" investigations, and we only have one earth.

3) Thinking in terms of linear systems is actually very useful in geophysical fluid dynamics on eddy time scales, but on longer climate scales it gets you in trouble. What is happening on the century scale, if anything, is dynamically up for grabs. The prime suspect would be the thermohaline circulation, but we don't have much handle on it.

There is some concern that the THC can flip to a new regime. The experimental study of regimes in rotating fluids in the 1950s and 60s is a fascinating topic.

In short, if you propose that there is signal at a secular scale, you are proposing missing phenomenology. We can't just wave our hands and say it's stable under a strong perturbation if it was oscillatory in quasi-equilibrium.

4) What exactly do you mean by "validated"?

Existing models are clearly valid for many purposes.

Which brings me back to the things you say about Oreskes that strike me as most incorrect.

Taking it back to your first comment, the extent to which 1990 vintage theoretical and model extrapolation hit the nail on the head is more like "the team that bats first will hit into a triple play in the fifth but will manage to score seven runs in the eighth".

In short, we actually do know what we are talking about. I would like to see an economic model make valid predictions of unlikely events.

This will take some work but it's worth following up on. Stay tuned.

5) yep

Dano said...

Let me reiterate I read this blog for a reason.

Now, dancing on the edge of detectability:

My grad ecology shop was run by a landcover change modeler who sounds very much like you, Michael. And our software ran like Eli said, with Band-aids.

And we in ecology know that ecosystems are, in fact, 'tippy'. Stasis and stability are merely a function of temporal scale; e.g. are Western hemlock climax forests 'climax forests'? We don't know, because we don't have sufficient scale to measure. And west of Longitude 100, rainfall is highly variable, making ecosystems very tippy.

Similarly, I see the Mann reconstruction against everyone else's ideas that show the tippiness, and I think this variability is borne out in tree rings, sediment cores, etc. Plants (I'm a plant guy) that have adapted mechanisms to withstand large-scale disturbance (ruderals) are the ones that are our invasives, which is your clue.

The plant people see this. We see northward and altitudinal migration and we understand why, as we have refugia in southern Appalachia and the Selkirks and the Marble Mts to refer to when we travel. Refugia in the Carpathians and many parts of China and Mongolia are harder to travel to, but are there nonetheless.

Anyway, yes.



Michael Tobis said...

Well, I write this blog for more than one reason, I reckon, but feedback like yours is definitely one of them.

Thanks, Dano!

Keep them cards and letters coming in, folks!

EliRabett said...

Sorry, what I meant was you (if your name was McKitrick, or McIntyre) could with a great deal of effort generate random noise that produced a hockey stick using Mann's method, but 1. the stick shape was way lower than what was found using the proxies over a 600/1000 year period, and 2. statistical tests of the significance of the noise showed that it was noise.