I’ve been somewhat sidetracked on this series, mostly by starting up a company and having no time, but also by the voluminous distractions of IPCC AR5. The subject of attribution could be a series by itself but as I started the series *Natural Variability and Chaos* it makes sense to weave it into that story.

In Part One and Part Two we had a look at chaotic systems and what that might mean for weather and climate. I was planning to develop those ideas a lot more before discussing attribution, but anyway..

AR5, *Chapter 10: Attribution* is 85 pages on the idea that the changes over the last 50 or 100 years in mean surface temperature – and also some other climate variables – can be attributed primarily to anthropogenic greenhouse gases.

The technical side of the discussion fascinated me, but has a large statistical component. I’m a rookie with statistics, and maybe because of this, I’m often suspicious about statistical arguments.

### Digression on Statistics

The foundation of a lot of statistics is the idea of independent events. For example, spin a roulette wheel and you get a number between 0 and 36 and a color that is red, black – or if you’ve landed on a zero, neither.

The statistics are simple – each spin of the roulette wheel is an **independent event** – that is, it has no relationship with the last spin of the roulette wheel. So, looking ahead, what is the chance of getting 5 two times in a row? The answer (with a 0 only and no “00” as found in some roulette tables) is 1/37 x 1/37 = 0.073%.

However, after you have spun the roulette wheel and got a 5, what is the chance of a second 5? It’s now just 1/37 = 2.7%. The past has no impact on the future statistics. Most of real life doesn’t correspond particularly well to this idea, apart from playing games of chance like poker and so on.
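The two roulette numbers above can be checked in a few lines. This is only a sketch of the arithmetic for a single-zero wheel; the variable names are invented for the illustration.

```python
from fractions import Fraction

POCKETS = 37  # single-zero wheel: numbers 0 to 36

p_single = Fraction(1, POCKETS)

# Looking ahead, before any spin: both spins must land on 5.
p_two_fives_ahead = p_single * p_single

# After a 5 has already landed: only the next spin matters.
p_second_five_given_first = p_single

print(f"two 5s in a row, looking ahead: {float(p_two_fives_ahead):.3%}")
print(f"second 5, given a first 5:      {float(p_second_five_given_first):.1%}")
```

The difference between the two lines is only the point in time at which the question is asked, which is what independence means here.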

I was in the gym the other day and although I try and drown it out with music from my iPhone, the Travesty (aka “the News”) was on some of the screens in the gym – with text of the “high points” on the screen aimed at people trying to drown out the annoying travestyreaders. There was a report that a new study had found that autism was caused by “Cause X” – I have blanked it out to avoid any unpleasant feeling for parents of autistic kids – or people planning on having kids who might worry about “Cause X”.

It did get me thinking – if you have let’s say 10,000 potential candidates for causing autism, and you test each one at the 95% significance level (that is, you accept a 5% chance of flagging a candidate that has no real effect), what is the outcome? Well, if there is a random spread of autism among the population with no actual cause (let’s say it is caused by a random genetic mutation with no link to any parental behavior, parental genetics or the environment) then you will expect to find about 500 “statistically significant” factors for autism simply by testing at the 95% level. That’s 500, when none of them are actually the real cause. It’s just chance. Plenty of fodder for pundits though.
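A quick simulation makes the “about 500” figure concrete. It is a sketch under stated assumptions: every candidate is truly unrelated to the outcome, the tests are independent of each other, and each test’s p-value is uniformly distributed under the null hypothesis. The function name and the seed are arbitrary choices for the illustration.

```python
import random

def count_false_positives(n_candidates=10_000, alpha=0.05, seed=0):
    """Count null hypotheses that get 'rejected' by chance alone.

    Assumes no candidate has any real effect, so each p-value is a
    uniform random number on [0, 1) and is 'significant' when < alpha.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_candidates))

expected = 10_000 * 0.05
observed = count_false_positives()
print(f"expected by chance: {expected:.0f}, simulated: {observed}")
```

The simulated count varies with the seed but stays close to 500, since the scatter around the expected value is only a few tens.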

That’s one problem with statistics – the answer you get unavoidably depends on your frame of reference.

The questions I have about attribution are unrelated to this specific point about statistics, but there are statistical arguments in the attribution field that seem fatally flawed. Luckily I’m a statistical novice so no doubt readers will set me straight.

On another unrelated point about statistical independence, only slightly more relevant to the question at hand, Pirtle, Meyer & Hamilton (2010) said:

In short, we note that GCMs are commonly treated as independent from one another, when in fact there are many reasons to believe otherwise. The assumption of independence leads to increased confidence in the ‘‘robustness’’ of model results when multiple models agree. But GCM independence has not been evaluated by model builders and others in the climate science community. Until now the climate science literature has given only passing attention to this problem, and the field has not developed systematic approaches for assessing model independence.

.. end of digression

### Attribution History

In my efforts to understand Chapter 10 of AR5 I followed up on a lot of references and ended up winding my way back to Hegerl et al 1996.

Gabriele Hegerl is one of the lead authors of Chapter 10 of AR5, was one of the two coordinating lead authors of the Attribution chapter of AR4, and one of four lead authors on the relevant chapter of AR3 – and of course has a lot of papers published on this subject.

As is often the case, I find that to understand a subject you have to start with a focus on the earlier papers because the later work doesn’t make a whole lot of sense without this background.

This paper by Hegerl and her colleagues uses the work of one of the co-authors, Klaus Hasselmann – his 1993 paper “Optimal fingerprints for detection of time dependent climate change”.

Fingerprints, by the way, seems like a marketing term. Fingerprints evokes the idea that you can readily demonstrate that John G. Doe of 137 Smith St, Smithsville was at least present at the crime scene and there is no possibility of confusing his fingerprints with John G. Dode who lives next door even though their mothers could barely tell them apart.

This kind of attribution is more in the realm of “was it the 6ft bald white guy or the 5’5″ black guy”?

Well, let’s set aside questions of marketing and look at the details.

### Detecting GHG Climate Change with Optimal Fingerprint Methods in 1996

The essence of the method is to compare observations (measurements) with:

- model runs with GHG forcing
- model runs with “other anthropogenic” and natural forcings
- model runs with internal variability only

Then based on the fit you can distinguish one from the other. The statistical basis is covered in detail in Hasselmann 1993 and more briefly in this paper: Hegerl et al 1996 – both papers are linked below in the References.

At this point I make another digression.. as regular readers know I am fully convinced that the increases in CO2, CH4 and other GHGs over the past 100 years or more can be very well quantified into “radiative forcing” and am 100% in agreement with the IPCC’s summary of the work of atmospheric physics over the last 50 years on this topic. That is, the increases in GHGs have led to something like a “radiative forcing” of 2.8 W/m² [*corrected, thanks to niclewis*].

And there isn’t any scientific basis for disputing this “pre-feedback” value. It’s simply the result of basic radiative transfer theory, well-established, and well-demonstrated in observations both in the lab and through the atmosphere. People confused about this topic are confused about science basics and comments to the contrary may be allowed or more likely will be capriciously removed due to the fact that there have been more than 50 posts on this topic (post your comments on those instead). See The “Greenhouse” Effect Explained in Simple Terms and On Uses of A 4 x 2: Arrhenius, The Last 15 years of Temperature History and Other Parodies.

Therefore, it’s “very likely” that the increases in GHGs over the last 100 years have contributed significantly to the temperature changes that we have seen.

To say otherwise – and still accept physics basics – means believing that the radiative forcing has been “mostly” cancelled out by feedbacks while internal variability has been amplified by feedbacks to cause a significant temperature change.

Yet this work on attribution seems to be fundamentally flawed.

Here was the conclusion:

We find that the latest observed 30-year trend pattern of near-surface temperature change can be distinguished from all estimates of natural climate variability with an estimated risk of less than 2.5% if the optimal fingerprint is applied.

With the caveats that, to me, eliminated the statistical basis of the previous statement:

The greatest uncertainty of our analysis is the estimate of the natural variability noise level..

..The shortcomings of the present estimates of natural climate variability cannot be readily overcome. However, the next generation of models should provide us with better simulations of natural variability. In the future, more observations and paleoclimatic information should yield more insight into natural variability, especially on longer timescales. This would enhance the credibility of the statistical test.

Earlier in the paper the authors said:

..However, it is **generally believed** that models reproduce the space-time statistics of natural variability on large space and long time scales (months to years) reasonably realistic. The verification of variability of CGMCs [coupled GCMs] on decadal to century timescales is relatively short, while paleoclimatic data are sparce and often of limited quality...We assume that the detection variable is Gaussian with zero mean, that is, that **there is no long-term nonstationarity in the natural variability.**

[Emphasis added].
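To see why the noise-level caveat matters so much, here is a toy calculation. It is not the calculation in Hegerl et al 1996; it is a sketch that assumes a one-sided test on a zero-mean Gaussian detection variable and asks what happens to a nominal 2.5% risk if the real standard deviation of natural variability is larger than the one estimated from the models. The names `actual_risk` and `sigma_ratio` are invented for the illustration.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # zero-mean, unit-variance Gaussian

def actual_risk(nominal_risk, sigma_ratio):
    """False-detection risk when the real noise is larger than assumed.

    sigma_ratio = (true std of natural variability) / (assumed std).
    The detection threshold is set using the assumed std (one-sided test).
    """
    threshold = STD_NORMAL.inv_cdf(1.0 - nominal_risk)  # in assumed-std units
    return 1.0 - STD_NORMAL.cdf(threshold / sigma_ratio)

for ratio in (1.0, 1.5, 2.0, 3.0):
    risk = actual_risk(0.025, ratio)
    print(f"true/assumed std = {ratio:.1f} -> risk = {risk:.1%}")
```

Under these assumptions, underestimating the standard deviation by a factor of 1.5 turns a 2.5% risk into roughly 10%, and a factor of 2 turns it into roughly 16%. The quoted significance is only as good as the variability estimate behind it.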

The climate models used would be considered rudimentary by today’s standards. Three different coupled atmosphere-ocean GCMs were used. However, each of them required “flux corrections”.

This method was pretty much the standard until the post 2000 era. The climate models “drifted”, unless, in deity-like form, you topped up (or took out) heat and momentum from various grid boxes.

That is, the models themselves struggled (in 1996) to represent climate unless the climate modeler knew, and corrected for, the long term “drift” in the model.

### Conclusion

In the next article we will look at more recent work in attribution and fingerprints and see whether the field has developed.

But in this article we see that the conclusion of an attribution study in 1996 was that there was only a “2.5% chance” that recent temperature changes could be attributed to natural variability. At the same time, the question of how accurate the models were in simulating natural variability was noted but never quantified. And the models were all “flux corrected”. This means that some aspects of the long term statistics of climate were considered to be known – in advance.

So I find it difficult to accept any statistical significance in the study at all.

If the finding instead was introduced with the caveat “*assuming the accuracy of our estimates of long term natural variability of climate is correct..*” then I would probably be quite happy with the finding. And that question is the key.

The question should be:

What is the likelihood that climate models accurately represent the long-term statistics of natural variability?

- Virtually certain
- Very likely
- Likely
- About as likely as not
- Unlikely
- Very unlikely
- Exceptionally unlikely

So far I am yet to run across a study that poses this question.

### References

Bindoff, N.L., et al, 2013: Detection and Attribution of Climate Change: from Global to Regional. In: Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change

Detecting greenhouse gas induced climate change with an optimal fingerprint method, Hegerl, von Storch, Hasselmann, Santer, Cubasch & Jones, *Journal of Climate* (1996)

What does it mean when climate models agree? A case for assessing independence among general circulation models, Zachary Pirtle, Ryan Meyer & Andrew Hamilton, *Environ. Sci. Policy* (2010)

Optimal fingerprints for detection of time dependent climate change, Klaus Hasselmann, *Journal of Climate* (1993)

## Natural Variability and Chaos – One – Introduction

Posted in Climate Models, Commentary, Statistics on July 22, 2014

There are many classes of systems but in the climate blogosphere world two ideas about climate seem to be repeated the most.

In camp A:

And in camp B:

Of course, like any complex debate, simplified statements don’t really help. So this article kicks off with some introductory basics.

Many inhabitants of the climate blogosphere already know the answer to this subject and with much conviction. A reminder for new readers that on this blog opinions are not so interesting, although occasionally entertaining. So instead, try to explain what evidence is there for your opinion. And, as suggested in About this Blog:

## Pendulums

The equation for a simple pendulum is “non-linear”, although there is a simplified version of the equation, often used in introductions, which is linear. However, the number of variables involved is only two:

- position (angle)
- speed

and this isn’t enough to create a “chaotic” system.

If we have a double pendulum, one pendulum attached at the bottom of another pendulum, we do get a chaotic system. There are some nice visual simulations around, which St. Google might help interested readers find.

If we have a forced damped pendulum like this one:

Figure 1 – the blue arrows indicate that the point O is being driven up and down by an external force

– we also get a chaotic system.

## Digression on Non-Linearity for Non-Technical People

Common experience teaches us about linearity. If I pick up an apple in the supermarket it weighs about 0.15 kg or 150 grams (also known in some countries as “about 5 ounces”). If I take 10 apples the collection weighs 1.5 kg. That’s pretty simple stuff. Most of our real world experience follows this linearity and so we expect it.

On the other hand, if I was near a very cold black surface held at 170K (-103ºC) and measured the radiation emitted it would be 47 W/m². If we then double the temperature of this surface to 340K (67ºC) what would I measure? 94 W/m²? Seems reasonable – double the absolute temperature and get double the radiation.. But it’s not correct.

The right answer is 758 W/m², which is 16x the amount. Surprising, but most actual physics, engineering and chemistry is like this. Double a quantity and you *don’t* get double the result.

It gets more confusing when we consider the interaction of other variables.
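The black-surface numbers above follow from the Stefan-Boltzmann law, where emitted radiation goes as the fourth power of absolute temperature. A minimal sketch for an ideal black surface (emissivity of 1 is assumed, and the function name is invented for the illustration):

```python
SIGMA = 5.670374419e-8  # Stefan-Boltzmann constant, W m^-2 K^-4

def blackbody_flux(temperature_k):
    """Radiation emitted by an ideal black surface, in W/m^2."""
    return SIGMA * temperature_k ** 4

cold = blackbody_flux(170.0)
warm = blackbody_flux(340.0)
print(f"170 K: {cold:.0f} W/m^2, 340 K: {warm:.0f} W/m^2, ratio: {warm / cold:.0f}x")
```

Doubling the temperature multiplies the result by 2⁴ = 16, which is where the 758 W/m² comes from.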

Let’s take riding a bike [updated thanks to Pekka]. Once you get above a certain speed most of the resistance comes from the wind so we will focus on that. Typically the wind resistance increases as the square of the speed. So if you double your speed you get four times the wind resistance. Work done = force x distance moved, so with no head wind power input has to go up as the cube of speed (note 4). This means you have to put in 8x the effort to get 2x the speed.

On Sunday you go for a ride and the wind speed is zero. You get to 25 km/hr (16 miles/hr) by putting a bit of effort in – let’s say you are producing 150W of power (I have no idea what the right amount is). You want your new speedo to register 50 km/hr – so you have to produce 1,200W.

On Monday you go for a ride and the wind speed is 20 km/hr into your face. Probably should have taken the day off.. Now with 150W you get to only 14 km/hr, it takes almost 500W to get to your basic 25 km/hr, and to get to 50 km/hr it takes almost 2,400W. No chance of getting to that speed!

On Tuesday you go for a ride and the wind speed is the same so you go in the opposite direction and take the train home. Now with only 6W you get to go 25 km/hr, to get to 50km/hr you only need to pump out 430W.

In mathematical terms it’s quite simple: F = k(v-w)², Force = (a constant, k) x (road speed – wind speed) squared. Power, P = Fv = kv(v-w)². But notice that the effect of the “other variable”, the wind speed, has really complicated things.
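The three rides can be reproduced from this one formula. The sketch below assumes a sign convention that the text leaves implicit: w is positive for a tail wind and negative for a head wind. The constant k is calibrated from Sunday’s ride (150W at 25 km/hr in still air), and the cubic for v is solved by simple bisection. The function names are invented for the illustration.

```python
def power_needed(v, w, k):
    """P = k * v * (v - w)**2, with v = road speed and w = wind speed.

    Assumed sign convention: w > 0 is a tail wind, w < 0 a head wind.
    """
    return k * v * (v - w) ** 2

def speed_reached(power, w, k, v_max=200.0, tol=1e-9):
    """Solve the cubic for v by bisection, searching v >= max(w, 0)."""
    lo, hi = max(w, 0.0), v_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if power_needed(mid, w, k) < power:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Calibrate k from Sunday: 150 W gives 25 km/hr with no wind.
K = 150.0 / 25.0 ** 3

for day, wind in (("Sunday", 0.0), ("Monday", -20.0), ("Tuesday", 20.0)):
    p25 = power_needed(25.0, wind, K)
    p50 = power_needed(50.0, wind, K)
    v150 = speed_reached(150.0, wind, K)
    print(f"{day}: 25 km/hr needs {p25:.0f} W, 50 km/hr needs {p50:.0f} W, "
          f"150 W gives {v150:.1f} km/hr")
```

Going from v and w to P is a one-line calculation, while going from P back to v needs a numerical solver even in this very simple case.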

The real problem with nonlinearity isn’t the problem of keeping track of these kinds of numbers. You get used to the fact that real science – real world relationships – has these kinds of factors and you come to expect them. And you have an equation that makes calculating them easy. And you have computers to do the work.

No, the real problem with non-linearity (the real world) is that many of these equations link together and solving them is very difficult and often only possible using “numerical methods”.

It is also the reason why something like climate feedback is very difficult to measure. Imagine measuring the change in power required to double speed on the Monday. It’s almost 5x, so you might think the relationship is something like the square of speed. On Tuesday it’s about 70 times, so you would come up with a completely different relationship. In this simple case we know that wind speed is a factor, we can measure it, and so we can “factor it out” when we do the calculation. But in a more complicated system, if you don’t know the “confounding variables”, or the relationships, what are you measuring? We will return to this question later.
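To put a number on the confounding, here is a sketch of what an observer who ignores the wind would infer. It fits P ∝ vⁿ to the two speeds on each day, using the same formula and the same assumed sign convention (w positive for a tail wind). The function names are invented for the illustration.

```python
import math

def power_needed(v, w, k=150.0 / 25.0 ** 3):
    """P = k * v * (v - w)**2; w > 0 is a tail wind, w < 0 a head wind (assumed)."""
    return k * v * (v - w) ** 2

def apparent_exponent(w, v_low=25.0, v_high=50.0):
    """Exponent n in P ~ v**n that an observer ignoring the wind would infer."""
    ratio = power_needed(v_high, w) / power_needed(v_low, w)
    return math.log(ratio) / math.log(v_high / v_low)

for day, wind in (("Sunday", 0.0), ("Monday", -20.0), ("Tuesday", 20.0)):
    print(f"{day}: apparent exponent = {apparent_exponent(wind):.1f}")
```

The underlying physics is identical on all three days, yet the inferred exponent ranges from a little over 2 to a little over 6 depending on a variable the observer did not measure.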

When you start out doing maths, physics, engineering.. you do “linear equations”. These teach you how to use the tools of the trade. You solve equations. You rearrange relationships using equations and mathematical tricks, and these rearranged equations give you insight into how things work. It’s amazing. But then you move to “nonlinear” equations, aka the real world, which turns out to be mostly insoluble. So nonlinear isn’t something special, it’s normal. Linear is special. You don’t usually get it.

..End of digression

## Back to Pendulums

Let’s take a closer look at a forced damped pendulum. Damped, in physics terms, just means there is something opposing the movement. We have friction from the air and so over time the pendulum slows down and stops. That’s pretty simple. And not chaotic. And not interesting.

So we need something to keep it moving. We drive the pivot point at the top up and down and now we have a forced damped pendulum. The equation that results (note 1) has the massive number of three variables – position, speed and now time to keep track of the driving up and down of the pivot point. Three variables seems to be the minimum to create a chaotic system (note 2).

As we increase the ratio of the forcing amplitude to the length of the pendulum (β in note 1) we can move through three distinct types of response:

This is typical of chaotic systems – certain parameter values or combinations of parameters can move the system between quite different states.

Here is a plot (note 3) of position vs time for the chaotic system, β=0.7, with two initial conditions, only different from each other by 0.1%:

Forced damped harmonic pendulum, β=0.7: Start angular speed 0.1; 0.1001

Figure 1

It’s a little misleading to view the angle like this because it is in radians and so needs to be mapped between 0-2π (but then we get a discontinuity on a graph that doesn’t match the real world). We can map the graph onto a cylinder plot but it’s a mess of reds and blues.

Another way of looking at the data is via the statistics – so here is a histogram of the position (θ), mapped to 0-2π, and angular speed (dθ/dt) for the two starting conditions over the first 10,000 seconds:

Histograms for 10,000 seconds

Figure 2

We can see they are similar but not identical (note the different scales on the y-axis).

That might be due to the shortness of the run, so here are the results over 100,000 seconds:

Histogram for 100,000 seconds

Figure 3

As we increase the timespan of the simulation the statistics of two slightly different initial conditions become more alike.
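The same behavior can be seen in a much simpler system. The sketch below does not simulate the pendulum; it swaps in the logistic map, a standard one-variable discrete chaotic system, with the parameter r = 4. Two runs start 0.1% apart, as in the pendulum plots, and the code compares both the states and the long-run histograms. The function names, the bin count and the run length are arbitrary choices for the illustration.

```python
def logistic_orbit(x0, n_steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return the whole trajectory."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def histogram(xs, n_bins=10):
    """Fraction of samples falling in each equal-width bin on [0, 1]."""
    counts = [0] * n_bins
    for x in xs:
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    return [c / len(xs) for c in counts]

a = logistic_orbit(0.1, 100_000)
b = logistic_orbit(0.1001, 100_000)  # start differs by 0.1%

# The individual states separate quickly...
largest_gap = max(abs(x - y) for x, y in zip(a[:100], b[:100]))
# ...while the long-run histograms stay close to each other.
worst_bin = max(abs(p - q) for p, q in zip(histogram(a), histogram(b)))

print(f"largest state gap in first 100 steps: {largest_gap:.2f}")
print(f"largest histogram difference:         {worst_bin:.4f}")
```

The two trajectories become unrelated within a few dozen steps, but the fraction of time each spends in any given range is almost the same.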

So if we want to know the *state* of a chaotic system at some point in the future, very small changes in the initial conditions will amplify over time, making the result unknowable – or no different from picking the state from a random time in the future. But if we look at the *statistics* of the results we might find that they are very predictable. This is typical of many (but not all) chaotic systems.

## Orbits of the Planets

The orbits of the planets in the solar system are chaotic. In fact, even 3-body systems moving under gravitational attraction have chaotic behavior. So how did we land a man on the moon? This raises the interesting questions of timescales and amount of variation. Planetary movement – for our purposes – is extremely predictable over a few million years. But over 10s of millions of years we might have trouble predicting exactly the shape of the earth’s orbit – eccentricity, time of closest approach to the sun, obliquity.

However, it seems that even over a much longer time period the planets will still continue in their orbits – they won’t crash into the sun or escape the solar system. So here we see another important aspect of some chaotic systems – the “chaotic region” can be quite restricted. So chaos doesn’t mean unbounded.

According to Cencini, Cecconi & Vulpiani (2010):

And bad luck, Pluto.

## Deterministic, non-Chaotic, Systems with Uncertainty

Just to round out the picture a little, even if a system is not chaotic and is deterministic we might lack sufficient knowledge to be able to make useful predictions. If you take a look at figure 3 in Ensemble Forecasting you can see that with some uncertainty of the initial velocity and a key parameter the resulting velocity of an extremely simple system has quite a large uncertainty associated with it.

This case is quantitatively different of course. By obtaining more accurate values of the starting conditions and the key parameters we can reduce our uncertainty. Small disturbances don’t grow over time to the point where our calculation of a future condition might as well just be selected from a random time in the future.

## Transitive, Intransitive and “Almost Intransitive” Systems

Many chaotic systems have deterministic statistics. So we don’t know the future state beyond a certain time. But we do know that a particular position, or other “state” of the system, will be between a given range for x% of the time, taken over a “long enough” timescale. These are *transitive* systems.

Other chaotic systems can be *intransitive*. That is, for a very slight change in initial conditions we can have a different set of long term statistics. So the system has no “statistical” predictability. Lorenz 1968 gives a good example.

Lorenz introduces the concept of *almost intransitive* systems. This is where, strictly speaking, the statistics over infinite time are independent of the initial conditions, but the statistics over “long time periods” are dependent on the initial conditions. And so he also looks at the interesting case (Lorenz 1990) of moving between states of the system (seasons), where we can think of the precise starting conditions each time we move into a new season moving us into a different set of long term statistics. I find it hard to explain this clearly in one paragraph, but Lorenz’s papers are very readable.

## Conclusion?

This is just a brief look at some of the basic ideas.

## Other Articles in the Series

Part Two – Lorenz 1963

## References

Chaos: From Simple Models to Complex Systems, Cencini, Cecconi & Vulpiani, *Series on Advances in Statistical Mechanics – Vol. 17* (2010)

Climatic Determinism, Edward Lorenz (1968) – free paper

Can chaos and intransitivity lead to interannual variability?, Edward Lorenz, *Tellus* (1990) – free paper

## Notes

**Note 1** – The equation is easiest to “manage” after the original parameters are transformed so that tω -> t. That is, the period of external driving, T0 = 2π under the transformed time base.

Then:

where θ = angle, γ’ = γ/ω, α = g/Lω², β =h0/L;

these parameters based on γ = viscous drag coefficient, ω = angular speed of driving, g = acceleration due to gravity = 9.8m/s², L = length of pendulum, h0=amplitude of driving of pivot point
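For reference, one form of the equation that is consistent with these parameter definitions is the standard equation of motion for a damped pendulum whose pivot is driven vertically. This is a sketch under an assumed convention, with the pivot height taken as h0 cos(ωt) and θ measured from the downward vertical. The sign of the β term depends on that choice of phase, so treat the sign as an assumption.

```latex
% Before rescaling time (pivot height y_p = h_0 \cos\omega t):
\frac{d^2\theta}{dt^2} + \gamma\,\frac{d\theta}{dt}
  + \frac{1}{L}\left(g - h_0\,\omega^2 \cos\omega t\right)\sin\theta = 0

% After the transformation t\omega \to t, dividing through by \omega^2:
\frac{d^2\theta}{dt^2} + \gamma'\,\frac{d\theta}{dt}
  + \left(\alpha - \beta \cos t\right)\sin\theta = 0,
\qquad
\gamma' = \frac{\gamma}{\omega},\quad
\alpha = \frac{g}{L\omega^2},\quad
\beta = \frac{h_0}{L}
```

Written this way the three variables mentioned in the text are visible: the angle θ, the angular speed dθ/dt, and the time t that enters through the driving term.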

**Note 2** – This is true for continuous systems. Discrete systems can be chaotic with fewer variables.

**Note 3** – The results were calculated numerically using Matlab’s ODE (ordinary differential equation) solver, ode45.

**Note 4** – Force = k(v-w)² where k is a constant, v = velocity, w = wind speed. Work done = Force x distance moved so Power, P = Force x velocity.

Therefore:

P = kv(v-w)²

If we know k, v & w we can find P. If we have P, k & w and want to find v it is a cubic equation that needs solving.
