`[Originally posted at Natural Variation.]`

I need to ask for the reader's indulgence, as this post is not about autism, except insofar as determining the merit of correlations has become a perseveration of mine. You see, it is trivial to come up with naive correlations of autism trends vs. practically anything about the modern world. The administrative prevalence of autism has risen almost continuously since records have been kept. Concurrent upward trends in nearly anything, from vaccines to environmental pollution, from trans fats to electromagnetic radiation, are easy to come by.

In my latest post at LB/RB I suggested that instead of correlating trends in a naive manner, we could attempt to correlate the residuals of time regression models of each trend. A residual is a *delta* or difference between an observed value and a modeled value. (Here's a concise explanation.)

When modeling real-world phenomena, regression models will never (or almost never) be perfect fits. For all sorts of reasons, even if only random fluctuation, there will be deviations from a modeled trend. If there's a causative relationship between two trends, the residuals of (or deviations from) corresponding close-fitting regression models should correlate with one another as well. By this I don't mean that the residuals should always be in the same direction; but they should be in the same direction more often than not, on average.

The nice thing about this technique is that it is completely accessible to anyone with Excel installed. It can also be illustrated graphically, as the reader will see.
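For readers who prefer code to a spreadsheet, here is a minimal sketch of the idea in plain Python. It is not the workbook used for this post: the helper names (`polyfit`, `residuals`, `pearson`) and the two synthetic series are assumptions made up for illustration. Each series gets its own polynomial time trend, and only the leftover deviations are correlated.

```python
import math

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first)."""
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum of x^(i+j).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    b = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            f = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= f * a[col][k]
            b[row] -= f * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        s = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - s) / a[i][i]
    return coeffs

def residuals(ts, ys, degree=3):
    """Observed minus modeled values for a polynomial time trend."""
    # Rescale time to [-1, 1] so the normal equations stay well conditioned.
    mid, half = (min(ts) + max(ts)) / 2, (max(ts) - min(ts)) / 2
    xs = [(t - mid) / half for t in ts]
    c = polyfit(xs, ys, degree)
    return [y - sum(ck * x ** k for k, ck in enumerate(c)) for x, y in zip(xs, ys)]

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    return cov / math.sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))

# Synthetic data (made up): two different trends sharing one fluctuation.
years = list(range(1850, 2005))
wiggle = [math.sin(0.3 * (y - 1850)) for y in years]
series_a = [0.002 * (y - 1850) ** 2 + 5 * w for y, w in zip(years, wiggle)]
series_b = [0.01 * (y - 1850) + 0.2 * w for y, w in zip(years, wiggle)]

r = pearson(residuals(years, series_a), residuals(years, series_b))
print(f"correlation of residuals: {r:.3f}")
```

The two synthetic series have different trends but share the same fluctuation, so their residuals line up once the trends are removed.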

So it occurred to me to test this idea in a different field of science where there's controversy over correlation vs. causation. I thought global warming would be a great candidate. After all, the spoof about a decrease in the number of pirates correlating with many other arbitrary trends appears to originate in the global warming debate (see this).

To summarize what I found, there is a strong and statistically significant correlation between cumulative human CO₂ emissions and northern hemisphere temperature anomalies.

**Because of the methodology used, I'm quite confident this cannot be explained by coincidence, data collection errors, solar output as a confound, or causation in the opposite direction**.

Now, I fully recognize that I'm only superficially familiar with the debate over anthropogenic global warming. I am also not versed in climatology. Therefore, I cannot be entirely sure that this type of analysis hasn't been done before. Google and Google Scholar searches didn't seem to turn up anything, and given the importance of the topic, I thought it was not only prudent but necessary to put this evidence out there. As always, scrutiny and discussion are welcome.

Northern hemisphere temperature data from 1850 to 2004 was obtained from the Climatic Research Unit of the University of East Anglia, UK.

Global CO₂ emission data was obtained from CDIAC. I did not use CO₂ atmospheric concentration data because temperature increases can theoretically cause this concentration to increase. Human emissions are what we're interested in. More specifically, I calculated *cumulative* CO₂ emissions for every year since 1850. Greenhouse temperature anomalies are presumably caused by the total amount of CO₂ in the atmosphere, not by the emissions in any given year. Since CO₂ stays in the atmosphere for 50 to 200 years (source), modeling the cumulative human contribution of CO₂ should be adequate.
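The cumulative series is just a running total of the annual figures. A small sketch, using made-up annual numbers rather than the actual CDIAC data (the variable names are illustrative as well):

```python
from itertools import accumulate

# Hypothetical annual emissions for five consecutive years (not CDIAC values).
years = [1850, 1851, 1852, 1853, 1854]
annual_emissions = [54, 54, 57, 59, 69]

# Running total: each year's value is the sum of all emissions up to that year.
cumulative_emissions = list(accumulate(annual_emissions))

for year, total in zip(years, cumulative_emissions):
    print(year, total)
```

Note that a plain running total does not account for any CO₂ leaving the atmosphere over time.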

Figure 1 is a graph of the general time trends of these two sets of data. It also shows the modeled trend lines we will use to calculate residuals. In this analysis we're using third-order polynomial models. They seem to give a considerably closer fit than second-order polynomial models.

I calculated the residuals and built a scatter graph matching cumulative CO₂ (X axis) and temperature (Y axis) residuals for each year from 1850 to 2004. As expected, the slope of a linear regression of the scatter was positive (1.9×10⁻⁵) and statistically significant (95% confidence interval 1.13×10⁻⁵ to 2.66×10⁻⁵).

[Note: Instructions on how to calculate the slope confidence interval of a linear regression with Excel can be found here.]
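The same kind of interval can be sketched outside Excel. The version below is an approximation under stated assumptions: it uses the normal quantile instead of Student's t (reasonable for roughly 150 points, but slightly too narrow for small samples), and the function name and test data are made up for illustration.

```python
import math
from statistics import NormalDist

def slope_confidence_interval(xs, ys, level=0.95):
    """Least-squares slope with a large-sample confidence interval."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    # Residual sum of squares around the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    # Standard error of the slope, with n - 2 degrees of freedom.
    se = math.sqrt(sse / (n - 2) / sxx)
    # Assumption: normal quantile stands in for the t quantile.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return slope, slope - z * se, slope + z * se

# Made-up data: a true slope of 2 plus a deterministic wobble.
xs = list(range(100))
ys = [2 * x + math.sin(x) for x in xs]
slope, low, high = slope_confidence_interval(xs, ys)
print(slope, low, high)
```

A slope counts as statistically significant at the chosen level when the whole interval lies on one side of zero.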

I suspected, however, that there should be lag between cumulative CO₂ fluctuations and temperature fluctuations. It presumably takes some time for heat to be trapped. I proceeded to create a moving average trend line of the temperature residuals. It did in fact have a similar shape to the cumulative CO₂ residuals graph, but it appeared to lag it by about 10 years. The reader should be able to roughly see this lag in Figure 1.
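A moving average like the one described can be sketched as follows. This assumes a trailing window, which is how Excel's moving average trend line works as far as I know. The window length and sample values are made up.

```python
def moving_average(values, window):
    """Trailing moving average: each point averages the last `window` values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]

# Made-up residuals, smoothed with a 3-point window.
smoothed = moving_average([1, 2, 3, 4, 5], 3)
print(smoothed)
```

The result is shorter than the input by `window - 1` points, because the first few values do not have a full window behind them.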

So I re-ran the whole analysis by only considering the years 1850 to 1997 and correlating CO₂ residuals with residuals of temperature *10 years later*. The correlation between these two sets of data is remarkable. Let's start with a bar graph of both sets of residuals, Figure 2.

Figure 2 is a good graph to get a subjective sense of the correlation. Let's see if the math confirms this. Figure 3 is the scatter graph of the residuals.

The slope of a linear regression of the scatter is 2.6×10⁻⁵, and it is statistically significant (95% confidence interval 1.88×10⁻⁵ to 3.33×10⁻⁵). Even the 99.99999999% confidence interval is entirely positive.
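The lagged pairing can be sketched as below: each value of one residual series is matched with the value of the other series a fixed number of years later, and candidate lags can be compared by their correlation. The function names and the synthetic sine-wave series are assumptions for illustration, not the data analyzed here.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length lists."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((p - mu) * (q - mv) for p, q in zip(u, v))
    return cov / math.sqrt(sum((p - mu) ** 2 for p in u) * sum((q - mv) ** 2 for q in v))

def lagged_correlation(x, y, lag):
    """Correlate x[t] with y[t + lag], dropping the unmatched ends."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])

def best_lag(x, y, max_lag):
    """Lag between 0 and max_lag with the highest correlation."""
    return max(range(max_lag + 1), key=lambda k: lagged_correlation(x, y, k))

# Synthetic residuals: y repeats x exactly 10 steps later.
steps = range(155)
x = [math.sin(2 * math.pi * t / 60) for t in steps]
y = [math.sin(2 * math.pi * (t - 10) / 60) for t in steps]

lag = best_lag(x, y, 20)
print(lag, lagged_correlation(x, y, lag))
```

With real data the peak is broader than in this toy case, so several neighboring lags may fit almost equally well.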

**Unless anthropogenic global warming is a reality, there is no apparent reason why the residuals of cumulative human CO₂ emissions should correlate so well with the residuals of temperature *10 years later* throughout the last 150 years.**

The slope of the scatter is actually steeper than expected, if you consider the naive correlation between cumulative CO₂ emissions and temperature. There are probably several reasons for this. The one I believe to be the most likely is that over time CO₂ does get removed from the atmosphere. Adding this consideration to the analysis should produce a more accurate slope. The other potential reasons don't bode so well for our species.

[**Update 2/22/2010**: I have written a follow-up titled Statistical Proof of Anthropogenic Global Warming v2.0.]
## 3 comments:

Note: A clearer graph that illustrates the detrended cross-correlation can be found here.

Do you have any basis for this 10 year figure? I didn't see any reason for you to pick it other than it fits your data. If the science said there was a 15 year waiting period wouldn't your analysis be broken?

The 10-year lag is chosen based on goodness of fit of the correlation of the detrended series at different lags. The best correlations occur at around 10 years of lag. It could be 8 to 12 years.

I of course had no idea what the 10-year lag meant when I first wrote this post. Later, as I did some modeling based on the theoretical framework of climate sensitivity, equilibrium temperatures and so forth, I came up with a simple model that involved a hypothetical CO₂ concentration series that changed like a sine wave. The resulting temperature in the model changed like a sine wave as well, of course, but it lagged CO₂ by about 10 years.
