Saturday, June 28, 2008

Hurricanes and Temperature are Indeed Associated

There is apparently considerable climate science that can be cited to show a clear association between global warming and either the number of hurricanes in a given season or their intensity. See, for example, Hurricanes and Global Warming - Is There a Connection?, written by a number of climate scientists. Both basic science and computer modeling can be used to predict what should occur under certain warming scenarios.

I'm generally inclined to trust scientific consensus and published science, particularly if it's peer-reviewed, unless I can advance a seriously strong argument explaining why I do not. Nevertheless, there's nothing like analyzing data first hand. Because I understand this, and because I understand some people out there don't trust some published science at all under the pretext of "conflicts of interest," I've acquired the habit of writing posts where I walk the reader through very accessible analyses of publicly available data. I combine this with a very lenient comment policy. My pledge is to only remove comments that clearly violate Blogger's content policy.

I already did this type of analysis in my post titled Anthropogenic Global Warming is Absolutely Occurring. This time I will look into the claim that global warming might have had an effect on the number of named storms in the Atlantic Basin, given that some people appear to doubt this claim. In doing so, I will try to go over additional details of the methodology which I might have left out in my previous post.

I will use data on the number of named storms from 1851 to 2006 provided by NOAA. I will use ocean surface temperature data for the northern hemisphere provided by the Climatic Research Unit of the University of East Anglia. For accuracy, since we're interested in the hurricane season, I will use June-November averages for each year.

Let's start by putting these two data sets in a chart, side by side. This will be Figure 1, which also shows trend lines for both temperature and number of storms. The trend lines are third-order polynomial fits (easily produced with Excel).

[Figure 1: number of named storms and temperature by year, with third-order polynomial trend lines]

The reader will note that both trends are pointing upward, at least for the last 60 years. This is not what we are interested in, however. We want to control for the fact that there could be a coincidence of upward trends. That's where the third-order polynomial fits come in.

The polynomial fits provide us a time-based model of each trend. For any given year they tell us what the "expected" temperature and number of storms should be. Of course, a given year might have more or fewer storms than expected. It will also have a higher or lower temperature than expected. In the end, what we want to find out is whether years with higher temperature than expected tend to have more storms than expected, and vice versa.

By subtracting trend line equation values from observed values, residuals of temperature and storms can be produced for each year. These residuals represent how different from "expected" an observed value is in a given year. If the trend model fits well, the residuals should show no remaining time trend. In our case, if you produce a scatter chart of year vs. temperature residual or storm residual, you will see the scatter trend is entirely flat. This is a basic confirmation that should be done after getting the set of residuals.
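For readers who would rather not use Excel, here is a minimal sketch of the same step in Python: fit a third-order polynomial to a yearly series and subtract it to get residuals. The function names are my own for illustration, and centering and scaling the years before fitting is an assumption added for numerical stability, not something the Excel trend line requires you to do by hand.

```python
def fit_polynomial(ts, ys, degree=3):
    """Least-squares polynomial fit via the normal equations.

    Returns [c0, c1, ..., c_degree] for c0 + c1*t + c2*t**2 + ...
    Needs more distinct t values than the degree, or it will fail.
    """
    n = degree + 1
    # Normal equations A c = b, with A[i][j] = sum(t**(i+j)).
    a = [[sum(t ** (i + j) for t in ts) for j in range(n)] for i in range(n)]
    b = [sum(y * t ** i for t, y in zip(ts, ys)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            b[row] -= factor * b[col]
    # Back substitution.
    coeffs = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][j] * coeffs[j] for j in range(i + 1, n))
        coeffs[i] = (b[i] - tail) / a[i][i]
    return coeffs


def trend_residuals(years, values, degree=3):
    """Observed value minus fitted trend value, one entry per year."""
    # Center and scale the years so that high powers stay numerically tame.
    mid = sum(years) / len(years)
    span = (max(years) - min(years)) or 1
    ts = [(year - mid) / span for year in years]
    coeffs = fit_polynomial(ts, values, degree)
    fitted = [sum(c * t ** i for i, c in enumerate(coeffs)) for t in ts]
    return [v - f for v, f in zip(values, fitted)]
```

Calling trend_residuals(years, storms) and trend_residuals(years, temperatures) would give the two sets of residuals to plot against each other. A quick sanity check is that each set of residuals sums to roughly zero.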

Figure 2 is a scatter chart of temperature residuals vs. storm residuals. The trend of this scatter should be flat, unless there's association between temperature and number of storms.

[Figure 2: scatter chart of temperature residuals vs. storm residuals, with linear trend]

What we see in Figure 2 is that if we try to fit a linear trend to the scatter, we do get a positive slope of 3.43. Now, we need to verify that we can state, with statistical confidence, that the slope is actually positive. In this case it is. The 95% confidence interval of the slope is 0.25 to 6.61. This is not a slam dunk finding like the one for the correlation between cumulative CO2 emissions and temperature, but it is statistically significant, which means an association between temperature and number of storms is demonstrated.
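The slope and its confidence interval can also be computed outside Excel. The sketch below is one way to do it with the Python standard library. The function name is hypothetical, and the interval uses a normal approximation to the t critical value, which is an assumption that is only reasonable for large samples such as the roughly 150 years used here.

```python
from math import sqrt
from statistics import NormalDist, mean


def slope_with_ci(xs, ys, confidence=0.95):
    """Least-squares slope of ys on xs, with an approximate confidence interval."""
    n = len(xs)
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Sum of squared deviations of the points from the fitted line.
    sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    std_err = sqrt(sse / (n - 2) / sxx)
    # Normal approximation to the t critical value (assumption: large n).
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return slope, slope - z * std_err, slope + z * std_err
```

If the lower bound returned is above zero, the slope is positive at the chosen confidence level. For small samples a t critical value should replace z, which widens the interval somewhat.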

Given the methodology used, this result cannot be explained as a coincidental trend.

There are some peculiarities about the data which are interesting. For example, it is clear that the 2005 Atlantic season was an unusual one, even after controlling for the time trend of named storms. It could be placed in a group of seasons that only occur every 50 years or so. Evidently, the fact that the seasons that came after 2005 did not measure up is inconsequential to the finding that temperature associates with the number of named storms.

We can, however, pose the following question: What sort of temperature increase would be required for the average season to be like the 2005 season? Given the slope of the scatter in Figure 2, it would seem that a temperature anomaly of 4.05 degrees (C) would be required for this. The current temperature anomaly is about 0.6 degrees (C), so such an eventuality appears to be far off. Or is it?

I ran a second residual correlation analysis of temperature vs. number of named storms one year later. This actually produces a considerably steeper slope (6.36) and the confidence interval is entirely positive even at 99.993% confidence. I can't really explain why this would be the case. But here's the thing. If we were to take this new slope at face value, a temperature anomaly of 2.18 degrees (C) would be enough to make the average season similar to the 2005 season.
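As a back-of-the-envelope check, the two temperature figures quoted above are consistent with each other if the required anomaly is simply the 2005 storm excess divided by the slope. That reading is my assumption about how the numbers were obtained. The variable names below are only for illustration.

```python
slope_same_year = 3.43    # storms per degree C, same-year residual scatter
slope_next_year = 6.36    # storms per degree C, storms one year later
anomaly_same_year = 4.05  # degrees C quoted for a 2005-like average season

# Storm excess implied by the first slope and the quoted anomaly.
implied_excess = slope_same_year * anomaly_same_year

# Anomaly that would produce the same excess under the steeper slope.
anomaly_next_year = implied_excess / slope_next_year

print(round(implied_excess, 1), round(anomaly_next_year, 2))  # prints: 13.9 2.18
```

In other words, both figures correspond to a season with roughly 14 more named storms than the trend would predict.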

Anthropogenic Global Warming is Absolutely Occurring

[Originally posted at Natural Variation.]

I need to ask for the reader's indulgence, as this post is not about autism, except insofar as determining the merit of correlations has become a perseveration of mine. You see, it is trivial to come up with naive correlations of autism trends vs. practically anything about the modern world. The administrative prevalence of autism has been increasing almost continuously since records have been kept. Concurrent upward trends of nearly anything, from vaccines to environmental pollution, from trans fats to electromagnetic radiation, and so on, are easy to come by.

In my latest post at LB/RB I suggested that instead of correlating trends in a naive manner, we could attempt to correlate the residuals of time regression models of each trend. A residual is a delta or difference between an observed value and a modeled value. (Here's a concise explanation).

When modeling real world phenomena, regression models will never (or almost never) be perfect fits. For all sorts of reasons, even if simply random fluctuation, there will be deviations from a modeled trend. If there's a causative relationship between two trends, the residuals of (or deviations from) corresponding close-fitting regression models should correlate with one another as well. By this I don't mean that the residuals should always be in the same direction; but they should be in the same direction more often than not, on average.
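To make "in the same direction more often than not" concrete, here is a rough illustration that counts how often two sets of residuals share a sign. This is not the test used in this post, which relies on the slope of a linear regression. The function name is hypothetical.

```python
def same_direction_share(res_a, res_b):
    """Fraction of years in which both residuals have the same sign.

    Years where either residual is exactly zero are skipped.
    """
    pairs = [(a, b) for a, b in zip(res_a, res_b) if a != 0 and b != 0]
    if not pairs:
        raise ValueError("no usable residual pairs")
    agree = sum(1 for a, b in pairs if (a > 0) == (b > 0))
    return agree / len(pairs)


# Toy residuals, made up for the example.
share = same_direction_share([1, -1, 2, -2], [0.5, -0.2, -1, -3])
print(share)  # prints: 0.75
```

A share near 0.5 is what unrelated residuals would tend to give, while a share well above 0.5 hints at an association worth testing properly.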

The nice thing about this technique is that it is completely accessible to anyone with Excel installed. It can also be illustrated graphically, as the reader will see.

So it occurred to me to test this idea in a different field of science where there's controversy over correlation vs. causation. I thought global warming would be a great candidate. After all, the spoof about a decrease in the number of pirates correlating with many other arbitrary trends appears to originate in the global warming debate (see this).

To summarize what I found, there is a strong and statistically significant correlation between cumulative human CO2 emissions and northern hemisphere temperature anomalies. Because of the methodology used, I'm quite confident this cannot be explained by coincidence, data collection errors, solar output as a confound, or causation in the opposite direction.

Now, I fully recognize that I'm only superficially familiar with the debate over anthropogenic global warming. I am also not versed in climatology. Therefore, I cannot be entirely sure that this type of analysis hasn't been done before. Google and Google Scholar searches didn't seem to turn up anything, and given the importance of the topic, I thought it was not only prudent but necessary to put this evidence out there. As always, scrutiny and discussion are welcome.

Northern hemisphere temperature data from 1850 to 2004 was obtained from the Climatic Research Unit of the University of East Anglia, UK.

Global CO2 emission data was obtained from CDIAC. I did not use CO2 atmospheric concentration data because temperature increases can theoretically cause this concentration to increase. Human emissions are what we're interested in. More specifically, I calculated cumulative CO2 emissions for every year since 1850. Greenhouse temperature anomalies are presumably caused by the total amount of CO2 in the atmosphere, not by the emissions in any given year. Since CO2 stays in the atmosphere for 50 to 200 years (source), modeling the cumulative human contribution of CO2 should be adequate.
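Turning annual emissions into cumulative emissions is just a running total. A minimal sketch follows, with made-up annual figures in arbitrary units rather than actual CDIAC values, and a hypothetical function name.

```python
from itertools import accumulate


def cumulative_emissions(annual):
    """Running total of annual emissions, in the same order as the input."""
    return list(accumulate(annual))


# Made-up annual emissions for five consecutive years (arbitrary units).
annual = [10, 12, 15, 11, 20]
print(cumulative_emissions(annual))  # prints: [10, 22, 37, 48, 68]
```

Note that a running total never goes down, so it ignores any removal of CO2 from the atmosphere over time.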

Figure 1 (click to enlarge) is a graph of the general time trends of these two sets of data. It also shows the modeled trend lines we will use to calculate residuals. In this analysis we're using third-order polynomial models. They seem to give a considerably closer fit than second-order polynomial models.

[Figure 1: cumulative CO2 emissions and temperature over time, with third-order polynomial trend lines]

I calculated the residuals and built a scatter graph matching cumulative CO2 (X axis) and temperature (Y axis) residuals for each year from 1850 to 2004. As expected, the slope of a linear regression of the scatter was positive (1.9 x 10^-5) and statistically significant (95% confidence interval 1.13 x 10^-5 to 2.66 x 10^-5).

[Note: Instructions on how to calculate the slope confidence interval of a linear regression with Excel can be found here.]

I suspected, however, that there should be lag between cumulative CO2 fluctuations and temperature fluctuations. It presumably takes some time for heat to be trapped. I proceeded to create a moving average trend line of the temperature residuals. It did in fact have a similar shape to the cumulative CO2 residuals graph, but it appeared to lag it by about 10 years. The reader should be able to roughly see this lag in Figure 1.
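A moving average of the residuals can be sketched as follows. I am assuming a trailing window here, meaning each value averages the current year and the years just before it. The window length is left as a parameter because no particular period is stated above, and the function name is hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; the first window-1 positions are None."""
    if window < 1:
        raise ValueError("window must be at least 1")
    out = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        out.append(running / window if i >= window - 1 else None)
    return out


print(moving_average([1, 2, 3, 4, 5], 3))  # prints: [None, None, 2.0, 3.0, 4.0]
```

Smoothing the temperature residuals this way makes it easier to compare their overall shape with the CO2 residuals by eye.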

So I re-ran the whole analysis by only considering the years 1850 to 1997 and correlating CO2 residuals with residuals of temperature 10 years later. The correlation between these two sets of data is remarkable. Let's start with a bar graph of both sets of residuals, Figure 2.
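The lagged pairing can be sketched like this: each CO2 residual is matched with the temperature residual a fixed number of years later, and years without a partner are dropped. The function name and the dictionary layout (year mapped to residual) are assumptions made for the example.

```python
def lagged_pairs(x_by_year, y_by_year, lag):
    """Pair x in year t with y in year t + lag, skipping years without both."""
    return [
        (x_by_year[year], y_by_year[year + lag])
        for year in sorted(x_by_year)
        if year + lag in y_by_year
    ]


# Toy residuals keyed by year, made up for the example.
co2_res = {2000: 1.0, 2001: 2.0, 2002: 3.0}
temp_res = {2001: 10.0, 2002: 20.0, 2003: 30.0}
print(lagged_pairs(co2_res, temp_res, 2))  # prints: [(1.0, 20.0), (2.0, 30.0)]
```

The resulting pairs are what go into the scatter graph and the linear regression. A longer lag leaves fewer usable years at the end of the record.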

[Figure 2: bar graph of cumulative CO2 residuals and temperature residuals 10 years later]

Figure 2 is a good graph to get a subjective sense of the correlation. Let's see if the math confirms this. Figure 3 is the scatter graph of the residuals.

[Figure 3: scatter graph of cumulative CO2 residuals vs. temperature residuals 10 years later]

The slope of a linear regression of the scatter is 2.6 x 10^-5, and it is statistically significant (95% confidence interval 1.88 x 10^-5 to 3.33 x 10^-5). Even the 99.99999999% confidence interval is entirely positive. Unless anthropogenic global warming is a reality, there is no apparent reason why the residuals of cumulative human CO2 emissions should correlate so well with the residuals of temperature 10 years later throughout the last 150 years.

The slope of the scatter is actually steeper than expected, if you consider the naive correlation between cumulative CO2 emissions and temperature. There are probably several reasons for this. The one I believe to be the most likely is that over time CO2 does get removed from the atmosphere. Adding this consideration to the analysis should produce a more accurate slope. The other potential reasons don't bode so well for our species.

[Update 2/22/2010: I have written a follow-up titled Statistical Proof of Anthropogenic Global Warming v2.0.]

Hello World

I will use this blog to write about topics unrelated to autism. This was prompted by a post on Global Warming that I wrote on my primary blog, Natural Variation. I have at least a few more such posts planned.