Friday, March 5, 2010

Essentially The Same Graph, Different Smoothing

This is probably the last post I will write on the topic of tropical cyclones in a long time. I don't intend to make this an all-tropical-cyclones-all-the-time blog. I promise.

I thought I should mention a paper I found – Mann & Emanuel (2006) – that contains a graph of storm counts and sea surface temperatures, shown below.



The smoothing is somewhat different to that of the graphs I've produced. It's decadal smoothing. They apparently use SST data for the Atlantic averaged over August-September-October. (I did something like this in my first analysis, but I do not believe it makes sense anymore.)

Mann & Emanuel then compare the decadally smoothed series and find a correlation coefficient R of 0.73 (p < 0.001, one-tailed). Based on these results, the authors concluded that:
There is a strong historical relationship between tropical Atlantic SST and tropical cyclone activity extending back through the late nineteenth century.

The way I see correlation coefficients, 0.5 is "not bad," 0.75 is "pretty good," 0.9 is "very good," and 0.99+ is "law of physics."

The confidence level is what really matters when it comes to causality, in my view. In my statistical analysis, where data had all the original noise, I found a detrended association with 99.993% confidence (equivalent to p < 0.00007) – allowing for a lag of 1 year.

I wondered, nevertheless: What would the correlation coefficient be like if we compare the 15-year central moving averages of the data I've been using? Let's do this only considering data since 1870, like Mann & Emanuel do. Let's also allow for a lag, given that such is evident in the graph.

I found something surprising. The optimal lag is 7 years, not a few years like I had presumed. The correlation coefficient R at this lag is 0.924 – well within "very good" territory.
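For readers who want to try this kind of lag scan themselves, here is a minimal sketch in Python with NumPy. It is not the code behind the numbers above. The function names, the convention that a positive lag means storm counts trail SST, and the 0-to-10-year search range are all assumptions made for illustration, and the series at the bottom are synthetic stand-ins rather than the real SST and storm data.

```python
import numpy as np


def centered_moving_average(values, window=15):
    """Central moving average; the output is shorter by window - 1 points."""
    values = np.asarray(values, dtype=float)
    kernel = np.ones(window) / window
    # mode="valid" keeps only positions where the full window fits.
    return np.convolve(values, kernel, mode="valid")


def lagged_correlation(driver, response, lag):
    """Pearson R between driver[t] and response[t + lag], for lag >= 0."""
    driver = np.asarray(driver, dtype=float)
    response = np.asarray(response, dtype=float)
    if lag > 0:
        driver, response = driver[:-lag], response[lag:]
    return np.corrcoef(driver, response)[0, 1]


def best_lag(driver, response, max_lag=10):
    """Return (lag, R) for the lag in 0..max_lag with the highest R."""
    scores = {
        lag: lagged_correlation(driver, response, lag)
        for lag in range(max_lag + 1)
    }
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Synthetic stand-ins (assumption): a random-walk "SST" series and a
# "storm" series that simply repeats it 7 steps later.
rng = np.random.default_rng(0)
sst = np.cumsum(rng.normal(size=140))
storms = np.roll(sst, 7)

lag, r = best_lag(centered_moving_average(sst),
                  centered_moving_average(storms))
print(lag, r)
```

Note that picking the best of several lags on smoothed series tends to push R upward, so a number obtained this way is better read as a description of how well the smoothed curves line up than as a significance test.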

You know what's interesting about the 7-year lag? 2005 minus 1998 is 7. I might come back to that some other time.

I conclude that Mann & Emanuel were on the right track, but they didn't make a strong enough case. There's a presentation by Tom Knutson of NOAA where he mentions the graph from Mann & Emanuel (2006), apparently thinking it's interesting, but wonders if the storm record is reliable enough to produce a graph like that. Then he goes on to discuss the long-term trend, which may or may not be statistically significant (as I already explained, causality and trend significance are different things), and the projections of computer simulations.

4 comments:

Anonymous said...

Aren't you going to get an inflated correlation coefficient by comparing smoothed data? And allowing for any of a number of possible lags would also seem to artificially increase the correlation. Alternatively, perhaps you compensated for these two effects using a method not explained here?

Anyway, I'm enjoying reading your blog (which I just discovered, via one of your comments at Tamino's site).

"J"

Joseph said...

@Anon: Thanks, and yes, you would. Like I noted, what really matters is the confidence level with full noise. I just wanted to compare the correlation coefficient with that of Mann & Emanuel, which (according to the paper) is calculated from the decadally-smoothed series.

That's basically statistical confirmation that the 15-year smoothing looks a lot better, and is therefore more convincing. Subjectively, that's clear.

Joseph said...

BTW, when I analyzed the full-noise data statistically, the association was significant at the 95% level with no lag, so no need for corrections there. It was simply more significant at a lag of 1 year. The best lag appears to be 7 years, but there could obviously be uncertainty there (which I haven't calculated, and honestly haven't thought about how that might be done).

Anonymous said...

Thank you for the very prompt replies. I agree that the visual comparison between the smoothed graphs is striking.