Wednesday, March 3, 2010

Too Easy To Be True?

Back in June of 2008, I wrote a statistical analysis of the association between northern hemisphere sea surface temperatures and the number of named storms in the Atlantic basin. I determined, with 99.993% confidence, that indeed there was such an association. I had controlled for coincidental trends (which can otherwise produce a spurious correlation) by means of detrending.

A commenter over at Climate Audit tacitly admitted reproducing my analysis, but pointed out that if he detrended the series using 6th-order polynomial trendlines, the association no longer held. I noted that if you allow for a lag of 1 year between the series, even the 6th-order detrending resulted in a statistically significant association, despite the loss of information that necessarily results from such a detrending. The existence of the lag between temperatures and named storms would soon become crystal clear to me and my (not very many) readers.
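For readers who want to try this kind of check themselves, here is a minimal sketch of polynomial detrending followed by a lagged correlation. It is not the spreadsheet I used; the function names `detrend_poly` and `lagged_correlation` are made up for illustration, and the two series in the demo are synthetic stand-ins, not the real temperature and storm data.

```python
import numpy as np
from numpy.polynomial import Polynomial


def detrend_poly(years, values, order):
    """Subtract a least-squares polynomial trend of the given order."""
    years = np.asarray(years, dtype=float)
    values = np.asarray(values, dtype=float)
    # Polynomial.fit rescales the x-axis internally, which keeps
    # high-order fits (such as order 6) numerically stable.
    trend = Polynomial.fit(years, values, order)
    return values - trend(years)


def lagged_correlation(x, y, lag):
    """Pearson r between x[t] and y[t + lag], for lag >= 0."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if lag < 0:
        raise ValueError("lag must be non-negative")
    if lag > 0:
        # Drop the unmatched ends so that x[t] lines up with y[t + lag].
        x, y = x[:-lag], y[lag:]
    return np.corrcoef(x, y)[0, 1]


# Synthetic demo (hypothetical data): storms follow temperature by one year.
rng = np.random.default_rng(0)
years = np.arange(1900, 2008)
temps = 0.005 * (years - 1900) + rng.normal(0.0, 0.2, years.size)
storms = 10.0 + rng.normal(0.0, 1.0, years.size)
storms[1:] += 5.0 * temps[:-1]

temps_resid = detrend_poly(years, temps, order=6)
storms_resid = detrend_poly(years, storms, order=6)
r_lag0 = lagged_correlation(temps_resid, storms_resid, lag=0)
r_lag1 = lagged_correlation(temps_resid, storms_resid, lag=1)
```

Comparing `r_lag0` with `r_lag1` shows whether allowing a 1-year lag strengthens the association after detrending. Turning the coefficient into a confidence level takes a further step, such as a significance test on the correlation, which this sketch leaves out.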

During the back and forth with the Climate Audit commenter, I realized that if you simply smooth out the noise from both series, the association becomes graphically evident, and a lot more convincing – I thought – than a statistical analysis. The first graph I posted used nothing more than a 21-year central moving average for smoothing. The results were remarkable and the graph was remarkably easy to produce.
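The smoothing step is simple enough to reproduce in a few lines. Below is a rough sketch of a 21-year central moving average; the function name `centered_moving_average` is hypothetical, and leaving the first and last 10 years blank is my assumption here, since a spreadsheet could just as easily handle the edges some other way.

```python
import numpy as np


def centered_moving_average(values, window=21):
    """Centered moving average; points without a full window are NaN."""
    values = np.asarray(values, dtype=float)
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    if window > values.size:
        raise ValueError("window is longer than the series")
    half = window // 2
    smoothed = np.full(values.shape, np.nan)
    kernel = np.ones(window) / window
    # 'valid' mode returns only the positions where the full window fits,
    # i.e. indices half .. len(values) - half - 1 of the original series.
    smoothed[half:values.size - half] = np.convolve(values, kernel, mode="valid")
    return smoothed
```

Applying the same function to both series, and plotting the results on two vertical axes against the year, gives a graph of the kind described above. Each smoothed point is the mean of the 10 years before it, the year itself, and the 10 years after it.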

As it turns out, it was also remarkably difficult to believe. A few months went by, and a reader (paulm) posted a link to my analysis over at the GW blog. The graph was met with a got-to-be-fake type of response.

When I found out, this obviously made me upset. You can tell I was upset as I was explaining my very lenient comment policy (see sidebar) in my response to the incident. I even posted the spreadsheet for verification. There were no further falsification accusations after I did this, and I thought that was the end of it.

Fast forward a year and a half. Deltoid has a recent post on the topic, quoting various IPCC statements, and I basically commented that the IPCC was wrong, in my view, with regard to the number of tropical cyclones not changing in response to global warming. The data told me otherwise.

I guess questioning an IPCC claim was a mistake, wasn't it? I might have also broken some social norm I'm not aware of or something. Some of the regulars started to talk to me as if I were one of the resident trolls, like El Gordo or Spangled Drongo. They basically accused me of fraud and trying to deceive, in a way that is not dissimilar to how the CRU team are accused of fraud and so forth.

Things calmed down after I, once again, posted the spreadsheet. I appreciate Bernard J's semi-apology.

For reference, below are the links to the data and the spreadsheet. I posted the spreadsheet at a more permanent location.

What else can I do? You've seen my comment policy. Should I post screenshots too?

I'd hate to think that my most interesting graphs are assumed to be fake a priori. Is my graph of Red Sea sea level and Vostok temperatures a fake? What about my graph of the natural spline interpolation of the Law Dome CO2 data? What about the one with the Mann et al. (2008) reconstruction and CO2 at the time of the industrial revolution? What about the graph with the Greenland temperature reconstruction?

I realize I often post claims that you can't just Google to confirm, and I realize that people are sometimes paranoid. That's why I have the comment policy I have.


Anonymous said...

I've not yet waded through all of the prior back and forth, but I thought I might suggest a source for some of the confusion-

Have you looked at the adjusted storm counts? The IPCC and others don't just look at the raw storm numbers, as there is reason to suspect that they have a large undercount bias as you move backward through time.

Joseph said...

@thingsbreak: I think the undercount bias occurs mostly in pre-1900 data, and you can sort of see that in the graph. It's possible there's a slight upward trend due to systematic bias, but it obviously doesn't explain away the graph. Also, the statistical analysis I did with detrending should take care of any systematic bias.

I haven't taken the time to look at adjusted counts, but I have no reason to expect things would be a whole lot different.

Hank Roberts said...

I've followed up at the Deltoid thread since it's lively, on these points:

1) humans see patterns exquisitely well, often when they don't exist
2) statistics can correct for that
3) hurricanes/cyclones and sea surface temperatures, including teleconnections and correlations, are getting a lot of attention; the IPCC didn't refer only to the North Atlantic. Look at the papers cited by the AR4 WG1, and the subsequent work. Figure out how much of the variation in whichever hurricane count is explained by whatever temperature index, for your chosen example; see if your number is comparable to the numbers published. As 'thingsbreak' points out, you can find different storm counts (how to count them is a hotly argued topic). You can also look at different indices of temperature, and at teleconnection from remote areas. Each approach will give different numbers.

That's where the statistics comes in useful, trying to estimate how much of the variation in one thing might (with a mechanism) explain the variation in the other.

For example -- look at sea surface temperatures in the equatorial Pacific (El Nino/La Nina) correlating with hurricanes.

Other cites at the Deltoid thread.

Joseph said...

@Hank: I have analyzed the data statistically as well, not just graphically. I mention that in the post.

Thanks for the pointers to additional data I could look at. Of course, I can't promise that I'll look at it. As this is basically a hobby for me, there are all sorts of other things that might pique my interest next.

Stu said...


I think a clear explanation of the method involved in producing the graph would help people seeing it for the first time, well, believe their eyes!

Myself, I was not sceptical because it contradicts the almighty IPCC. I was sceptical because it looks so damn perfect ;-)

Joseph said...

@Stu: I understand. The first time I came up with it, I had to double-check a few times. I thought there had to be a mistake.

VangelV said...

"I think the undercount bias occurs mostly in pre-1900 data, and you can sort of see that in the graph."

I do not think that your statement is valid. Without satellites and aeroplanes to monitor activity there were many significant storms that were not reported because they were not observed and evaluated. The most useful old data is related to reported landfall and that data does not provide any support for the discredited storm-AGW link.

Joseph said...

@VangelV: My claim that the undercount bias is mostly a pre-1900 problem is supported by Vecchi & Knutson (2008).