Monday, April 5, 2010

Urban Heat Island Effect - Probably Negligible

Previously I had discussed the difference between rural and urban temperature stations in the U.S. Commenter steven argued that population assessments (R, S and U) in GHCN v2 might be outdated and – in general – not very good proxies of what we really want to measure.

I then compared rural stations in the Midwest (a low-population-density region of the U.S.) with all rural stations. There wasn't a major difference between these two sets of stations either. Commenter steven was not convinced, however. He posted some satellite pictures of rural stations that are located in what appear to be suburban areas.

How could we measure the impact of human populations on station temperature with the data available to us? It's clearly not enough to express doubt and speculate about what might be going on.

Here's what I came up with. There's a vegetation property in the station metadata. If you look at stations in regions that are forested (FO), marshes (MA) or deserts (DE), they appear to be actually rural. I looked at a subset of such stations in Google Maps, and they are not close to human settlements, with few exceptions. The GHCN Processor command I used to obtain a temperature series is the following.

ghcnp -dt mean -include "population_type=='R' && (vegetation=='FO' || vegetation=='MA' || vegetation=='DE')" -o /tmp/global-rural-plus.csv

575 stations fit these characteristics. For comparison, I got temperature series for big cities (population > 0.5 million) and for small towns and cities (population <= 0.5 million). I calculated 12-year moving averages in each case, which is what you see in the figure below.
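For readers who want to reproduce the smoothing step, here is a minimal sketch of a 12-year moving average. It assumes a trailing window over a plain list of annual values. Whether the figure used a trailing or a centered window is not stated, and the sample series below is made up purely for illustration.

```python
def moving_average(values, window=12):
    """Trailing moving average: one output value per full window."""
    if window <= 0:
        raise ValueError("window must be positive")
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just fell out of the window.
            running -= values[i - window]
        if i >= window - 1:
            out.append(running / window)
    return out


# Hypothetical annual anomalies (deg C), for illustration only.
annual = [0.1 * (year % 5) for year in range(1950, 1980)]
smoothed = moving_average(annual, window=12)
print(len(annual), len(smoothed))
```

A series of N annual values yields N - 11 smoothed values, so the smoothed curve starts 11 years after the raw series does.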

There might be some differences, but they are always small, and we've now compared several different station sets, globally and at the U.S. level.

An argument could also be made that small human settlements increase the albedo of an area, so they might have a cooling effect.

Addendum (4/5/2010)

Here's an actual UHI finding of interest. I compared cities of population over 2 million with towns whose population is between 10,000 and 15,000. The difference is more pronounced in this case.

The overall effect is still negligible, nevertheless: the number of cities decreases exponentially with population size, so very large cities are few.


steven said...

You are still missing the point.

The point is not to grab metadata willy-nilly and throw it at a meat grinder, or to make suppositions about what a place might look like based on metadata, or to assume that rural means not infected by UHI. Land use can be a huge issue.

In the end the UHI effect may be small. I think it probably is. But even Peterson 2003 noted that the absence of UHI in the record was counterintuitive to the tenets of climate science. Go read that paper and tell me the three assumptions that the paper rests on. You'll see that they are metadata assumptions.

For the record: I believe in AGW. I believe that GHGs are to blame. I believe we should take action.

Nothing one finds in the study of UHI will change that. But that doesn't change the need to look at the problem in the best way we can, with the best attention to detail we can muster.

PS. You might want to check on the provenance of that metadata (Ron B has an update of it that might be better).

Let me give you a clue on Peterson 2003.

Peterson argued that we SHOULD find a difference between urban and rural, but that we didn't. To explain this "mystery" (his word, not mine) he relied on two postulates (his word, not mine).

What were those assumptions and how do we test them?

If I get some time I will create a Google tour of your latest effort, or I can shoot you the R code and you can transliterate it.

Joseph said...

Do you agree that the Forested/Marsh/Desert areas are a good proxy of "really rural" areas, steven? I'm sure you can get a hold of the relevant satellite maps.

I do think it's a bit of a mystery. I would've expected to see a bigger difference in the raw data.

I guess UHI is neither a linear nor a continuously increasing response. For big cities, most of the effect could have occurred early on in their history. Small cities might not have a UHI at all, and could even contribute albedo.

steven said...

And a couple more things.

This is definitely the worst of the lot when it comes to metadata. It's why NASA is not using it anymore. Nightlights is preferred.

But which version? Just read the description of where the population data comes from.

Also, in that coding, R = <10K people.

But we know that UHI happens at populations less than 10K. IT'S NOT ABOUT POPULATION. It's about what the population does to the SITE.

Anyway, your selection criteria for towns are way off.

On the field which you selected: that's a general topography for the area (where area is not specified) from navigational charts.

So you looked at three population classifications:

R (GHCN rural): <10K
Small urban: >10K and <500K
Large urban: >500K

Now GHCN puts U at being greater than 50K.

The problem here is, again, that UHI is a thresholded variable. As I've explained to you before, population is a PROXY.

Let me just draw a hypothetical response curve for you. This is ONLY to illustrate the KIND of problem you get:

UHI Effect    POP
0             0
0.1           100
0.2           200
0.3           1000
0.4           10000
0.5           100000
0.6           1000000
0.61          10000000

So when you bin using R, you are binning in the steep part of the response curve. Again, these numbers are for ILLUSTRATION only. You see, you still think that population is
A. accurate in the metadata
B. the best proxy.
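The binning point can be made concrete with a short sketch. It takes the illustrative curve above at face value (the numbers are not measurements), interpolates it linearly between the listed points (an assumption), and reports how much of the hypothetical effect falls inside each of the three population bins used in the post.

```python
from bisect import bisect_right

# Illustrative (population, UHI effect) pairs from the comment; not measured data.
CURVE = [(0, 0.0), (100, 0.1), (200, 0.2), (1_000, 0.3), (10_000, 0.4),
         (100_000, 0.5), (1_000_000, 0.6), (10_000_000, 0.61)]
POPS = [p for p, _ in CURVE]


def effect_at(pop):
    """Linearly interpolate the illustrative curve at a given population."""
    if pop <= POPS[0]:
        return CURVE[0][1]
    if pop >= POPS[-1]:
        return CURVE[-1][1]
    i = bisect_right(POPS, pop)  # first index whose population exceeds pop
    (p0, e0), (p1, e1) = CURVE[i - 1], CURVE[i]
    return e0 + (e1 - e0) * (pop - p0) / (p1 - p0)


def spread(lo, hi):
    """Change in the hypothetical effect across one population bin."""
    return effect_at(hi) - effect_at(lo)


# Bin edges as used in the post: rural, small urban, large urban.
BINS = {
    "rural (R)": (0, 10_000),
    "small urban": (10_000, 500_000),
    "large urban": (500_000, 10_000_000),
}

for name, (lo, hi) in BINS.items():
    print(f"{name}: spread {spread(lo, hi):.2f}")
```

Under these illustrative numbers the rural bin alone spans 0.4 of the 0.61 total, so averaging all R stations together would mix stations with very different hypothetical effects.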

To that now you have added the data from the TOPO flag.

Let me give you an idea of the things that matter.


steven said...

Hi Joseph,

WRT the Topo flags: when I read the metadata description and the source, I just discounted it as useless. A long time ago I tried to do some analysis on JUST DESERTS, which I thought would be a cool, counterintuitive way of looking at the problem. I recall that the metadata wasn't that hot. That said, I ALWAYS think it's fun and instructive to study the data we actually have, with an eye toward improving it.

At the Google Code site I linked to, you can just pull down whatever you like. I put up the KML files so you can actually tour a whole selection of stations. I'm modding somebody else's code. It's R. If you want it, just shoot me a mail.

I forgot all my Java, haha, and I've just started down this "must learn R" path.

Wait, check back on the Google Code site and I'll just upload the code as I got it.

The R knowledge required to mod it is not that great. It will be listed as Peters Code. My mod's done, but it's easier for you to just hack his.

steven said...

Code's up.

steven said...


Ron B's update does not have to do with the metadata you used, sorry. More later.

Also, Nick Stokes did something that was curious on airport versus non-airport.

Everyone complains about airports, BUT they tend to be located in areas where the boundary layer is "normal", that is, low buildings, etc.

When the wind speed gets above certain values (like xx meters/sec), the wind will "blow" the UHI away: increased turbulence and vertical mixing. With no wind the heat is stuck, like an inversion layer for example.

Anyway, for grins, the following comparisons are potentially fruitful: coastal versus non-coastal (with an urban/rural split), and all airport comparisons.

Joseph said...

"Let me just draw a hypothetical response curve for you."

But see, that's completely hypothetical. Further, it assumes that UHI always exists for any given town. This probably depends on the town.

What we're interested in is not really what the population does to the area. We're interested in what the population does to the area, after the thermometer was installed.

If you have a specific hypothesis, what is it? How do we test it?

Joseph said...

Note: I've posted an addendum that shows UHI is real, but still negligible.

steven said...

"But see, that's completely hypothetical. Further, it assumes that UHI always exists for any given town. This probably depends on the town."

A. The purpose of the curve is to get the IDEA through to you. If you want to see a real curve, then just read Oke. Or look at the building height curve I posted for you, or read the Bubble study, or download the EU UHI study I have posted up.

B. Of course it depends upon the town. That is WHY you can't just use a simple proxy to look for the effect or to understand it. Remember, UHI is a physical reality that has been studied and acknowledged by climate science since it was first identified in the 1800s. It depends upon the physical geometry. Why? Because the geometry changes everything. Now, more people usually means more changes to geometry, but only up to a point: a threshold. The other effect of more building is changes in surface properties, again thresholded. So if you want an actual curve for the response to "population", then your best bet is Oke, but even there it is very hard to study the effects of moving from PRISTINE to small rural, to larger rural, to small urban, to urban, to megalopolis.

Start with some school notes, and see the bibliography at the bottom: 3-notes.doc

Or here is a nice one that shows how the effect varies with distance from the urban center and the role of wind, along with references to Oke.

Now, for a different view of what meta data might be interesting consider Gallo.

Gallo comes close to using what you proposed.
You used the TOPO field in the metadata, but if you look at the vegetation field (the 16-character-wide beast), that one is probably derived from an NDVI product (Ron B, I believe, has an update of that). In any case, there are several products that will be instructive: nightlights, ISA (impervious surfaces), NDVI, and population density. ALSO, the other factor, wind, needs to be looked at. Parker's seminal UHI study used wind data (NCEP, I believe) to show that UHI did not infect the record.

But start here:

You continued:

" What we're interested in is not really what the population does to the area. We're interested in what the population does to the area, after the thermometer was installed."

I think we have a little bit different perspective. The goal is to find those stations that have the minimal amount of human changes to the local environment.

1. Picking small populations is a start.
a. It needs to be CURRENT population.
b. We can't be in a case of deurbanization, so check the historical records.
2. Picking for small densities helps more.
a. Density drives the height of buildings.
b. Density drives waste heat.
3. Nightlights helps.
a. Nightlights and density together are a good proxy for ISA.
b. We don't tend to randomly light open fields.
c. Nightlights is only as accurate as the location of the site (lat/lon mismatch).
4. NDVI is a good filter as well.
5. For some parts of the world, like the US, we have satellite products for irrigated lands, and land use records going back more than a century.

So, yes we are interested in what a population does to an area AFTER the station was installed.

That means:

1. Good historical research.
2. The best meta data about the area today.

It doesn't mean grinding the data at hand.

steven said...

"If you have a specific hypothesis, what is it? How do we test it?"

Specific hypotheses:

1. If you start with garbage metadata you get garbage results. They may confirm your bias, but they are still garbage results.

2. If you use GHCN coding (R, S, U) to select rural areas, you will select some areas that have phenomenological evidence of UHI causes present.

3. If you use nightlights to select rural, you will also be fooled in a small percentage of cases.

4. If you use NDVI you will also pick areas where land use drives the equation.

We test these hypotheses by using the metadata to select stations, then looking at each and every one in detail and determining whether the station fits or doesn't fit with the kind of stations developed specifically for the purpose of collecting good data (CRN). AFTER you do that, you can begin to look at the following:

A. Do we even have enough stations? If UHI is a small effect (some suggest as much as .1C per decade), then identifying it will require a test with significant POWER so as to avoid type II errors in testing.
B. Are they spatially diverse?

So, short answer: with the current metadata you have, I'm not sure it's even worth looking at. GIGO.

Finally, you have to look at whether the series has been adjusted ( say TOBS ) and if the error in that correction ( the SE of prediction for TOBS is 3-6 times greater than the measurement error ) is too large.

This last point is lost on everyone.

If your monthly temp average is 10C, that average has an sd of .03C (Jones and Brohan 06).

That means you are 95% sure the real temp was between 9.9C and 10.1C.

If you then ADJUST this temperature because of changes in the time of observation, you get the following problem. Let's say that 10C measure was taken at 6AM, but every other station takes measures at midnight (this is quite common). What is done in the US (USHCN, which is part of GHCN) is the following. That 10C measure is adjusted based on a computer model. The computer model is an empirical model that says the following: if your average temp was 10C (when your measures were taken at 6AM), then we PREDICT that the midnight temp will be .5C lower, +-.3C. That's the error in the prediction that adjusts the temp. So 10C +-.1C becomes 9.5C +-.4C (roughly).

So, while we have a better estimate of the mean (all temps are corrected for the same observing time), we have increased the error. At this point no one who calculates a global temperature series takes the error of adjustment prediction into account in the final error budgets.
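The arithmetic in that example can be sketched as follows. The "roughly +-.4C" figure matches simply adding the two errors. If the measurement error and the adjustment error were independent (an assumption, not something stated above), they would instead combine in quadrature to a somewhat smaller value. Either way the adjusted value carries more uncertainty than the raw one. The function names are made up for this illustration.

```python
import math


def combine_linear(*errors):
    """Worst-case combination: the errors simply add."""
    return sum(abs(e) for e in errors)


def combine_quadrature(*errors):
    """Root-sum-of-squares; appropriate only if the errors are independent."""
    return math.sqrt(sum(e * e for e in errors))


measurement_err = 0.1   # deg C, from the example above
adjustment_err = 0.3    # deg C, prediction error of the adjustment
adjusted_temp = 10.0 - 0.5

print(f"adjusted temp: {adjusted_temp:.1f} C")
print(f"linear:     +-{combine_linear(measurement_err, adjustment_err):.2f} C")
print(f"quadrature: +-{combine_quadrature(measurement_err, adjustment_err):.2f} C")
```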

Performing statistical tests (rural versus urban) must take notice of the fact that the underlying error structure of these two series may be different. HINT: the rural sites get adjusted because they tend to have human observers, as opposed to, say, an ASOS or newer stations (more likely in urban areas) that are fully automated.

In terms of looking at the current metadata. Running numbers on that is instructive just to get an idea of the numbers of stations that meet particular criteria.

In the US for example, one needs over 200 stations to resolve warming trends of .1C per decade.

That's 50% of the trend we associate with AGW.

600 stations get you closer to .05C per decade.
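One way to see how those two station counts relate is a rough scaling sketch. It assumes the resolvable trend shrinks like 1/sqrt(N) for N independent stations, anchored at the 200-station, .1C-per-decade figure above. Both the independence and the scaling law are assumptions made for illustration, not something established in this thread, and real stations are spatially correlated.

```python
import math

REF_STATIONS = 200   # anchor point taken from the comment above
REF_TREND = 0.1      # deg C per decade resolvable at the anchor


def resolvable_trend(n_stations):
    """Smallest resolvable trend under an assumed 1/sqrt(N) scaling."""
    return REF_TREND * math.sqrt(REF_STATIONS / n_stations)


def stations_needed(trend):
    """Station count needed to resolve a given trend under the same scaling."""
    return math.ceil(REF_STATIONS * (REF_TREND / trend) ** 2)


print(f"600 stations: ~{resolvable_trend(600):.3f} C per decade")
print(f"stations for 0.05 C per decade: {stations_needed(0.05)}")
```

Under that assumed scaling, 600 stations land a little above .05C per decade, which is consistent with "closer to" rather than "at".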

Joseph said...

@steven: No, the point is not to show that UHI does or doesn't exist, or even to test a model of UHI. The point of the posts is clearly to determine whether UHI biases the instrumental temperature record. This is what readers will be interested in, obviously.

There's no evidence that it does. The existence of UHI is not sufficient evidence to show that it biases the record. To show that it biases the record, you need to have data that demonstrates exactly this. Everything else is just distraction from the main point. Period.

steven said...

Ah, I see your last chart.

Ya, well, between 1971 and 2009, let's say the last 40 years, you see a rise of .9C for the worst and .3C for smaller places (not the best).

That's about .13C per decade. Ross McKitrick (the paper that Jones wanted to keep out of AR4) argued for .1C per decade. Jones sees it at .05C for the century.

The issue is, even at .1C per decade since the 1970s, you have to remember that this is over the land, and the land is 30% of the total. So it moves the total globe numbers down only a fraction, probably somewhere between Jones's number and McKitrick's number. Certainly NOT a blow to AGW, but it's not nothing.
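As a quick worked version of that scaling, here is a sketch that weights a land-only bias by the land fraction. It assumes the 30% land share quoted above and no corresponding bias over the ocean. The function name is made up for this illustration.

```python
def global_contribution(land_bias_per_decade, land_fraction=0.30):
    """Scale a land-only bias to a global-mean contribution.

    Assumes the bias applies uniformly over land and is zero over ocean.
    """
    return land_bias_per_decade * land_fraction


land_bias = 0.1  # deg C per decade over land, the figure discussed above
print(f"global contribution: {global_contribution(land_bias):.2f} C per decade")
```

A .1C-per-decade land bias therefore works out to about .03C per decade globally under these assumptions.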

OH, good work. I always forget to say that, but it is really fun and inspiring to see people look at this stuff for themselves. Running the numbers for yourself, you can say that, given the data we have, we can't say that all the warming is due to UHI. We might see that some of it is, but clearly not most of it, clearly not the majority. If UHI were a huge problem it would pop right out of the simplest analysis. The fact that it doesn't pop right out should be comforting. For me, it's challenging to find that little effect. I hope you get this perspective.