Sunday, March 28, 2010

GHCN Processor 1.1

Version 1.1 of GHCN Processor is now available for download. Relative to Version 1.0, the highlights are:

  • Implemented the first difference station combination method (Peterson et al. 1998).
  • Added -abs option, which causes the tool to produce an "absolute measurement" series, as opposed to an "anomaly" series.
  • Added a -ccm option (similar to -scm) that sets the grid cell/box combination method.
  • The default station and grid cell combination method is the linear-equation-based method (olem). This decision was based on simulations I ran comparing all three supported methods, using series with both equal and different underlying slopes.
  • Added -og option, which lets you write grid partitioning information to a CSV-format file. This feature is informative if you use the -gt seg option. (Zeke suggested producing a map. This is the next best thing.)
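
As a rough illustration of the first difference idea named in the highlights, here is a minimal Python sketch. It is not GHCN Processor's actual code: the function name and the year-to-temperature dict layout are assumptions made for the example. Each station contributes its year-to-year differences, the differences are averaged across the stations that report both years, and the averages are summed cumulatively.

```python
def first_difference_combine(stations):
    """Combine station series with a simple first-difference scheme.

    `stations` is a list of dicts mapping year -> temperature
    (a hypothetical layout chosen for this sketch).
    Returns a dict mapping year -> combined value relative to the
    first year, which is set to 0.0.
    """
    years = sorted({year for s in stations for year in s})
    combined = {}
    level = 0.0
    for i, year in enumerate(years):
        if i > 0:
            prev = years[i - 1]
            # Only consecutive years yield a first difference; across a
            # gap, or when no station has both years, the level carries over.
            diffs = [s[year] - s[prev] for s in stations
                     if year - prev == 1 and year in s and prev in s]
            if diffs:
                level += sum(diffs) / len(diffs)
        combined[year] = level
    return combined


a = {2000: 10.0, 2001: 11.0, 2002: 12.0}
b = {2001: 5.0, 2002: 5.5, 2003: 6.0}
print(first_difference_combine([a, b]))
```

Because the result is a running sum of averaged differences, it is an anomaly-like series anchored at the first year rather than an absolute temperature.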

See the release-notes.txt and readme.txt files for additional details.


Anonymous said...

Added -og option, which lets you write grid partitioning information to a CSV-format file. This feature is informative if you use the -gt seg option. (Zeke suggested producing a map. This is the next best thing.)

I've just posted a C++ program that takes a gridded CSV file and produces a map from it. The grid must be 2.5 by 2.5 equal area.

If you're interested, you can find out more here:

Joseph said...

Thanks Anon.

Anonymous said...

Hi Joseph
Thank you for this tool.

I have a problem with the output, and I am not sure if it is a bug on my system. The output is a CSV file, but the comma is also used as a decimal separator.

Is there a way to fix this? As it is now the output is ambiguous.

Best regards
Søren Rosdahl Jensen

Joseph said...

@Søren: Thanks for bringing that to my attention. I've uploaded a fix (v1.2).

The bug has to do with the default Java locale as it relates to the formatting of floating point numbers, which will depend on your Windows Regional & Language settings. I've changed it so it always uses English as the locale for this.
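
To illustrate why a decimal comma makes the file ambiguous, here is a small Python sketch. The sample rows are made up for the example and are not actual tool output.

```python
import csv
import io

# Hypothetical rows: year 1880 with a value of -0.25.
row_comma_decimal = "1880,-0,25"  # decimal comma, as in the buggy output
row_point_decimal = "1880,-0.25"  # decimal point, as after the fix


def parse_row(line):
    """Split one CSV line into its fields."""
    return next(csv.reader(io.StringIO(line)))


fields_bad = parse_row(row_comma_decimal)
fields_good = parse_row(row_point_decimal)

print(fields_bad)   # three fields: the value has been split in two
print(fields_good)  # two fields: year and value
```

A CSV reader cannot tell the decimal comma from the field separator, so the first row comes back with one field too many.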

Anonymous said...

Thanks Joseph.

Now it works.
My computer will now be busy processing, exploring the different options of the tool.

Best regards
Søren Rosdahl Jensen

Robert said...

Hey Joseph,
I was wondering how I would be able to use this tool to get data for a specific region. Like I am looking for all the data for a certain region in Eastern Canada. How could I use it to do so? Or could I?

Joseph said...

@Robert: Well, since Canadian region names are not part of the database, I'd recommend using latitude and longitude.

I've used this to get just the "mid west" in the US:

ghcnp -dt mean -include "country eq 'UNITED STATES OF AMERICA' && population_type eq 'R' && longitude < -95 && longitude > -115" -reg -o /tmp/us-really-rural-raw.csv

Robert said...

I am actually new to this program and it will take some time to get used to it but should I do something like:

ghcnp -dt mean -include "country" eq 'Canada' && Longitude < -50 && > -65 && Latitude > 50 %% Latitude < 64" -reg -o /tmp/can-eastlab.csv

Do you perhaps have any instructions on the tool that I could use? For example, what does "-dt mean" do, and the same for "-reg -o /tmp/can-eastlab.csv"?

Like does that just make it a regional composite and store it as a .csv in my temp files?

Thanks for your help

Robert said...

Hi Joseph,
For the record I have yet to download the program because I wasn't sure if it would do what I wanted to do but it seems like it will. Have you got any special instructions that I should follow with the 1.1 processor?

Joseph said...

@Robert: It's case-sensitive, so try something like:

ghcnp -dt mean -include "country eq 'CANADA' && longitude < -50 && longitude > -65 && latitude > 50 && latitude < 64" -reg -o /tmp/can-eastlab.csv

You can find information on options and syntax in the readme.txt file shipped with the tool.

Robert said...

Thanks Joseph,
Very helpful.

I was wondering how the program is accessed and installed. I set my path (under windows environment variables) to the bin directory but after that what do I use to open it and where do I type my code in?

Robert said...

Sorry if it's a dumb question

Joseph said...

@Robert: In Windows, you need to open a Command Prompt window. This is normally under All Programs / Accessories.

Robert said...

ghcnp -of monthly -o /tmp/ghcn-global-monthly.csv

I tried this code and it came back and said 'java' is not recognized as an internal or external command, operable program or batch file

I checked on the java website and it says I have the recommended version

I was doing this in the command prompt under the c:\

Any other steps I need to do?

Sorry about all the questions

Joseph said...

@Robert: After you install Java, you should have 'java' in your PATH. That is to say that when you type 'java' in the Command Prompt, it should say something other than 'command not recognized.'

It's possible that 'java' is not in the PATH. Try:

echo %PATH%

Does it have something like this in it?

c:\program files\java\jdk1.6.0_10\bin

Robert said...

I don't see Java as one of the files under path. I do see windows live, python, quicktime, matlab and ghcnp.

Robert said...

Java is not in my path file. I looked for the bin file (so I could move it into path) for the newest version of Java, but it's not with the other Java files, so it seems. Any suggestions?

Joseph said...

I'm pretty sure the JRE should be put in PATH when you install it. You could try installing the latest JRE.

If nothing else, see where Java is installed in your system (probably under program files/java) and add the bin directory to the PATH by hand.

Installing a JDK instead of a JRE should work too.
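
If a scripting language is at hand, the same PATH check can be automated. The following is a minimal Python sketch; the helper names are made up for the example. It reports where 'java' resolves from, or lists the PATH directories if it does not resolve.

```python
import os
import shutil


def find_java():
    """Return the full path of the 'java' executable on PATH, or None."""
    return shutil.which("java")


def path_entries():
    """Return the non-empty directories listed in the PATH variable."""
    return [p for p in os.environ.get("PATH", "").split(os.pathsep) if p]


java = find_java()
if java is None:
    print("java was not found; PATH currently contains:")
    for entry in path_entries():
        print("  " + entry)
else:
    print("java found at " + java)
```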

Robert said...

Hey Thanks for all the help Joseph. I got it working now. You don't incorporate any SST data into it do you? Is there any way of knowing which stations are being used and/or locations?

Joseph said...

@Robert: No, it only uses GHCN v2 data, which includes some small islands, but no SSTs.

You can't really tell which stations are being used without modifying the source code.

Robert said...

Hey Joseph,

Yeah it probably would be hard to get to know what stations are being used but if you ever get the time it would be useful because then they can just be plotted on a map.

I was trying to do my temperature analysis for the region in question and I ran into a snag

I used the code

ghcnp -dt mean -include “longitude < -55 && longitude > -71 && latitude > 50 && latitude < 61" -reg -o Global/eastlab.csv

but it doesn't seem to work. My location for the file is under c:\Global\eastlab.csv

At first the error said no output file specified and after I changed the code a bit it says system cannot find the file specified. What do you think that means?

Robert said...

Hey Joseph,
I did end up getting it all working for global and Canada and such, but I haven't been able to get it to work for my area described before. Any idea?

Joseph said...

@Robert: Try it this way:

ghcnp -dt mean -include "longitude < -55 && longitude > -71 && latitude > 50 && latitude < 61" -reg -o "c:\Global\eastlab.csv"

That works for me. Your first double-quote character was “ and not the regular double-quote character " that the command-line understands.
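
Typographic quotes often sneak in when a command is pasted from a web page or word processor. As a hedged illustration, this small Python sketch (the helper names are hypothetical) finds such characters in a command string and replaces them with plain ASCII quotes.

```python
# Typographic quotes mapped to their plain ASCII equivalents.
SMART_QUOTES = {
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
}


def find_smart_quotes(command):
    """Return (index, character) pairs for each typographic quote."""
    return [(i, ch) for i, ch in enumerate(command) if ch in SMART_QUOTES]


def straighten_quotes(command):
    """Replace typographic quotes with plain ASCII quotes."""
    return command.translate(str.maketrans(SMART_QUOTES))


cmd = 'ghcnp -dt mean -include \u201clatitude > 50 && latitude < 61" -reg'
print(find_smart_quotes(cmd))
print(straighten_quotes(cmd))
```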

Robert said...

Hello Joseph,
Thank you for your help with this. It worked fine. If I wanted to try the FDM method instead of the linear based method would I just put in the section for fdm ahead of -dt mean or behind it... Same question regarding placement of monthly versus yearly

The result of your method is similar to one I derived myself for the same region, but there are some differences that I've noticed. Although the basic trends are all the same, mine has a more pronounced early 20th century warming. I believe this is because I picked and chose which southernmost stations I used: I'm actually trying to study the northern region (Labrador), and the southern boundary is below that, but there is so little station data there that I had to include something.

Robert said...

The method I used is completely different from any method I've seen, and I'm wondering if perhaps you could tell me how it sounds. I average all the station data for each year to get a simple mean temperature composite. (Obviously flawed.)

Then I use this simple mean temperature composite as the reference station and simultaneously compute offsets between each station and the overlapping period of the temperature composite.

Thus the average offset is computed for each station. I then adjust each station by its average offset (add the offset) and iterate the whole process until the sum of all the offsets is equal to 0 (or close to it: 0.00097).

Any thoughts on this method...

It ended up unfortunately taking 17 iterations before I got it there but it seemed to give close to what you had (except with more late century warmth).
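
The procedure described above can be sketched in Python roughly as follows. This is one interpretation for illustration, not Robert's actual spreadsheet or code: the function names and the year-to-temperature dict layout are assumptions, and the loop stops when the largest absolute offset falls below a tolerance, which is a stricter stand-in for the "sum of offsets near zero" test (positive and negative offsets can cancel in a plain sum).

```python
def yearly_mean(stations):
    """Average all stations that report in each year."""
    totals, counts = {}, {}
    for s in stations:
        for year, value in s.items():
            totals[year] = totals.get(year, 0.0) + value
            counts[year] = counts.get(year, 0) + 1
    return {year: totals[year] / counts[year] for year in sorted(totals)}


def iterative_offset_composite(stations, tol=1e-3, max_iter=100):
    """Align non-empty station series to their own composite by iteration.

    Returns (composite, iterations_used).
    """
    adjusted = [dict(s) for s in stations]  # leave the inputs untouched
    iterations = 0
    for _ in range(max_iter):
        iterations += 1
        composite = yearly_mean(adjusted)
        # Offsets are all computed against the same composite.
        offsets = [sum(composite[y] - v for y, v in s.items()) / len(s)
                   for s in adjusted]
        for s, offset in zip(adjusted, offsets):
            for year in s:
                s[year] += offset
        if max(abs(o) for o in offsets) < tol:
            break
    return yearly_mean(adjusted), iterations


a = {2000: 10.0, 2001: 11.0}
b = {2001: 1.0, 2002: 2.0}
composite, n = iterative_offset_composite([a, b])
print(n, composite)
```

In this toy case the two stations overlap in a single year, and the mismatch in that year halves on every pass, which is why a run can need a dozen or more iterations before the offsets become negligible.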

Joseph said...

@Robert: I've tested the first-difference method in simulations, and I've found it to be worse than a simple temperature-anomaly-in-reference-period method. I suspect it accumulates errors.

The method you describe is similar to what the tool understands as olem.

The offset for a reference station is equal to 1/N times the sum of the relative offsets of the other stations. The other absolute station offsets are calculated accordingly. (This gives good absolute temperature measurements, not just anomalies.)

I use every station as a reference station once, and do a weighted average of the results.

Robert said...

Hey Joseph,
I tried to use this program to calculate temperatures for Greenland, but I only received 8 results, and those were not full. Any idea what the issue could be?

Joseph said...

@Robert: I think it's because there aren't many stations in Greenland. I do think there should be enough stations for a full record, though.

Did you try stations from the unadjusted data set?