How do I select representative stations from 112 stations to do statistical analysis of the temp, rainfall and humidity?

Published on February 4, 2015 14:22 by Elias Nkiaka, University of Leeds - PhD candidate, School of Geography, Faculty of Environment

I am modelling a 75,000 sq km watershed with no in situ hydro-meteorological measurements, so I am depending on data sets that I downloaded from the Climate Forecast System Reanalysis website. After clipping the data to my study area, I have 112 stations, but it is not possible for me to analyse the data station by station due to time constraints. The resolution of the data is 38 km. SWAT is a catchment model used for rainfall-runoff modelling and climate change impact studies. It is also used for land use change impact studies in watersheds.

6 Answers

Cluster analysis will solve the issue.

Answered on February 12, 2015 11:46 by Sui Taiwan
Hi Elias, First of all, I should say that I am not an expert in this matter, so my comments may not be useful at this point in your research, but have you tried some kind of GIS-related software, like BASINS? Maybe you can try to download the data from the stations and group them... or you can also try HSPF to complement SWAT... I cannot quite understand what you mean when you say that "you don't have in situ measurements" if you have the data from the 112 stations... How long is the time scale for your simulation and analyses? This will be a very important point for the data load, as well as for data restrictions and data management... I would also try to cluster them by similarity of the sampling area and the characteristics of the location... In any case, good luck!!!

Answered on February 9, 2015 14:17 by Gema Martínez, Water Centre for Latin America and the Caribbean - PhD student
Just to clarify my assumptions here: where you say "no in situ measurements for hydro-meteorological data" I assume that means "no precipitation measurements", because then you say "I have 112 stations", which I assume are stream discharge data. To narrow down the pool of evaluations, you could:

1. Select 52 stations at random from the population (somewhat akin to a jackknife procedure, 52 being approximately the number of CFS Reanalysis grid points in your watershed based on the stated grid resolution: 75,000 sq km / (38 km)^2 ≈ 52).
2. Do a cross-correlation analysis of the discharge time series for all 112 stations (a simple step, even in MS Excel), rank the correlations from least to most (keeping the information on which stations produced each correlation value), and then select the station pairs with the lowest correlations until you have a manageable collection of stations (however many you have time to analyze) for your evaluation step.

The idea here is to account for as much of the overall variability of stream discharge observations across the watershed as you can. Two stations that are in series on the same stream will likely be highly correlated, so you don't necessarily need both of those stations, as you'd be repeating information in your analysis. Two stations on different streams will be less correlated, so having both will be useful to gauge how well you are representing the spatial variability of precipitation-runoff processes in the watershed. Two stations on opposite sides of the watershed will (likely) be quite different and have a low correlation.

Throwing all of the stations into that mix means that you get station relationships like those, but also some internal to the watershed, in different sub-watersheds, some at headwaters and some at outlets, etc. The likely outcome is a pool of station locations that are spread out all over the watershed, representing both the modeled area and its internal variability.
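
A minimal sketch of the correlation-ranking idea in option 2, assuming each station's record is an equal-length list of values (discharge as assumed in this answer, though the same steps would apply to CFSR rainfall, temperature or humidity series). The function names and the toy data are hypothetical, and signed correlations are ranked "from least to most" as described; ranking by absolute value would be a reasonable variation.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy) if sx and sy else 0.0


def select_least_correlated(series, n_keep):
    """Walk station pairs from lowest to highest correlation and collect
    their members until n_keep stations have been selected."""
    pairs = sorted(
        (pearson(series[a], series[b]), a, b)
        for a, b in combinations(sorted(series), 2)
    )
    chosen = []
    for _, a, b in pairs:
        for name in (a, b):
            if name not in chosen and len(chosen) < n_keep:
                chosen.append(name)
        if len(chosen) >= n_keep:
            break
    return chosen


# Hypothetical toy data: A and B move together, C moves against them.
toy = {
    "A": [1.0, 2.0, 3.0, 4.0, 5.0],
    "B": [2.0, 4.0, 6.0, 8.0, 10.0],
    "C": [5.0, 3.0, 4.0, 1.0, 2.0],
}
selected = select_least_correlated(toy, 2)
print(selected)
```

With the toy data, the highly correlated pair A and B is never selected together, which is the redundancy this approach is meant to avoid.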

Answered on February 4, 2015 17:31 by Matthew Garcia, University of Wisconsin - Madison - Post-doctoral Researcher
You could try to extract representative stations based on their properties. Maybe you could perform a cluster analysis to get clusters of similar stations.
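
One possible way to do this, as a rough sketch only: summarise each station by a few properties, standardise them, run a plain k-means, and keep the station closest to each cluster centre as its representative. The choice of k-means, the three summary features, the station names and the values below are all assumptions for illustration; other clustering methods would fit the suggestion equally well.

```python
from math import dist
from statistics import mean, pstdev


def standardize(rows):
    """Scale each feature column to zero mean and unit variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [pstdev(c) or 1.0 for c in cols]  # avoid division by zero
    return [tuple((v - m) / s for v, m, s in zip(r, mus, sds)) for r in rows]


def kmeans(points, k, n_iter=50):
    """Plain k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist(p, c) for c in centers)))
    labels = []
    for _ in range(n_iter):
        labels = [min(range(k), key=lambda j: dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            new_centers.append(
                tuple(mean(c) for c in zip(*members)) if members else centers[j]
            )
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers


def representatives(names, points, labels, centers):
    """For each cluster, return the station closest to the cluster centre."""
    reps = []
    for j, c in enumerate(centers):
        members = [(dist(p, c), n) for n, p, lab in zip(names, points, labels) if lab == j]
        if members:
            reps.append(min(members)[1])
    return reps


# Hypothetical per-station summaries:
# (mean temperature in deg C, annual rainfall in mm, mean relative humidity in %)
stations = {
    "S1": (24.0, 1500.0, 80.0),
    "S2": (24.5, 1550.0, 82.0),
    "S3": (25.0, 1450.0, 78.0),
    "S4": (30.0, 600.0, 40.0),
    "S5": (31.0, 550.0, 38.0),
    "S6": (29.5, 650.0, 42.0),
}
names = list(stations)
points = standardize(list(stations.values()))
labels, centers = kmeans(points, k=2)
reps = representatives(names, points, labels, centers)
print(reps)
```

The number of clusters would then set how many stations need detailed analysis, so it can be chosen to match the time available.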

Answered on February 4, 2015 16:28 by Stefan Halbfass, Company for Applied Landscape Ecology - CEO
Elias, I have a few documents that speak to this and might assist. I would also check out some of these experts: Todd Gardner at World Research Institute and Rowan Schmidt at Earth Economics. Both of these men have done presentations regarding your questions. I don't see a method to upload documents. Send me an email. Cervantesbrenda60atgmaildotcom.

Answered on February 4, 2015 16:26 by Brenda Cervantes, Project Specialist at Lane Community College Energy Water Programs
Kindly consider the spatial distribution of the stations (carefully fitting into your resolution) and how representative each station is with respect to the surrounding area. Another factor could be the length of the time series: is it the same for all stations?
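
As a hedged illustration of these two checks, the sketch below first keeps only stations whose series share the most common record length, then picks a spatially spread-out subset by greedy farthest-point sampling. The station names, the coordinates (assumed to be projected, in km, on a 38 km grid) and the record lengths are hypothetical, and farthest-point sampling is only one of several ways to get even spatial coverage.

```python
from math import dist


def common_length_only(records):
    """Keep stations whose series length equals the most common length."""
    lengths = [len(v) for v in records.values()]
    target = max(set(lengths), key=lengths.count)
    return [name for name, v in records.items() if len(v) == target]


def spread_out(coords, n_keep):
    """Greedy farthest-point sampling: each new pick is the station
    farthest from all stations picked so far."""
    names = sorted(coords)
    chosen = [names[0]]
    while len(chosen) < min(n_keep, len(names)):
        rest = [n for n in names if n not in chosen]
        chosen.append(
            max(rest, key=lambda n: min(dist(coords[n], coords[c]) for c in chosen))
        )
    return chosen


# Hypothetical station coordinates in km on a 38 km grid.
coords = {
    "P1": (0.0, 0.0), "P2": (38.0, 0.0), "P3": (76.0, 0.0),
    "P4": (0.0, 38.0), "P5": (38.0, 38.0), "P6": (76.0, 38.0),
}
# Hypothetical records: P5 has a shorter series than the others.
records = {name: [0.0] * 5 for name in coords}
records["P5"] = [0.0] * 3

usable = common_length_only(records)
picked = spread_out({n: coords[n] for n in usable}, 3)
print(picked)
```

Here P5 is dropped for its shorter record, and the picks start from opposite corners of the grid before filling in between.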

Answered on February 4, 2015 14:48 by Stephen Siwila, THE COPPERBELT UNIVERSITY (ZAMBIA) - LECTURER- WATER SUPPLY AND ENVIRONMENTAL ENGINEERING