USGS gage data streamflow

Hello, community,

I hope you are doing well. I need to calculate streamflow and runoff from USGS gauge data. Could you please explain how runoff or water yield can be calculated for a specific gauge?

Bests,
Hemen

Hello @hkarimi89 and welcome to the forum!

Could you provide any additional information about the problem you’re experiencing? Are you trying to calibrate an InVEST model to the output of a stream gauge? Or are you trying to do something else?

Thanks!
James

Hi James,

Thank you so much for your reply. Yes, I need to calibrate with gage data.

Thanks,
Hazhir

Hi Hazhir -

It is hard to come up with detailed, specific instructions for working with gage data, since each dataset will be different in format and content, based on data source and station data availability.

I’m not exactly sure what aspect of this you’re looking for guidance on, but we recently put together a draft of general freshwater model calibration methods, which I intend to add to the InVEST User Guide but haven’t gotten around to yet. So I’ll paste them here, and will be interested to hear if they address your questions or not, and if not, what’s missing.

General calibration steps

  1. Find observed data within the watershed of interest. Usually from gauge stations.
  • Gauge station data often comes from government agencies, but may also be provided by water-related utilities like hydropower operators, or other sources.
  2. Review the observed data for required measurements, duration, and completeness.
  • Required measurements: values that correspond to the output of the model you’re calibrating. For example, if you’re calibrating SDR sediment export, the gauge data must include either sediment load values, or a combination of sediment concentration and water flow data that can be used to calculate sediment load.
  • Duration: Optimally, at least 10 years of continuous daily data, which corresponds with the time frame of the climate data you’re using as input to the InVEST model.
  • Completeness: No large gaps in data. If there is a gap in one year, but the other years’ data fill that gap, that is ok. But if most or all years are missing data for, say, a whole month or whole season, then that is unlikely to produce good results.
  • Even better if someone has already processed the observed data into monthly or annual average values. This is rare, but worth asking about.
  3. Prepare the observed data, summarizing it to a value that can be compared directly with model results.
  • This process will be different depending on the nature of the data you’re working with, and the model output that you are calibrating, so it’s hard to generalize.
  • In the end, you want to create (at least one) single value that represents average annual sediment loading, nutrient loading, or water flow at the gauge station, with units that match the model output. (For the seasonal water yield model, you could use (12) averages representing each month of the year for a gauge station, but would need to decide how to distribute the annual baseflow result.)
  4. Compare the calculated observed values with modeled results.
  • Summarize the modeled results within the watershed that drains into the point where the observed data was taken. See the following section “Delineating watersheds” for more information.
  • The modeled result is unlikely to match the observed values, and may be very different. Remember that these are simple models, and for any model (even complex ones) calibration is necessary to bring the modeled results close to reality, and have confidence in the absolute values.
  5. Do a sensitivity analysis to determine which model parameters are useful to adjust for calibration.
  • This requires doing many model runs, which is most efficiently done by scripting, so it’s easier to iterate over a range of biophysical table values, input rasters, or other parameters.
  • Vary biophysical table values (related to the land use/land cover map), as well as global model parameter values, one parameter at a time, within reasonable ranges, based on ranges reported in the literature. You can also vary spatial input layers, if you have different sources covering the area of interest that are significantly different from each other.
  • The parameters that have the greatest effect on results should be used for calibration.
  6. Once you’ve chosen the parameters that have the greatest effect, do another set of model runs that adjusts these parameters across a range of values, changing all of the parameters at the same time, such that a different set of parameter values is used for each model run.
  7. Use statistical methods to compare the results from step 6 with the observed data. Select the set of model parameters that create results that come satisfactorily close to the observed data value.
  • This can be as simple as calculating the percent error as follows:
    • percent_error = ((modeled_value - observed_value) / observed_value) * 100
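
As a rough illustration of the last two steps (running every combination of the chosen parameters, then ranking the runs by percent error), here is a minimal Python sketch. Everything in it is an assumption made for the example: `run_model` is a hypothetical stand-in for a real model run, and `param_a` / `param_b` are made-up parameter names, not InVEST arguments. A full grid is only one way to vary all parameters at once.

```python
from itertools import product


def percent_error(modeled_value, observed_value):
    """Signed percent error of a modeled value relative to the observation."""
    return (modeled_value - observed_value) / observed_value * 100.0


def run_model(params):
    # HYPOTHETICAL stand-in for one model run. In practice this would launch
    # the freshwater model with these parameter values and return the result
    # summarized within the gauge's watershed.
    return 1000.0 * params["param_a"] + 50.0 * params["param_b"]


def best_parameter_set(param_ranges, observed_value, model=run_model):
    """Try every combination of parameter values and return the closest run."""
    names = sorted(param_ranges)
    results = []
    for values in product(*(param_ranges[name] for name in names)):
        params = dict(zip(names, values))
        error = percent_error(model(params), observed_value)
        results.append((abs(error), error, params))
    # Smallest absolute percent error first.
    results.sort(key=lambda item: item[0])
    _, signed_error, params = results[0]
    return params, signed_error


# Made-up parameter ranges and observed value, for illustration only.
param_ranges = {"param_a": [0.5, 1.0, 1.5], "param_b": [1.0, 2.0, 3.0]}
best_params, best_error = best_parameter_set(param_ranges, observed_value=1100.0)
print(best_params, best_error)
```

Keeping the signed error alongside the absolute error shows whether the selected run over- or under-predicts the observed value.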

Delineating watersheds

When calibrating freshwater models with observed data, we need to delineate the watershed that flows into the point where the observed data gauge is located. Then we can summarize the relevant model result (such as sediment export) within that watershed, and compare that summary with the observed data value.

Many different tools are available to create watersheds, and you can use whichever one you’re comfortable with. InVEST includes the tool DelineateIt as a simple, effective way of creating watersheds.

Whichever tool you use, they generally require, at a minimum, a digital elevation model (DEM) raster, and a vector (like a shapefile or geopackage) containing the point location(s) to be used as outlets. In this case, the outlet will be the location of the gauge station where observed data comes from. The DEM must be the same one that is used as input to the InVEST freshwater model you’re calibrating.

After running the delineation tool, look at the resulting watershed carefully to make sure that it appears correct. One common problem is that the delineated watershed is very tiny. This is usually caused by the outlet point not being located directly on a stream created by the delineation tool. To fix this, many delineation tools have a “snap” function, where you can specify a distance around the outlet point that the tool should look for a stream, and if one is found within that distance, the tool “snaps” the point to the stream, and delineates the watershed more accurately. If the tool does not have a snap feature, you can manually move the point to lie on the stream network generated by the delineation tool.
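
To make the “snap” idea concrete, here is a small sketch of what such a function does conceptually. It is not the algorithm of DelineateIt or any other specific tool; it assumes the stream network is available as a boolean NumPy array and that the outlet is given as a pixel (row, column) position, with the snap distance measured in pixels.

```python
import numpy as np


def snap_to_stream(stream_mask, row, col, max_distance):
    """Return the nearest stream pixel within max_distance pixels, else None."""
    stream_rows, stream_cols = np.nonzero(stream_mask)
    if stream_rows.size == 0:
        return None  # no stream pixels at all
    # Straight-line distance (in pixels) from the outlet to every stream pixel.
    distances = np.hypot(stream_rows - row, stream_cols - col)
    nearest = int(np.argmin(distances))
    if distances[nearest] > max_distance:
        return None  # no stream within the snap distance; leave the point alone
    return int(stream_rows[nearest]), int(stream_cols[nearest])


# Toy example: a stream running down column 3 of a 5 x 5 grid.
stream_mask = np.zeros((5, 5), dtype=bool)
stream_mask[:, 3] = True
print(snap_to_stream(stream_mask, row=2, col=1, max_distance=3))  # (2, 3)
```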

Once the watershed is correctly generated, a GIS tool like Zonal Statistics is used to sum the relevant model result raster (such as sediment export) within the watershed. This summarized value is then compared with the observed data value. Alternatively, you can use the generated watershed as an input to the model, which will do the summarizing for you, and output a vector layer whose table contains the summarized values.
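
If you’d rather script the summary than use a GIS tool, the zonal sum itself is simple once the model result and the watershed are on the same grid. A minimal sketch, assuming both have already been read into NumPy arrays of the same shape (the reading and rasterizing steps are not shown) and that the result raster holds per-pixel values with a known nodata value:

```python
import numpy as np


def sum_within_watershed(result_raster, watershed_mask, nodata):
    """Sum the valid (non-nodata) result pixels that fall inside the watershed."""
    valid = watershed_mask & (result_raster != nodata)
    return float(result_raster[valid].sum())


# Toy 2 x 3 result raster; -1.0 marks nodata.
result = np.array([[1.0, 2.0, -1.0],
                   [3.0, 4.0, 5.0]])
# True where a pixel lies inside the delineated watershed.
mask = np.array([[True, True, True],
                 [False, True, False]])
print(sum_within_watershed(result, mask, nodata=-1.0))  # 7.0
```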


Step 3 (preparing observed data) is probably the trickiest one to advise on, since it’s where each dataset will be different, and it’s also probably the step people would like the most help with. But do let me know if this helps and what could be improved.
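
For the streamflow case specifically, here is one possible sketch of that summarizing step. It assumes the gauge record is a list of (date, mean daily discharge) pairs in cubic feet per second, which is how USGS typically reports discharge, and it averages over calendar years. The `min_days` completeness threshold is an arbitrary choice for the example, and you may prefer water years. Check which units your model output uses before comparing.

```python
from collections import defaultdict
from datetime import date, timedelta

CFS_TO_M3_PER_S = 0.0283168466  # one cubic foot expressed in cubic meters
SECONDS_PER_DAY = 86400


def mean_annual_volume_m3(daily_records, min_days=360):
    """Average annual flow volume (m^3) over years with enough daily values.

    daily_records: iterable of (datetime.date, mean daily discharge in cfs).
    min_days: assumed completeness threshold; years with fewer values are skipped.
    """
    volume_by_year = defaultdict(float)
    days_by_year = defaultdict(int)
    for day, discharge_cfs in daily_records:
        # Mean daily discharge times the seconds in a day gives that day's volume.
        volume_by_year[day.year] += discharge_cfs * CFS_TO_M3_PER_S * SECONDS_PER_DAY
        days_by_year[day.year] += 1
    complete = [volume_by_year[year] for year in volume_by_year
                if days_by_year[year] >= min_days]
    if not complete:
        raise ValueError("no year has enough daily values")
    return sum(complete) / len(complete)


def runoff_depth_mm(volume_m3, drainage_area_km2):
    """Convert a flow volume to a depth of water spread over the drainage area."""
    return volume_m3 / (drainage_area_km2 * 1e6) * 1000.0


# Made-up record: a constant 100 cfs for every day of 2021.
start = date(2021, 1, 1)
records = [(start + timedelta(days=i), 100.0) for i in range(365)]
annual_volume = mean_annual_volume_m3(records)
print(annual_volume, runoff_depth_mm(annual_volume, drainage_area_km2=250.0))
```

The volume can be compared with a modeled water yield volume for the watershed, and the depth with a modeled yield expressed in mm.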

~ Stacie


We just published a paper with a script that addresses much of what @swolny expertly and consistently narrates. It is open access, with a link to our GitHub, in Science of the Total Environment: https://www.sciencedirect.com/science/article/pii/S0048969724052616. We are working on extensions with flow imputation. The script is for NDR, but the streamflow part is needed for loadings, so it should be amenable to adjustment for water yield. The first author, Mariam Valladares, MS, is working on some other related methods. The article has steps for delineation, data retrieval, imputation, etc., and a case study in Puerto Rico. Let us know if it is useful. (And hi @swolny, @hkarimi, and James)


Re delineation: There is also a USGS tool, although it won’t work internationally.
That is ultimately what we have used, but have you done any formal comparisons? @swolny @jdouglass

Wow, @dowthut, that’s a very detailed article, which fills in a lot of the “data cleaning is hard” gaps that I noted in my response. Thanks to @Mariam for telling us about it in this post as well. It looks like a very valuable contribution to help us validate our modeling results, and I agree that the detail given should be applicable to other models and observed datasets.

Validation often is a rather complex process, as we can see with your reference to multiple R packages and other tools that can be brought in to help. So I also very much appreciate you providing your (well-commented!) R code with the paper.

As for watershed delineation, I don’t think we’ve done a formal comparison between DelineateIt and other watershed-delineation tools. My main experience has been more trial and error, with Arc Desktop’s watershed tool mostly not working, so trying ArcHydro, which was much better but buggy and complicated, trying a few others, then settling on DelineateIt a while back because it’s pretty simple and works, including on nested watersheds.

The main differences of note for me are in the stream network and resulting watershed between DEMs, more than in the tool algorithm itself, since those results can vary dramatically. And that’s another place where it’s more art than science: deciding whether the stream network/watersheds from a particular DEM are “good enough”, since they’re pretty much never exactly the same as a real-world stream map. Maybe there’s a tool out there to quantify similarity between them that would make that process more objective, but I’ve never looked for one.

~ Stacie


Hi Stacie,

Thank you so much for the invaluable information. I am sorry for the late reply; I was in the field and did not have access to a PC.

Bests,
Hazhir