I am trying to compute Annual Water Yield with the whole of Ghana as my study area. However, some of my input data have NoData areas. For instance, the SoilGrids data has NoData values for water bodies, and the WorldClim and DEM data also have some NoData values.
How can I fill these NoData areas with values? If I leave them as they are, the output will have NoData in the same places.
NB: I have tried filling the NoData areas using ArcMap map algebra, but the output has minimum and maximum values that differ from those of the input data.
I used the expression below in Raster Calculator:
Con(IsNull("raster"), FocalStatistics("raster", NbrRectangle(5, 5, "CELL"), "MEAN"), "raster")
How to fill areas of NoData depends on which input you’re working with.
In the Working with the DEM section of the User Guide, I give a few ideas for how to fill holes in the DEM. It is often the case that the resulting map will have minimum and/or maximum values that are a bit different than the input data. This also happens any time you reproject or resample a DEM. In general, this isn’t a problem, unless the difference is very large and just looks wrong.
Soil data is a bit trickier, and honestly I haven’t come up with a good way of doing it at a large scale, using the latest SoilGrids data. (It was somewhat easier when soil data was provided by polygons that had a single value across large areas). If you’re working with a relatively small area, or there aren’t many holes, it might be easy enough to assign each NoData area the dominant soil value that is around the NoData hole. I’ve also tried several GIS methods (interpolation, Nibble in ArcGIS, Close gaps in QGIS…) and they all produce results that look weird. In particular, we don’t want to assign values that have a lot of detail that we’re not at all confident in, which methods like interpolation do.
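To make the "dominant surrounding value" idea concrete, here is a minimal sketch in plain Python. It is not an ArcGIS or QGIS tool. The function name `fill_holes_with_dominant_neighbor`, the use of `None` for NoData, and the 4-neighbour definition of a hole are all assumptions made for illustration. Each connected NoData hole gets the single most common value found on the cells bordering it, so no new detail is invented inside the hole.

```python
from collections import Counter, deque

NODATA = None  # assumed stand-in for the raster's NoData value


def fill_holes_with_dominant_neighbor(grid):
    """Fill each connected NoData hole with the most common bordering value."""
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # valid cells are copied through unchanged
    seen = [[False] * cols for _ in range(rows)]
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))  # 4-connected neighbours
    for r0 in range(rows):
        for c0 in range(cols):
            if grid[r0][c0] is not NODATA or seen[r0][c0]:
                continue
            # Flood fill to collect one hole and the valid cells around it.
            hole, border = [], {}
            queue = deque([(r0, c0)])
            seen[r0][c0] = True
            while queue:
                r, c = queue.popleft()
                hole.append((r, c))
                for dr, dc in steps:
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue
                    if grid[rr][cc] is NODATA:
                        if not seen[rr][cc]:
                            seen[rr][cc] = True
                            queue.append((rr, cc))
                    else:
                        border[(rr, cc)] = grid[rr][cc]  # count each border cell once
            if border:
                # On a tie, Counter returns the value it encountered first.
                value = Counter(border.values()).most_common(1)[0][0]
                for r, c in hole:
                    out[r][c] = value
    return out


soil = [
    [2, 2, 2, 5],
    [2, NODATA, NODATA, 5],
    [2, 2, 5, 5],
]
filled = fill_holes_with_dominant_neighbor(soil)
print(filled[1])  # [2, 2, 2, 5]
```

With real rasters you would read the band into an array first and write it back afterwards. For a very large hole, such as a big lake, a single value may still be too coarse, which is the same trade-off described above.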
If you happen to have information outside of SoilGrids to help assign those values, you can do that too. For example, if you can find documentation about the soil depth beneath a large water body, you could assign that value to the entire NoData area under that water body. That’s not common, but possible.
It is also simply a limitation of our current soil datasets that they have holes. So while it is problematic for our modeling, I have also just left the holes and listed that as a limitation of the input data. That isn’t very satisfying, but it’s a real issue that is worth drawing attention to.
I’d be very interested to hear if anyone has ideas for how to fill holes in SoilGrids data across large areas, in a way that we can have some confidence in the results.
Hi @swolny, thank you for the response.
I have been able to find a way around it with this map algebra expression: Con(IsNull("raster"), FocalStatistics("raster", NbrRectangle(5, 5, "CELL"), "MAJORITY"), "raster").
So instead of "MEAN", I changed the focal statistic to "MAJORITY", and with that I have been able to fill all the NoData areas in my input dataset. It is worth pointing out that the minimum and maximum values of the original data were not altered in the final output.
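For anyone who wants to see the difference between the two statistics outside ArcMap, below is a rough sketch in plain Python that imitates the logic of the Con/IsNull/FocalStatistics expression on a tiny grid. It is not arcpy code. The function `focal_fill`, the `None` NoData marker, and the choice to leave ties as NoData are assumptions made for this sketch. It shows that a majority fill can only return values that already exist in the raster, whereas a mean fill can create new in-between values.

```python
from collections import Counter

NODATA = None  # assumed stand-in for the raster's NoData value


def focal_fill(grid, size=5, statistic="MAJORITY"):
    """Fill NoData cells from the valid cells in a size x size window."""
    half = size // 2
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]  # valid cells are copied through unchanged
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] is not NODATA:
                continue
            # Collect valid neighbours, clipping the window at the raster edge.
            window = [
                grid[rr][cc]
                for rr in range(max(0, r - half), min(rows, r + half + 1))
                for cc in range(max(0, c - half), min(cols, c + half + 1))
                if grid[rr][cc] is not NODATA
            ]
            if not window:
                continue  # hole is larger than the window, so it stays NoData
            if statistic == "MEAN":
                out[r][c] = sum(window) / len(window)
            else:
                counts = Counter(window).most_common(2)
                # In this sketch a tie between the top two values stays NoData.
                if len(counts) == 1 or counts[0][1] > counts[1][1]:
                    out[r][c] = counts[0][0]
    return out


soil = [
    [3, 3, 3, 7],
    [3, NODATA, 3, 7],
    [3, 3, 7, 7],
]
by_majority = focal_fill(soil, size=3, statistic="MAJORITY")
by_mean = focal_fill(soil, size=3, statistic="MEAN")
print(by_majority[1][1])  # 3
print(by_mean[1][1])      # 3.5
```

Note that any hole wider than the window keeps some NoData cells in this sketch, so for large water bodies the window size would need to grow or the fill would need to be repeated.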