NDR ValueError: Could not open masked dem as a gdal.OF_RASTER

I’ve previously run NDR on a different geographic area and got sensible results, but I’ve got stuck trying to run it on a new area. I was running InVEST 3.12.1 and, as I recall, was getting “ERROR 1”. I’m now trying version 3.14.0. It still fails, albeit much faster and with a more informative but still pretty unhelpful error:

A taskgraph _task_executor failed on Task fill pits (5). Terminating taskgraph.
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "src/pygeoprocessing/routing/routing.pyx", line 727, in pygeoprocessing.routing.routing.fill_pits
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/blahpath/local_data/intermediate_outputs/masked_dem_test_2016.tif as a gdal.OF_RASTER
"""
The above exception was the direct cause of the following exception:

It then seems to repeat the same exception (for the same dem file) before then complaining similarly (twice) about not being able to open the masked LULC TIF and the masked runoff proxy TIF. Looking in the intermediate_outputs directory, it’s no great surprise it can’t open the files because they’re not there.

This is pretty much all the info I have. No .txt log file is created. I’m not creating these masked rasters myself, so I don’t know why they’re not being created. If their absence is the root cause, how do I find out why? All files should be using CRS EPSG:27700. I can read all my input vector and raster data using either geopandas or xarray. How do I find out what’s breaking here? I’ve previously run the DEM file through RouteDEM okay.

Thanks for the info on this @gtmaskall !

The fact that there’s no logfile makes me wonder if there’s some kind of permissions issue with the location of your workspace. I realize that there is no logfile, but could you copy-paste the complete model logging here so we can take a look? There should be some additional debugging information in there that might point us in the right direction.
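When the model is called through `ndr.execute` from a notebook rather than from the InVEST application, one way to capture that logging is to attach a file handler to Python's root logger before the call. This is only a minimal sketch, assuming InVEST reports through the standard `logging` module; `attach_file_log` and the `ndr_debug.log` filename are hypothetical names, not part of InVEST.

```python
import logging


def attach_file_log(log_path, level=logging.DEBUG):
    """Send all log records at `level` or above to `log_path` (hypothetical helper)."""
    handler = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(
        "%(asctime)s %(name)s %(levelname)s %(message)s"))
    root = logging.getLogger()
    root.setLevel(level)  # let records below WARNING reach the handler
    root.addHandler(handler)
    return handler


# Usage (assumed filename): attach before calling ndr.execute(...), then
# detach afterwards so repeated runs do not write duplicate lines.
# handler = attach_file_log("ndr_debug.log")
# ... run the model ...
# logging.getLogger().removeHandler(handler)
# handler.close()
```

The resulting file can then be pasted into the thread in place of the console output.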

Thanks,
James

You mean this?

A taskgraph _task_executor failed on Task fill pits (5). Terminating taskgraph.
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "src/pygeoprocessing/routing/routing.pyx", line 727, in pygeoprocessing.routing.routing.fill_pits
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif as a gdal.OF_RASTER
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 461, in _task_executor
    task._call()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 1090, in _call
    payload = result.get()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "src/pygeoprocessing/routing/routing.pyx", line 727, in pygeoprocessing.routing.routing.fill_pits
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif as a gdal.OF_RASTER
A taskgraph _task_executor failed on Task n load (20). Terminating taskgraph.
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/natcap/invest/ndr/ndr.py", line 1185, in _calculate_load
    lulc_raster_info = pygeoprocessing.get_raster_info(lulc_raster_path)
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_lulc_test_2016.tif as a gdal.OF_RASTER
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 461, in _task_executor
    task._call()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 1090, in _call
    payload = result.get()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/natcap/invest/ndr/ndr.py", line 1185, in _calculate_load
    lulc_raster_info = pygeoprocessing.get_raster_info(lulc_raster_path)
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_lulc_test_2016.tif as a gdal.OF_RASTER
A taskgraph _task_executor failed on Task runoff proxy mean (11). Terminating taskgraph.
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/natcap/invest/ndr/ndr.py", line 1135, in _normalize_raster
    base_nodata = pygeoprocessing.get_raster_info(
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_runoff_proxy_test_2016.tif as a gdal.OF_RASTER
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 461, in _task_executor
    task._call()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 1090, in _call
    payload = result.get()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/natcap/invest/ndr/ndr.py", line 1135, in _normalize_raster
    base_nodata = pygeoprocessing.get_raster_info(
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_runoff_proxy_test_2016.tif as a gdal.OF_RASTER
Exception raised when joining task Task object 140683156568576:

{'exception_object': ValueError('Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif as a gdal.OF_RASTER'),
 'ignore_directories': True,
 'ignore_path_list': [],
 'priority': 0,
 'self._reexecution_info': {'args_clean': [('/home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif',
                                            1),
                                           'in_target_path_list'],
                            'file_stat_list': [],
                            'func_name': 'fill_pits',
                            'kwargs_clean': {'working_dir': '/home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs'},
                            'other_arguments': [[('/home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif',
                                                  1),
                                                 '/home/guy/projects/Agreed/data_science/research/water/Anglian/in_target_path_list'],
                                                {'working_dir': '/home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs'}],
                            'source_code_hash': 'da39a3ee5e6b4b0d3255bfef95601890afd80709'},
 'self._result': None,
 'target_path_list': ['/home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/filled_dem_test_2016.tif'],
 'task_id_hash': '0f5cad9b2c6cd1a40adc529477a43d8152332f76',
 'task_name': 'fill pits (5)',
 'task_reexecution_hash': '68b62f287b197ddeffcd7c81a1c5e65b07999ecc'}. It's possible that this task did not cause the exception, rather another exception terminated the task_graph. Check the log to see if there are other exceptions.
multiprocessing.pool.RemoteTraceback: 
"""
Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "src/pygeoprocessing/routing/routing.pyx", line 727, in pygeoprocessing.routing.routing.fill_pits
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif as a gdal.OF_RASTER
"""

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 782, in join
    timedout = not task.join(timeout)
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 1272, in join
    raise self.exception_object
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 461, in _task_executor
    task._call()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 1090, in _call
    payload = result.get()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 774, in get
    raise self._value
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/multiprocessing/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "src/pygeoprocessing/routing/routing.pyx", line 727, in pygeoprocessing.routing.routing.fill_pits
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1870, in get_raster_info
    raise ValueError(
ValueError: Could not open /home/guy/projects/Agreed/data_science/research/water/Anglian/local_data/intermediate_outputs/masked_dem_test_2016.tif as a gdal.OF_RASTER

I’d be surprised if it’s a permissions thing. The software does create aligned_dem_test_2016.tif, aligned_lulc_test_2016.tif, and aligned_runoff_proxy_test_2016.tif in the intermediate_outputs directory.

If it’s helpful, broadly the workflow has been to start with a 30 m DEM that I’ve already successfully used with RouteDEM. This is in EPSG:27700. The runoff proxy is basically a raster of annual rainfall on a 1 km grid that I’ve interpolated onto the DEM coordinates using xarray’s interp_like. My lucode raster was created by putting a GeoDataFrame of LUCODEs through make_geocube and using its like= arg to align with the DEM.

None of this is shockingly different to what I’ve done before, except the data’s different and there’s always a small chance it’s a slightly different python environment. I’ve tried to rule out being lucky before with nodata and dtypes; e.g. before, on the data/area that worked, I had a fill value of 9999 in my lucode raster where there wouldn’t have been a 9999 in the biophysical params table, and I don’t think it was flagged as the nodata value, so I suspect I was lucky that there weren’t any 9999 values left after NDR masked outside of the watershed.

In case it sheds any light, here’s my call to ndr.execute:

for y in range(2016, 2023):
    lucode_raster = f"./local_data/colne_lucode_raster_{y}.tiff"
    ndr.execute({
        'workspace_dir': './local_data/',
        'dem_path': './local_data/filled_colne_dem.tiff',
        'lulc_path': lucode_raster,
        'runoff_proxy_path': f"./local_data/colne_catch_total_rainfall_mm_{y}.tiff",
        'watersheds_path': '/home/guy/data/Agreed/Anglian_Water/Agreed data/Subcatchments_exc_Salary.gpkg',
        'biophysical_table_path': './local_data/colne_biophysical_v1.csv',
        'calc_p': False,
        'calc_n': True,
        'results_suffix': f"test_{y}",
        'threshold_flow_accumulation': 1000,
        'k_param': 2,
        'subsurface_critical_length_n': 5,
        'subsurface_eff_n': 0.5,
        'n_workers': 10,
    })
    break

Running just that cell, so keeping input data constant, in my conda environment that has InVEST 3.12.1, after 3m 23.5s, that cell dies with:
ERROR 1: ./local_data/intermediate_outputs/ic_factor_test_2016.tif, band 1: Failed to compute statistics, no valid pixels found in sampling.

This does create a lot more of the usual intermediate files in intermediate_outputs.
If I switch to my conda environment with InVEST 3.14.0, it dies in 0.6 s with the ValueErrors I pasted above and that finally brought me here. That is, in VSCode, select the first env, import ndr, run the cell, select the env containing the newer version of InVEST, import ndr, run that same cell again, get the two different errors. I’m aware of the potential for contaminating one run with the detritus of a previous run using the same suffix, but scrupulously clearing any outputs hasn’t helped.

Hi @gtmaskall,

Thanks for all that detailed information, it is indeed helpful. Would it be possible to share a set of your inputs that you know is failing so we can try and reproduce it on our end? I think just the inputs that are going into the ndr.execute function are fine for now. So, one year of the LULC, runoff proxy, and then the other static spatial inputs and table. If it’s possible to share this over Google Drive or another cloud service that’d be great, otherwise email works too. My email for either case is ddenu@stanford.edu.

Cheers,

Doug

Hi @gtmaskall,

Another thought that just occurred to me, that’d be worth trying. Could you set n_workers to -1 and create a different workspace_dir for each run? I realize you have a changing results_suffix but am curious if there might be an issue there.
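A minimal sketch of that suggestion, reusing the argument names from the `ndr.execute` call posted earlier in this thread. The helper `isolate_run` and the `run_{year}` directory naming are hypothetical; as I understand taskgraph, `n_workers = -1` runs the tasks synchronously in the calling process rather than in a worker pool.

```python
from pathlib import Path


def isolate_run(base_args, year):
    """Return a copy of the args with its own workspace and no worker pool.

    Hypothetical helper: `base_args` is the dict normally passed to
    ndr.execute; the original dict is left unchanged.
    """
    args = dict(base_args)
    # One workspace per run, so runs cannot pick up each other's files.
    args["workspace_dir"] = str(Path(base_args["workspace_dir"]) / f"run_{year}")
    # -1 asks for synchronous execution (assumption stated above).
    args["n_workers"] = -1
    args["results_suffix"] = f"test_{year}"
    return args


# Usage (not executed here):
# from natcap.invest.ndr import ndr
# ndr.execute(isolate_run(base_args, 2016))
```

Running synchronously also tends to give a single, shorter traceback, because the error is not re-raised through the multiprocessing pool.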

Cheers,

Doug

Hmm, hi Doug, thanks for all this. In trying to make sure I had all my ducks in a row on this I’ve been double checking the generation of input data. I’m still working on that (on holiday at the moment), but I did just try your above suggestion.

I specified a new workspace_dir and set n_workers to -1 instead of 10 and got a new error:

Warning 1: the input vector layer has a SRS, but the source raster dataset does not.
Cutline results may be incorrect.

Using a different workspace_dir each time but the same input data, I can reliably alternate between this error and my previous

ValueError: Could not open ./local_data_4/intermediate_outputs/aligned_runoff_proxy_test_2016.tif as a gdal.OF_RASTER

family of errors.

I admit I had to look up ‘SRS’. Basically a synonym for CRS? Is this significant? This run did generate output export files, but all values seemed to be nan (not sure whether that’s an issue with any of my inputs or comes under ‘Cutline results may be incorrect’). Unfortunately, the above (new) error doesn’t specify which files, but is this something you were suspecting? And is there mileage in me ensuring I set a CRS using rioxarray’s write_crs method before saving any raster files I create?

Regards,
Guy

Hi @gtmaskall,

Thanks for that information. Yes, all of your inputs need to be projected and share the same projection, so you’ll want to define the Spatial Reference System for any input that lacks one. I’m not overly familiar with rioxarray, but it’s worth pulling your inputs into ArcGIS or QGIS and making sure they all have a defined projected SRS. And yes, SRS and CRS are synonymous.
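For a scripted version of that check, something like the following might do. It is a sketch only: `rasters_missing_crs` and `read_projections` are hypothetical helper names, the GDAL Python bindings (`osgeo`) are assumed to be installed, and it only flags rasters with no projection at all. It does not verify that the projections match or that they are projected rather than geographic.

```python
def rasters_missing_crs(projection_by_path):
    """Return the paths whose projection WKT is empty or None."""
    return sorted(
        path for path, wkt in projection_by_path.items()
        if not (wkt or "").strip())


def read_projections(raster_paths):
    """Read each raster's projection WKT with GDAL (needs the osgeo package)."""
    from osgeo import gdal  # lazy import keeps the check above dependency-free

    projections = {}
    for path in raster_paths:
        dataset = gdal.OpenEx(path, gdal.OF_RASTER)
        if dataset is None:
            raise ValueError(f"Could not open {path} as a raster")
        # GetProjection() returns an empty string when no SRS is defined.
        projections[path] = dataset.GetProjection()
        dataset = None  # release the file handle
    return projections


# Usage (not executed here; paths taken from the ndr.execute call above):
# paths = ["./local_data/filled_colne_dem.tiff",
#          "./local_data/colne_lucode_raster_2016.tiff",
#          "./local_data/colne_catch_total_rainfall_mm_2016.tiff"]
# print(rasters_missing_crs(read_projections(paths)))
```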

If you keep getting stuck, feel free to share inputs from a failed run and we can maybe diagnose a bit quicker.

I’ll throw in a selfish plug here: we also have a geoprocessing library called PyGeoprocessing, which we use internally for InVEST, that might be worth checking out, seeing as you’ve been using other Python utilities. If you do end up giving it a try, we’d love whatever feedback you might have!

Cheers,

Doug

Thanks @dcdenu4 ,

I think the CRS was the main issue, so I’ll mark this as solved for the purposes of this post’s title. (I also got a KeyError 13 in pygeoprocessing at one point, and can replicate it, but I’ll pop that in another post.)

My runoff proxy raster wasn’t saved with a CRS specified. I resaved it having set the CRS and this single change resolved the failure/error that sparked this post. I now get all nan in my export output, but that’s for another post!
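A minimal sketch of that resave, assuming rioxarray is installed and that EPSG:27700 really is the CRS the pixel coordinates are already in. The helper names and the `_crs` filename tag are hypothetical.

```python
from pathlib import Path


def crs_output_path(raster_path, tag="_crs"):
    """Build a sibling output path, e.g. rain.tiff -> rain_crs.tiff (hypothetical naming)."""
    path = Path(raster_path)
    return path.with_name(f"{path.stem}{tag}{path.suffix}")


def resave_with_crs(raster_path, crs="EPSG:27700"):
    """Write a copy of the raster with `crs` assigned if none is set."""
    import rioxarray  # lazy import; needs rioxarray installed

    data = rioxarray.open_rasterio(raster_path)
    if data.rio.crs is None:
        # write_crs only labels the data; it does not reproject the pixels.
        data = data.rio.write_crs(crs)
    out_path = crs_output_path(raster_path)
    data.rio.to_raster(out_path)
    return out_path


# Usage (not executed here):
# resave_with_crs("./local_data/colne_catch_total_rainfall_mm_2016.tiff")
```

Writing to a new file rather than overwriting keeps the original around for comparison.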
