KeyError: 13 in pygeoprocessing

I’m not sure whether pygeoprocessing is installed by default, or whether I installed it in an attempt to resolve the issue reported in another post, but does it not support int64 datatypes?

I admit in the past, when first playing with NDR, I left my lucode raster as floats and it worked. Now, when trying to correctly set it as int, it seems to work if I set lucode to be int32, but fails if it’s int64, with:

Something went wrong when adding task align rasters (1), terminating taskgraph.
Traceback (most recent call last):
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 674, in add_task
    new_task._call()
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/taskgraph/Task.py", line 1093, in _call
    payload = self._func(*self._args, **self._kwargs)
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 863, in align_and_resize_raster_stack
    raster_info_list = [
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 864, in <listcomp>
    get_raster_info(path) for path in base_raster_path_list]
  File "/home/guy/anaconda3/envs/invest_geo_202309/lib/python3.10/site-packages/pygeoprocessing/geoprocessing.py", line 1920, in get_raster_info
    _GDAL_TYPE_TO_NUMPY_LOOKUP[band_datatype])
KeyError: 13

This suggests the problem here is in pygeoprocessing and, internally, it has int64 as band_datatype 13? I’m curious, because the documentation for GDAL states that it supports int64 as of version 3.5. My environment has 3.6.4 installed. The version of pygeoprocessing in this environment is 2.4.0. I note there was a 2.4.1 release a few weeks ago, but the release notes on GitHub don’t say anything about int64 support.
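To illustrate the failure mode I suspect, here is a minimal pure-Python sketch. The numeric codes are GDAL's GDALDataType enum values (13 is GDT_Int64, added in GDAL 3.5). The dictionary `PRE_3_5_LOOKUP` and the function `lookup` are hypothetical stand-ins, not pygeoprocessing's actual table or API; they only show how a lookup table written before Int64 existed would raise this exact error.

```python
# GDAL's GDALDataType enum values; UInt64 (12) and Int64 (13) arrived in GDAL 3.5.
GDAL_TYPE_NAMES = {
    1: "Byte", 2: "UInt16", 3: "Int16", 4: "UInt32", 5: "Int32",
    6: "Float32", 7: "Float64", 12: "UInt64", 13: "Int64",
}

# Hypothetical stand-in for a lookup table that predates the 64-bit types.
PRE_3_5_LOOKUP = {
    code: name for code, name in GDAL_TYPE_NAMES.items() if code < 12
}


def lookup(band_datatype):
    """Return the type name for a GDAL type code, or raise KeyError."""
    return PRE_3_5_LOOKUP[band_datatype]


try:
    lookup(13)  # the code GDAL reports for an int64 band
    message = "no error"
except KeyError as err:
    message = f"KeyError: {err}"

print(message)
```

An int32 band (code 5) resolves normally in this sketch, which matches what I see when I save lucode as int32.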

I ask this as well as tagging it with ndr because although ndr.execute runs with my lucode as int32, I get all nan output for n export. So whilst setting int64 clearly produces an error, I may (also) be doing something (else) wrong.

@gtmaskall,
Thanks for reporting this. pygeoprocessing does not yet support int64; we should have added it when GDAL 3.5 came out. However, demand for int64 is very low because it is almost never needed with real-world data. int32 can store integer values up to 2^31 - 1 (2,147,483,647), so if you have fewer than about 2 billion different land cover codes, int64 is unnecessary and actually detrimental because it consumes twice the memory per pixel.

I would recommend you stick with using int32 (or even int16 if you want to save more memory) and then address the issue of nan output, which is probably unrelated.
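As a rough way to pick the type, the sketch below finds the smallest signed integer type whose range covers a set of land cover codes. It is plain Python with no GDAL dependency; `smallest_signed_type` and the sample `lucodes` list are hypothetical names and values for illustration only. If your raster has a nodata value, include it in the list, since it has to fit in the chosen type too.

```python
# Inclusive value ranges of the signed integer types discussed above.
SIGNED_RANGES = [
    ("int16", -2**15, 2**15 - 1),
    ("int32", -2**31, 2**31 - 1),
    ("int64", -2**63, 2**63 - 1),
]


def smallest_signed_type(codes):
    """Return the name of the narrowest signed type that holds every code."""
    lowest, highest = min(codes), max(codes)
    for name, type_min, type_max in SIGNED_RANGES:
        if type_min <= lowest and highest <= type_max:
            return name
    raise ValueError("codes do not fit in a 64-bit signed integer")


# Hypothetical land cover codes, including a nodata value of -1.
lucodes = [-1, 1, 2, 11, 21, 255]
print(smallest_signed_type(lucodes))
```

For typical land cover tables with a few hundred classes this reports int16, so int32 already leaves a very wide margin.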


Yeah, I agree with all that. I just wanted to check that was the actual ‘issue’ here, given int64 is seemingly supported by GDAL. Thanks for the prompt reply.

