Crop Production: there appears to be an issue with the input data

Hello all

I am getting very low observed production values from the Crop Production model. So far I have only used it for maize. The highest value in the maize_observed_production.tif output is just 0.3 t/ha, which is really low, even for a country like Zimbabwe where agriculture has not reached its full potential. Over much of the country, observed production comes out at 0.1 t/ha or less.

The problem seems to lie with the default input data, specifically maize_yield_map.tif. If you open that raster, you will see that the maximum value for maize yield globally is just 1.73 t/ha, yet the average maize yield exceeds this for most countries. Strangely, I am also seeing higher values in millet_yield_map.tif for the same areas of Zimbabwe, which is surprising since maize is by far the dominant crop. Again, this suggests an issue with the input observed yield data.

When I add the maize yield per ha raster downloaded directly from the Monfreda dataset (Harvested Area and Yield for 175 Crops, EarthStat), I see that production per ha is much higher. When I run zonal statistics at national scale, it is also much closer to reported values of maize production in Zimbabwe than what I get from the observed production raster produced by InVEST.
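For anyone who wants to reproduce the comparison without opening a GIS, here is a minimal sketch using the GDAL Python bindings (assuming they are installed). The helper name `raster_min_max` is my own, and any file paths passed to it are placeholders for wherever the two rasters live on your machine.

```python
try:
    from osgeo import gdal  # GDAL Python bindings (assumed to be installed)
except ImportError:
    gdal = None


def raster_min_max(path, band_index=1):
    """Return (min, max) of one raster band, scanning every pixel.

    Hypothetical helper for illustration; not part of InVEST.
    """
    if gdal is None:
        raise RuntimeError("The GDAL Python bindings are required")
    dataset = gdal.Open(path)
    if dataset is None:
        raise FileNotFoundError(f"GDAL could not open {path}")
    band = dataset.GetRasterBand(band_index)
    # 0 = do not accept an approximate answer; read the full band.
    # Pixels equal to the band's nodata value (if one is set) are skipped.
    minimum, maximum = band.ComputeRasterMinMax(0)
    dataset = None  # release the file handle
    return minimum, maximum
```

Calling it once on the sample-data maize_yield_map.tif and once on the raster downloaded from EarthStat should show whether the two maxima differ as described above. Note that a raster with no nodata value set will have every pixel included in the result.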

I see another user had a similar problem with low observed maize yields some weeks back. I wonder if it could be the same underlying issue?


UPDATE: Just ran the model for millet and the observed production output was indeed much greater than it was for maize! This shouldn’t be the case as maize is by far the dominant crop. Could someone have a look at the maize_yield_map.tif in the sample dataset and see if the values seem off?

Hey @lukezw,

Thanks for posting to the forums and doing some research into the values you think are off in the data. We’ll take a look at this soon and report back here!



I am not sure if anyone has had any luck following up on this, but I have further evidence that the issue seems to be with the input data for observed maize yields.

In my screen clip of the layer legend in Arc, “XXX_YieldPerHectare.tif” is the yield data obtained directly from the Monfreda dataset, while “XXX_yield_map.tif” is the observed yield data from the InVEST sample dataset. These values should be the same, as they represent the same data.

You will see that the raster values are the same for the millet layers, which explains why the value I am getting seems reasonable for millet. However, you can see that there is a major difference between the Monfreda layer and the observed yield raster in the InVEST sample data for maize.


I have tried to get round this by adding the Monfreda raster to the InVEST sample data directory to replace the erroneous raster in the observed yield folder, and renaming it to match the naming convention followed by the observed yield rasters. Unfortunately, when I do this the model produces an error and does not run to completion. I have attached the log file; the error shown is TypeError: unsupported operand type(s) for &: 'slice' and 'bool'.

So I guess there are two parts to this post: i) confirming that there seems to be an issue with the input data for maize observed yields, and ii) is there any way round the error I get when I try to use the raster I obtained directly from the Monfreda dataset?
InVEST-Crop-Production-Percentile-log-2022-04-14–16_17_11.txt (14.3 KB)

Hi @lukezw,

The error is happening because the rasters downloaded directly from EarthStat do not have a nodata value set. The model shouldn’t care about that; it’s a small bug that I have targeted to fix in the next release. In the meantime, you can get around it by setting a nodata value for the raster. Here are instructions for how to do that in QGIS. I recommend using a nodata value of -1 so it won’t conflict with the real data.
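If you would rather script it than click through QGIS, something along these lines should work with the GDAL Python bindings. This is only a sketch: `set_nodata` is a made-up helper name, the path you pass in is whatever your copy of the EarthStat raster is called, and it edits the file in place, so run it on a copy.

```python
try:
    from osgeo import gdal  # GDAL Python bindings (assumed to be installed)
except ImportError:
    gdal = None


def set_nodata(path, nodata=-1.0, band_index=1):
    """Tag a raster band with a nodata value, editing the file in place.

    Hypothetical helper for illustration. Only the band metadata changes;
    pixel values are left untouched. Returns the previous nodata value
    (None if there was none).
    """
    if gdal is None:
        raise RuntimeError("The GDAL Python bindings are required")
    dataset = gdal.Open(path, gdal.GA_Update)  # open for writing
    if dataset is None:
        raise FileNotFoundError(f"GDAL could not open {path} for update")
    band = dataset.GetRasterBand(band_index)
    previous = band.GetNoDataValue()
    band.SetNoDataValue(nodata)
    dataset.FlushCache()  # write the change to disk
    dataset = None        # close the file
    return previous
```

The default of -1 follows the recommendation above, since yields cannot be negative. After running it, reopening the raster and calling `GetNoDataValue()` on the band should return -1.0.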

Hope that helps, and thanks for reporting these issues!
