Using processed/intermediate data from previous model runs for the SDR model

Hello everyone

I am running the SDR model over a large area which is taking around 2 hours per model run.

A large part of the processing time is taken up by the warping and alignment of layers, calculation of flow direction and accumulation etc.

The only thing I am changing between model runs is the biophysical table to improve alignment with measured sediment loads. To save time, I wanted to know if the intermediate outputs from previous model runs (e.g. aligned raster layers, filled DEM, flow routings) can be used in re-runs, to avoid re-generating these every time the model is run?

Hello @lukezw ,

Yes, the model should reuse any previously computed results it can. If you modify only the biophysical table, the model should not re-align or re-route the large spatial data; it should pick up with just those tasks that involve the biophysical table.

One thing to keep in mind, though: the safest way to save as much time as possible across many model runs is to:

  1. Always use the same workspace,
  2. Do not modify the input spatial data between runs, and
  3. Do not change the suffix between runs.

If you need access to outputs across multiple successive runs, you’ll probably want to copy that output to somewhere outside of the workspace for later reference.
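As a rough illustration of that workflow, here is a minimal Python sketch that runs one calibration iteration and then copies the results out of the workspace. Several things here are assumptions rather than anything confirmed in this thread: `run_model` is a hypothetical stand-in for the model's Python entry point, the args keys (`workspace_dir`, `biophysical_table_path`) are assumed to match what the model expects, and the final outputs are assumed to be the files at the top level of the workspace, with intermediate data in subdirectories.

```python
import shutil
from pathlib import Path


def run_and_archive(run_model, base_args, table_path, archive_root, label):
    """Run one iteration, then copy its outputs out of the workspace.

    run_model:    callable taking an args dict (hypothetical stand-in for
                  the model's Python entry point).
    base_args:    args shared by every run; workspace and suffix stay fixed.
    table_path:   biophysical table for this iteration (the only change).
    archive_root: folder outside the workspace that collects the copies.
    label:        name for this iteration's archive folder.
    """
    # Copy the shared args so the caller's dict is never modified.
    args = dict(base_args)
    args["biophysical_table_path"] = str(table_path)  # assumed key name
    run_model(args)

    workspace = Path(args["workspace_dir"])  # assumed key name
    destination = Path(archive_root) / label
    # Fail loudly if the label was already used, so nothing is overwritten.
    destination.mkdir(parents=True, exist_ok=False)

    # Copy only top-level files. Subdirectories (assumed to hold the
    # intermediate data) stay in place for the next run to reuse.
    for item in sorted(workspace.iterdir()):
        if item.is_file():
            shutil.copy2(item, destination / item.name)

    # Keep the table next to the results it produced.
    shutil.copy2(table_path, destination / Path(table_path).name)
    return destination
```

Because the workspace and suffix never change, each call overwrites the previous results in the workspace, and the labelled archive folder is what records which table produced which outputs.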

Let us know if you have any questions!
James


@jdouglass I’d like to flag (1) and (3) as being incompatible in practice. If we are using the same Workspace, then we must change the Suffix, else we overwrite our results, which you alluded to with the note to copy the output elsewhere between runs. I can understand why it needs the same Workspace to avoid re-processing overhead, but why would a change in Suffix cause InVEST to re-run everything? Can that be fixed?

~ Stacie


I completely understand, and I’m sorry for the constraint!

As for whether it can be fixed, this is something that is still on our list and it is something we would very much like to fix.

As for why the suffix causes an issue, it’s surprisingly difficult to write software that treats two files with different filenames (different suffixes) as the same file, and to do so in such a way that we use file A in place of file B in a set of tasks that executes in a nondeterministic order. So yes, I’m sure this could be fixed, but doing so will require a major rework of the underlying task graph.
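To make the problem concrete, here is a toy Python model of a cache that identifies each task by its input paths and its target path. This is purely an illustrative assumption and not the actual task graph implementation; the class, function, and file names are all hypothetical. It shows how a suffix change alters every target filename, so the first task misses the cache and every downstream task misses too, because its input is the upstream task's renamed file.

```python
class ToyCache:
    """Toy cache: a task counts as done only if the same inputs and the
    same target path have been seen before (an illustrative assumption)."""

    def __init__(self):
        self._done = set()

    def run(self, inputs, target):
        key = (tuple(inputs), target)
        if key in self._done:
            return "reused"
        # A real task would do the expensive computation here.
        self._done.add(key)
        return "computed"


def run_pipeline(cache, suffix):
    """Two chained tasks whose target filenames embed the suffix."""
    filled = f"workspace/filled_dem_{suffix}.tif"   # hypothetical name
    flow = f"workspace/flow_dir_{suffix}.tif"       # hypothetical name
    return [
        cache.run(["dem.tif"], filled),
        # The downstream task takes the upstream target as its input,
        # so a renamed upstream file changes this task's key as well.
        cache.run([filled], flow),
    ]


cache = ToyCache()
first = run_pipeline(cache, "a")    # nothing cached yet
second = run_pipeline(cache, "a")   # same suffix: both tasks are reused
third = run_pipeline(cache, "b")    # new suffix: both tasks run again
```

In this toy, reusing work across suffixes would mean teaching the cache that `filled_dem_a.tif` and `filled_dem_b.tif` are interchangeable, which is the part described above as difficult.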


Thanks both for the feedback on this.
I am indeed able to get the model to pick up the previous intermediate outputs when using the same workspace, spatial inputs, and suffix, as pointed out by James.

Obviously this approach requires some care to keep track of which files came from which run when copying them out while keeping the suffix identical between runs, but at least it is doable. If it ever becomes possible, it would be great to have the option to change the suffix and still reuse previous intermediate data when changing biophysical parameters, as that would make it easier to keep track of the adjustments being made. But I understand the difficulty of adding this at the moment!
