Exploring the possibility of running the SWY model at a monthly scale

Dear all,
I am trying to understand whether the SWY model can be run with an aggregation time scale other than one year.
As I am running the SWY model from an IDE, I can access the source code and, if I understand correctly, there is no clear reason why the year has to be the aggregation time scale. Is that right?
My problem is that I only have a few months of data available (let’s suppose I have only 100 days of P and ETP data), and I would like to calculate all the variables (QF, ETP and baseflow) for a basin at a monthly scale using only these data.
Do you think that would be possible?
Which parts of the code would have to be changed?

I can see the variable N_MONTHS, which I can easily change (to 3 in my case). My question is: at that point, is it enough to provide the data for the 3 months separately, as is done for 12 months, or should I use daily data as input?
In any case, do you think the source code could also be made to create aggregated baseflow maps for each month, so that the discharge (total, average, …) can be calculated per month? I saw some other posts about extracting monthly baseflow from the yearly data, but I would prefer to have the software create the maps directly.

Any help is appreciated…

Thanks in advance


Hi @silli -

The SWY model does provide QF results only at the monthly scale (in the intermediate output folder). (I’m just now realizing that since PET is calculated monthly, it would be nice if the model did output monthly PET rasters.) However, I believe that the rest of the equations, starting with local recharge, which is then used for baseflow, are done on an annual time scale.

I’ll let the software team address your coding questions, but one thing I do know is that we do not generally recommend using the baseflow results as absolute values, but as an index. So technically, you would not want to add baseflow to quickflow to get “total” flow. At least part of this is because the simplicity of this model does not allow us to know when baseflow will enter a stream. It could be within one month, or two, or 6, but we make the assumption that it makes it to the stream within a year, thus the annual time scale.

~ Stacie


Thank you @swolny for your reply.

As far as I can tell, the most important variables for the water budget are calculated at a monthly scale or, more generally, at the scale of the input data.
Only the calculation of the local recharge Li is made on a yearly basis, by aggregating (simply summing) the monthly contributions (as raster maps) without taking any other parameter into account.

This morning I dug a little deeper into the source code and found that the function calculate_local_recharge in seasonal_water_yield_core.pyx contains a for loop

for m_index in range(12)

where all the values of the different monthly maps are summed, and only after this, the evaluation of the local recharge starts.
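The summation described above can be sketched in plain Python with the loop bound taken from a constant instead of a literal 12. This is only an illustration and not the Cython code from seasonal_water_yield_core.pyx: the function name sum_monthly_rasters, the flat lists standing in for rasters, and the sample values are all assumptions.

```python
N_MONTHS = 3  # assumed run length for this sketch; the model normally uses 12


def sum_monthly_rasters(monthly_rasters, n_months=N_MONTHS):
    """Sum per-pixel values over n_months rasters (flat lists of floats)."""
    if len(monthly_rasters) != n_months:
        raise ValueError(
            f"expected {n_months} monthly rasters, got {len(monthly_rasters)}")
    total = [0.0] * len(monthly_rasters[0])
    # The loop bound comes from n_months instead of a hard-coded 12.
    for m_index in range(n_months):
        for pixel, value in enumerate(monthly_rasters[m_index]):
            total[pixel] += value
    return total


# Three months of made-up precipitation for a two-pixel "raster".
period_sum = sum_monthly_rasters([[10.0, 0.0], [20.0, 5.0], [30.0, 5.0]])
```

The length check is there so that a mismatch between the number of inputs and the constant fails loudly rather than silently summing the wrong number of months.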

I have three questions now:

  1. Is the fixed value of 12 intended to be N_MONTHS? Is there a specific reason for hard-coding this value instead of using the variable?
  2. Since all the variables are available at a monthly scale, it would be possible to skip this aggregation and evaluate Li at a monthly scale, right? This rests on the assumption that the way the yearly budget is evaluated is also applicable at a monthly scale…
  3. After this monthly evaluation of the water budget, it would also be possible to aggregate all the variables to a yearly or seasonal (3-month) time scale. Does this make sense to you, or am I missing something?

I read somewhere in the documentation (I cannot find it now, sorry) that there should be a scientific publication on the SWY theory in addition to the InVEST User’s Guide. Is this publication available?

In any case, I agree with you that if the calculations are being done anyway, it would be nice to save as many intermediate outputs as possible. So +1 to storing the monthly PET together with the QF.

Thanks again and sorry for bothering you!


Hi all,
is there someone who can help me with the source code?

Thanks in advance


Hi @silli , sorry for the delay, thanks for posting again.

Good catch. Probably this should use the same variable.

I can’t really speak to whether or not this is a good idea, or whether the results are valid, but I just tried changing the N_MONTHS constant to 4 and the model ran without error on 4 months of input data. Do you think that accomplishes your objective?

You would also need to change that fixed 12 to use N_MONTHS. And you will need to re-install natcap.invest after modifying the source code. We typically do that at the command line (in an active virtual environment) with pip uninstall natcap.invest -y && python setup.py install (uninstall the old and install the new)

Are these files actually in the cache_dir already?

There is this one, and maybe others? Modeling seasonal water yield for landscape management: Applications in Peru and Myanmar - ScienceDirect

Good luck!

Thank you @dave!
I will modify the variable and the fixed value, run the model for one or two months, and then compare the results with those obtained using the yearly water budget.
Thanks for the double check.

Nope, these maps are created in the code but not saved to any file. It would be nice to have these in the cache_dir folder too.

I read this publication but it is not the one I wanted… do you have anything more related to the theory of the SWY model?

Talk to you soon! :smiley:

Hmm I would have thought cache_dir/et0_a<1-12>.tif would have been the monthly PET. Pretty much anytime data is represented by a raster in an invest model, that raster exists on disk somewhere in the output workspace because we want to avoid big in-memory operations.

Thanks @dave, but I meant the AET. It is calculated in the code for each month and immediately aggregated to the yearly timescale, but it could be saved to disk. This is the value used in the water balance for the ETP, not the PET or the ETP0.
It is calculated in seasonal_water_yield_core at about line 654

aet_i += min(
p_m - qf_m +

We would simply need to save each monthly value before it is added to the sum of the monthly contributions.

Would this be possible?

Aha, thanks for setting me straight :slight_smile: You’re right, the monthly values are never being saved. It looks like it would take a little bit of work to write an AET raster for each month, but it could be done.

You could imagine a list of rasters like we have for the other monthly variables. Lines 618:625 show how each monthly raster is selected for those other variables. And those ManagedRaster objects have a set method that you could use to set each value, similar to how line 659 is setting the accumulated annual aet value at each pixel.

Backing up, to create the list of ManagedRasters, you can see how that’s done for the single annual AET raster at 499-503. So you’d need to make a raster like that for each month.

Hope that helps!
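As a rough illustration of the idea (keep each month’s AET before it is folded into the annual total), here is a minimal Python sketch. It does not use the model’s ManagedRaster objects or its real AET expression: accumulate_aet and the compute_aet_m callback are hypothetical names, and flat lists stand in for rasters.

```python
def accumulate_aet(compute_aet_m, n_months, n_pixels):
    """Return (monthly AET lists, accumulated AET list).

    compute_aet_m is a hypothetical callback: given a month index it returns
    that month's AET for every pixel. In the real model this would be the
    per-pixel min(...) expression, and each month would be its own raster.
    """
    monthly_aet = []
    annual_aet = [0.0] * n_pixels
    for m_index in range(n_months):
        aet_m = [float(v) for v in compute_aet_m(m_index)]
        if len(aet_m) != n_pixels:
            raise ValueError("monthly AET has the wrong number of pixels")
        monthly_aet.append(aet_m)  # keep the monthly value...
        for pixel, value in enumerate(aet_m):
            annual_aet[pixel] += value  # ...and still accumulate the total
    return monthly_aet, annual_aet


# Made-up AET values for three months and two pixels.
monthly, annual = accumulate_aet(lambda m: [m + 1.0, 2.0 * (m + 1)], 3, 2)
```

The accumulated total is unchanged by this approach, so the existing annual output would stay the same while the monthly values become available as well.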

Hi all, sorry again for the insistence.
I have a question about how to use the parameters of the SWY model when the water budget is applied at a different time scale.
In particular:

pij [0, 1] is the proportion of flow from cell i to j
alpha_m = the fraction of upslope annual available recharge that is available in month m
beta_i = the fraction of the upgradient subsidy that is available for downgradient evapotranspiration
gamma = the fraction of pixel recharge that is available to downgradient pixels

Supposing that we set beta and gamma to 1, and that I use the simplified D8 flow direction so that pij is also 1, the only parameter that has to be evaluated for a different time scale is alpha_m. The default value is 1/12, which assumes an annual water budget with the AET calculated at a monthly scale (because L_sum_avail is calculated yearly). If I only have the monthly water budget, without aggregating it to the yearly time scale, I will only have monthly data for L_sum as well. What should the value of this parameter be? May I set it to 1, because all the upstream local recharge is available in that month, or do you think I could use a fraction here?
If I use a fraction here, may I store the remaining availability and use it in the following time steps, or not?

Maybe I have mixed some things up, and it is not easy for me to explain everything, but I hope you will have some suggestions on a possible value for the alpha_m parameter.


If you only want to change the model to save monthly AET rasters as Dave described, you do not need to modify alpha_m. However if you want to run the model with only 3 months of data, you could create some averaged data to fill in the rest of the year:

  • Sum your three monthly precipitation rasters and divide by 3. Use this averaged precipitation raster for the other 9 months.
  • Sum your three monthly evapotranspiration rasters and divide by 3. Use this averaged evapotranspiration raster for the other 9 months
  • For the alpha_m values for your 3 months of data, choose values such that alpha_m1 + alpha_m2 + alpha_m3 = 0.25. Set alpha_m for the other 9 months to 1/12. This way they will all sum to 1.
  • Choose fake Kc values for the other 9 months

Then I believe you could ignore the outputs for the 9 months you didn’t have data for. @dave does that sound right? I don’t think there’s any “carry over” of data between months, so would this be valid?
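The gap-filling steps above can be sketched as follows. This is a hedged illustration only: fill_year_with_mean and fill_alpha are hypothetical helper names, flat lists stand in for rasters, the sample numbers are made up, and the observed months are simply placed first rather than at their real calendar positions.

```python
import math


def fill_year_with_mean(observed, n_months=12):
    """Pad observed monthly rasters with their per-pixel mean up to n_months."""
    n_obs = len(observed)
    if not 0 < n_obs <= n_months:
        raise ValueError("need between 1 and n_months observed rasters")
    n_pixels = len(observed[0])
    mean = [sum(r[p] for r in observed) / n_obs for p in range(n_pixels)]
    filler = [list(mean) for _ in range(n_months - n_obs)]
    return [list(r) for r in observed] + filler


def fill_alpha(observed_alpha, n_months=12):
    """Pad alpha_m with 1/n_months so that the full set sums to 1."""
    n_obs = len(observed_alpha)
    # The observed months must together take their share, e.g. 3/12 = 0.25.
    if not math.isclose(sum(observed_alpha), n_obs / n_months):
        raise ValueError("observed alpha_m values must sum to n_obs/n_months")
    return list(observed_alpha) + [1.0 / n_months] * (n_months - n_obs)


# Three observed months of made-up precipitation for two pixels.
precip = fill_year_with_mean([[30.0, 3.0], [60.0, 6.0], [90.0, 9.0]])
alpha = fill_alpha([0.05, 0.10, 0.10])
```

The same fill_year_with_mean helper would apply to the evapotranspiration rasters; the Kc values for the filler months still have to be chosen separately.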

Thanks for weighing in Emily. I don’t feel like I have the experience with this model to add very much when it comes to these parameters and the sub-annual timescale. But I will say that I think @silli is now working with a modified model that runs on fewer than 12 months of input data. So I’m not sure it’s necessary to fill in those other months with fake data.

As for alpha_m, I guess it makes sense to use 1/n_months ?

Otherwise the User’s Guide suggests this, which I don’t think is actually possible except by further modifying the source code:

An alternative assumption is to set values to the antecedent monthly precipitation values, relative to the total precipitation: P_(m-1)/P_annual
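For reference, the arithmetic behind both options (a uniform 1/n_months and the antecedent-precipitation ratio) can be sketched in a few lines of Python. The function names are hypothetical, the precipitation values are made up, and two choices here are assumptions rather than anything stated in the User’s Guide: the "total" is taken as the sum over the months being run, and the first month’s antecedent value defaults to the last month of the series (wrap-around).

```python
def alpha_uniform(n_months):
    """Same alpha_m for every month: 1/n_months."""
    return [1.0 / n_months] * n_months


def alpha_antecedent(monthly_precip, first_month_antecedent=None):
    """alpha_m = P_(m-1) / P_total for each month m.

    monthly_precip holds one basin-wide precipitation total per month.
    The first month has no predecessor in the series, so by default the
    last month is used (wrap-around, an assumption); pass
    first_month_antecedent to supply a measured value instead.
    """
    total = float(sum(monthly_precip))
    if total <= 0:
        raise ValueError("total precipitation must be positive")
    if first_month_antecedent is None:
        first_month_antecedent = monthly_precip[-1]
    antecedent = [first_month_antecedent] + list(monthly_precip[:-1])
    return [p / total for p in antecedent]


uniform = alpha_uniform(3)
antecedent = alpha_antecedent([10.0, 30.0, 60.0])
```

With the wrap-around default the values sum to 1 by construction; with a user-supplied first value they generally will not, so they would need rescaling if a sum of 1 is required.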

Thank you @esoth and @dave for the replies.
I thought about your suggestions and honestly I am still not sure what makes the most sense. Probably the two best options are to set:

  1. alpha_m to 1/n_months (or periods): if I ran the model on only one month, I would have alpha_m = 1
  2. alpha_m to the antecedent monthly precipitation values, relative to the total precipitation, P_(m-1)/P_annual: this works only if I run the model on more than one month, and in any case, what should I use as the reference precipitation value for the first month?

Thanks again for the big help