Hi all,
I am running the Urban InVEST Cooling model in Python through a function that takes one city from a list of 760 U.S. cities, prepares its arguments (i.e., clips the NLCD/evapotranspiration rasters and calculates the rural reference temperature and UHI), and runs the cooling model for each summer month of a given year. To speed this up, I have been using pool.map from the multiprocessing module. All results are saved to the same workspace, with a results suffix built from the city and month.
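For context, here is a minimal sketch of the setup described above, not the actual code. The names `run_city`, `build_args`, and `WORKSPACE` are hypothetical, the June to August month range and the year are assumptions, and a placeholder comment stands in for the data preparation and the model call.

```python
import multiprocessing

WORKSPACE = "cooling_workspace"  # hypothetical: one workspace shared by all runs
SUMMER_MONTHS = (6, 7, 8)        # assumption: June through August


def build_args(city, month, year):
    """Assemble per-run arguments (only the keys relevant to this question)."""
    return {
        "workspace_dir": WORKSPACE,
        "results_suffix": f"{city}_{year}_{month:02d}",
    }


def run_city(city, year=2020):
    """Prepare inputs for one city and run the model once per summer month."""
    suffixes = []
    for month in SUMMER_MONTHS:
        args = build_args(city, month, year)
        # Placeholder: clip the rasters, compute the rural reference
        # temperature and UHI, then run the cooling model with `args`.
        suffixes.append(args["results_suffix"])
    return suffixes


if __name__ == "__main__":
    cities = ["city_a", "city_b", "city_c"]  # stand-ins for the 760 cities
    with multiprocessing.Pool(processes=3) as pool:
        # One task per city; every task writes into the same WORKSPACE.
        results = pool.map(run_city, cities)
    print(results)
```

The point of the sketch is that every worker writes into the same workspace directory and only the suffix differs between runs.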
My code works when the cities are run sequentially (i.e., looping through the cities list), but with parallel processing the runs seem to intermingle and corrupt each other's results. For some cities the uhi.shp results have the correct avd_eng_cn but every other attribute is wrong; for others, every attribute is wrong. The code also works correctly when the multiprocessing call is given only one city.
I have also tried a version of the code without the data preparation inside the function, on a small subsample of cities that had the correct avd_eng_cn but nothing else correct in uhi.shp. In this version, every data source that is read in is specific to the city, so the parallel calls should not be reading from or writing to the same places. I noticed that the incorrect values of the other attributes in the uhi.shp files (avg_cc, avg_tmp_v, etc.) are the correct values for another city in the subsample. Since the sequential loop works correctly, all of the inputs should be generated for the given city within each call of the function, so I am left to assume that the parallel runs are interfering with one another rather than running independently.
I’ve also tried using the n_workers argument on the subsample, and that seems to work, but it didn’t speed things up much, at least for this small subsample.
I don’t have a log file, but let me know if there is anything you would like me to show for context. I added a picture showing what is happening in the multiprocessing results vs. what I should expect: