Feature request: incorporate dask-geopandas #9

Open
nmarchio opened this issue Apr 26, 2021 · 0 comments
nmarchio commented Apr 26, 2021

One potentially useful change to improve performance is to add dask-geopandas compatibility to the underlying code (dask-geopandas is an extension of Dask for GeoPandas data). On Midway, this would involve running Dask on top of the Slurm scheduler, for example through dask-jobqueue:

https://jobqueue.dask.org/en/latest/examples.html#slurm-deployments
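
As a rough illustration of what that deployment might look like, here is a minimal sketch using `SLURMCluster` from dask-jobqueue together with `dask_geopandas.read_file`. Nothing here is taken from the project's code: the partition name, memory, walltime, and the `process_country` / `slurm_worker_spec` helpers are hypothetical placeholders, and the area computation only stands in for the real per-row work. The defaults of 28 cores per node and 10 jobs mirror the Midway numbers mentioned in this issue.

```python
def slurm_worker_spec(cores_per_node=28, memory="56GB",
                      walltime="02:00:00", queue="your-partition"):
    """Keyword arguments for dask_jobqueue.SLURMCluster (one Slurm job per node).

    All defaults are placeholders and would need to match the actual cluster.
    """
    return {
        "queue": queue,               # Slurm partition (placeholder name)
        "cores": cores_per_node,      # cores requested per Slurm job
        "processes": cores_per_node,  # one single-threaded worker process per core
        "memory": memory,             # total memory per Slurm job (assumed value)
        "walltime": walltime,         # assumed time limit
    }


def process_country(gadm_path, max_nodes=10):
    """Hypothetical driver: read one GADM file and compute on a Slurm-backed cluster."""
    # Imported lazily so the spec above can be inspected without Dask installed.
    import dask_geopandas
    from dask.distributed import Client
    from dask_jobqueue import SLURMCluster

    spec = slurm_worker_spec()
    cluster = SLURMCluster(**spec)
    cluster.scale(jobs=max_nodes)  # request up to max_nodes Slurm jobs
    client = Client(cluster)
    try:
        # Partition the rows so every worker core has a chunk to process.
        ddf = dask_geopandas.read_file(
            gadm_path, npartitions=spec["cores"] * max_nodes
        )
        # Placeholder computation standing in for the real per-row work.
        return ddf.geometry.area.compute()
    finally:
        client.close()
        cluster.close()
```

With something like this, the row chunking would be handled by Dask partitions rather than by hand, which is the main appeal of the change.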

The current recommended approach for parallelization is as follows:
A Bash script containing an sbatch template submits one job per node for each country (up to 10 countries at once). Each job serially processes each GADM file and, for each file, chunks the rows and processes the chunks in parallel across the node's 28 cores.
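
For reference, the per-node part of the approach above (files in series, rows of each file chunked across cores) can be sketched in plain Python roughly as follows. This is only an illustration using the standard library, not the project's actual code: `chunk_bounds`, `process_rows_in_parallel`, `process_country_serially`, `load_rows`, and `worker` are hypothetical names.

```python
from concurrent.futures import ProcessPoolExecutor


def chunk_bounds(n_rows, n_chunks):
    """Split range(n_rows) into at most n_chunks contiguous (start, stop) pairs."""
    if n_rows <= 0:
        return []
    n_chunks = max(1, min(n_chunks, n_rows))
    base, extra = divmod(n_rows, n_chunks)
    bounds, start = [], 0
    for i in range(n_chunks):
        # The first `extra` chunks get one additional row each.
        stop = start + base + (1 if i < extra else 0)
        bounds.append((start, stop))
        start = stop
    return bounds


def process_rows_in_parallel(rows, worker, n_cores=28):
    """Apply `worker` to row chunks of one file, one chunk per core.

    `rows` is any sliceable sequence; for a GeoDataFrame one would slice
    with `.iloc[start:stop]` instead. `worker` must be picklable.
    """
    chunks = [rows[a:b] for a, b in chunk_bounds(len(rows), n_cores)]
    with ProcessPoolExecutor(max_workers=n_cores) as pool:
        return list(pool.map(worker, chunks))


def process_country_serially(gadm_files, load_rows, worker, n_cores=28):
    """Process a country's GADM files one after another on a single node."""
    results = {}
    for path in gadm_files:
        results[path] = process_rows_in_parallel(load_rows(path), worker, n_cores)
    return results
```

For example, `chunk_bounds(10, 3)` gives `[(0, 4), (4, 7), (7, 10)]`. The sbatch template would then launch one such process per country.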
