Is Client(threads_per_worker=1) the best way to load a Dask client? #409

Open
navidcy opened this issue Jul 5, 2024 · 5 comments
Labels
❓ question Further information is requested

Comments

@navidcy
Collaborator

navidcy commented Jul 5, 2024

I started seeing Client(threads_per_worker=1) in the recipes.
Is this intentional? Is this best practice? I'm not doubting it; Dask is a bit of a mystery to me... If this is what we should be doing, we should change it everywhere, right?

cc @anton-seaice @angus-g

@navidcy navidcy added the ❓ question Further information is requested label Jul 5, 2024
@anton-seaice
Collaborator

There is a bug somewhere in the netCDF libraries which means that accessing a netCDF file in parallel (across multiple threads within the same worker) fails.

It's a bug, so it should (and eventually will) get fixed, and the fix will flow through to future conda/analysis versions. However, the issue has been around for more than a year now and has not been resolved.

Setting threads_per_worker=1 is a workaround for this issue:

https://forum.access-hive.org.au/t/netcdf-not-a-valid-id-errors/389

Dale said he would pin netcdf in conda/analysis to the previous version that didn't have this bug, but people were still having issues last week. I'll ask him on the hive about it :)

@anton-seaice
Collaborator

Duplicate of #398

@anton-seaice anton-seaice marked this as a duplicate of #398 Jul 10, 2024
@anton-seaice anton-seaice closed this as not planned Jul 10, 2024
@anton-seaice anton-seaice marked this as not a duplicate of #398 Jul 16, 2024
@anton-seaice
Collaborator

I think I closed this incorrectly.

Anyway, there is no resolution in sight for this (it's deep in the netcdf-c library!).

So for now, to run on 'conda/analysis3-24.04' or later, we need to set threads_per_worker=1.
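
For anyone updating a recipe, here is a minimal sketch of what that setting looks like. The helper names (`single_threaded_client_kwargs`, `start_client`) are hypothetical and not part of the recipes, and defaulting to one worker per CPU core is an assumption; only `Client`, `n_workers` and `threads_per_worker` come from Dask itself. With one thread per worker, parallelism comes from separate worker processes, so no two threads in the same process touch a netCDF file at once.

```python
import os


def single_threaded_client_kwargs(n_workers=None):
    """Keyword arguments for a Dask Client whose workers each run one thread."""
    if n_workers is None:
        # Assumption: one worker process per available CPU core.
        n_workers = os.cpu_count() or 1
    return {"n_workers": n_workers, "threads_per_worker": 1}


def start_client(n_workers=None):
    """Start a local Dask client using the single-threaded worker workaround."""
    # Imported here so the helper above can be used without dask installed.
    from dask.distributed import Client

    return Client(**single_threaded_client_kwargs(n_workers))


# Typical notebook usage (not run here):
# client = start_client()
```

In a recipe this collapses to the one-liner already in use, `Client(threads_per_worker=1)`; the helper only makes the worker count explicit.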

@anton-seaice anton-seaice reopened this Jul 16, 2024
@dougiesquire
Collaborator

> Duplicate of #398

These are not duplicates. The two issues are different.

@navidcy
Collaborator Author

navidcy commented Jul 29, 2024

Should we add this to all examples until the issue is resolved?


3 participants