Investigate codec parallelism #128

Open
LDeakin opened this issue Jan 11, 2025 · 0 comments
Labels: help wanted (extra attention is needed), performance (performance related)

LDeakin (Owner) commented Jan 11, 2025

zarrs is pretty quick (see the zarr_benchmarks repo), but maybe it can be even faster?

zarrs automatically balances chunk parallelism and internal codec parallelism based on each codec's `recommended_concurrency` and the number of chunks involved in an operation. This works very well for sharding.
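To make the balancing idea concrete, here is a minimal sketch of how a thread budget could be split between chunk-level and codec-level parallelism. This is not the zarrs implementation: `ConcurrencyRange`, `split_concurrency`, and the "prefer chunk parallelism first" policy are all assumptions made for illustration, loosely modelled on the idea of a codec reporting a recommended concurrency.

```rust
/// Hypothetical stand-in for a codec's recommended concurrency:
/// the fewest and the most threads the codec can usefully use.
#[derive(Debug, Clone, Copy)]
pub struct ConcurrencyRange {
    pub min: usize,
    pub max: usize,
}

/// Split a thread budget between chunk-level and codec-level parallelism.
///
/// Returns `(chunk_concurrency, codec_concurrency)`.
/// Assumed policy (not taken from zarrs): process as many chunks in
/// parallel as the budget allows, then hand leftover threads to the codec.
pub fn split_concurrency(
    target: usize,
    num_chunks: usize,
    codec: ConcurrencyRange,
) -> (usize, usize) {
    // Guard against zero values so the divisions and clamps below are valid.
    let target = target.max(1);
    let codec_min = codec.min.max(1);
    let codec_max = codec.max.max(codec_min);

    // Each in-flight chunk needs at least `codec_min` threads,
    // and there is no point running more chunks than exist.
    let chunk_concurrency = (target / codec_min).clamp(1, num_chunks.max(1));

    // Whatever is left per chunk goes to the codec, within its range.
    let codec_concurrency = (target / chunk_concurrency).clamp(codec_min, codec_max);

    (chunk_concurrency, codec_concurrency)
}
```

Under this policy, an operation over many chunks uses the whole budget on chunk parallelism, while an operation over one or two large chunks (or shards) pushes the spare threads into the codec.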

Some other codecs can support internal parallelism (e.g. zfp, blosc), but zarrs intentionally disables multithreading for them. Why? Because in my limited testing, the overhead seems quite high unless the chunks are massive, and I almost always use sharding with small inner chunks. It may be worth investigating these codecs to determine whether enabling codec parallelism pays off, and with how many threads. The potential advantages are reduced memory usage (fewer chunks being decoded in parallel) and less cache thrashing.
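One shape the outcome of such an investigation might take is a size threshold below which a codec stays single-threaded. The sketch below is purely illustrative: `codec_threads_for_chunk`, `peak_decoded_bytes`, and the `min_bytes_per_thread` parameter are hypothetical names, and the right threshold is exactly what benchmarking would need to establish.

```rust
/// Hypothetical heuristic: give a codec one thread per
/// `min_bytes_per_thread` of chunk data, capped at `max_threads`.
/// Small chunks therefore stay single-threaded.
pub fn codec_threads_for_chunk(
    chunk_bytes: usize,
    min_bytes_per_thread: usize,
    max_threads: usize,
) -> usize {
    // Avoid division by zero and an empty clamp range.
    let per_thread = min_bytes_per_thread.max(1);
    (chunk_bytes / per_thread).clamp(1, max_threads.max(1))
}

/// Rough upper bound on decoded data held in memory at once:
/// one decoded chunk per chunk processed in parallel.
pub fn peak_decoded_bytes(chunk_concurrency: usize, chunk_bytes: usize) -> usize {
    chunk_concurrency.saturating_mul(chunk_bytes)
}
```

The second function shows the memory argument: with the same thread budget, decoding 2 chunks at a time with multithreaded codecs holds far less decoded data than decoding 16 chunks at a time with single-threaded codecs.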

LDeakin added the help wanted and performance labels Jan 11, 2025