Compression is likely not working. #288

Open
CiaranWelsh opened this issue May 28, 2024 · 5 comments


@CiaranWelsh

I've been playing around with the compression algorithms in hdf5 and observed that the Blosc compression algorithms basically do nothing. I suspect this is a bug on your end (though it's also possible that it's on my end).

Here's my evaluation, hope it helps =]

[Figure: compression_performance]
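
A rough way to sanity-check whether a filter is doing anything is to compare the raw payload size with what ends up on disk. The sketch below is not taken from this thread and does not call hdf5-rust at all; `compression_ratio` and `file_compression_ratio` are hypothetical helper names. It also assumes the file holds a single dataset, since the on-disk size includes HDF5 metadata overhead and so only gives an approximate ratio.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Ratio of raw payload size to stored size.
/// Values close to 1.0 (or below) suggest the filter is not compressing.
/// Returns `None` when the stored size is zero, to avoid dividing by zero.
fn compression_ratio(raw_bytes: u64, stored_bytes: u64) -> Option<f64> {
    if stored_bytes == 0 {
        return None;
    }
    Some(raw_bytes as f64 / stored_bytes as f64)
}

/// Hypothetical helper: compares a file's size on disk with the known
/// size of the raw data that was written into it.
fn file_compression_ratio(path: &Path, raw_bytes: u64) -> io::Result<Option<f64>> {
    let stored_bytes = fs::metadata(path)?.len();
    Ok(compression_ratio(raw_bytes, stored_bytes))
}
```

For example, calling `file_compression_ratio(Path::new("out.h5"), 1024 * 1024)` on a file written from 1 MB of raw data (the file name here is made up) should give a value well above 1.0 if compression is active, and roughly 1.0 if the filter is silently doing nothing.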

@mulimoen
Collaborator

See #273 for a potential duplicate

@CiaranWelsh
Author

CiaranWelsh commented May 29, 2024

Yes, it looks like a duplicate; thanks for pointing me to the other issue. After building hdf5-rust like this:

# Cargo.toml

hdf5 = { git = "https://github.com/aldanor/hdf5-rust.git", rev = "33c7a18155bf0ccf11dfe8412f59376619e292bc", features = ["blosc", "lzf" ] }
hdf5-types = { git = "https://github.com/aldanor/hdf5-rust.git", rev = "33c7a18155bf0ccf11dfe8412f59376619e292bc" }
blosc-src = { version = "0.3.0", features = ["lz4", "zlib", "zstd"] }

the new data looks like this (please ignore the titles in the plots, as they are wrong (input data

[Figure: compression_algorithmspace-check-output-1MB-raw-data]
[Figure: compression_levelspace-check-output-1MB-raw-data]
[Figure: compression_nthreadsspace-check-output-1MB-raw-data]
[Figure: compression_shufflespace-check-output-1MB-raw-data]
[Figure: hdf5_chunk_sizespace-check-output-1MB-raw-data]

As a follow-up question, it seems that the number-of-threads argument isn't doing much. I'm setting it with

        hdf5::filters::blosc_set_nthreads(nthreads);


Is there something I'm missing in order to get this working? Thanks

@mulimoen
Collaborator

The blosc threads issue was mentioned in #231

@CiaranWelsh
Author

Great, I'll not worry about it for now then and hope a fix turns up in time. Thanks for the help!

@mulimoen
Collaborator

You are very welcome to make a PR exposing `blosc_sys::blosc_set_blocksize` as `hdf5::filters::blosc_set_blocksize`, and another adding features such as `blosc-lz4`, which enables `blosc-sys/lz4`, etc.!
