zstd filter: add support for parallel processing #163

marxin · 2023-08-30T06:48:33Z

As you probably know, zstd is famous for its fast (and stable) parallel compression capability. It would be great if iostreams could learn that skill.

I've got a WIP parallel implementation built as an io::multichar_output_filter:
boost-zstd.cc.txt

marxin · 2023-08-30T06:48:44Z

@rdoeffinger

rdoeffinger · 2023-08-30T07:28:30Z

There are 3 main stumbling blocks:

1. There seems to be no API designed for setting filter options in a generic way. While it's possible to just add a function for setting multithreading specifically, it would be better if it could support more, such as setting compression levels in a standard way for all compression filters and allowing any new options that are added to zstd to be set. Designing this is not easy.
2. zstd does not seem to have an option for "choose a good number of worker threads"; it only accepts an explicit number. That makes it a pain to use and hard to provide a nice "enable multithreading" option.
3. I don't have a use case that benefits much from this, so I would be in no position to test whether it works well and whether there are any performance bottlenecks (e.g. too much flushing being added by default).

marxin · 2023-08-30T07:59:47Z

> it seems zstd does not have an option for "choose a good amount of worker threads", it only accepts an explicit number. That makes it a pain to use and hard to provide a nice "enable multithreading" option

Yeah, one might want to use the auto-detection capability of the library:
> --auto-threads={physical,logical} (default: physical): When using a default amount of threads via -T0, choose the default based on the number of detected physical or logical cores.

> I don't have a use-case that benefits much from this, so I would be in no position to test if it works well and if there are any performance bottlenecks (e.g. too much flushing being added by default).

Heh. I think every time you compress something bigger you want to have it done as fast as possible. That's why gzip won't do, as it only offers single-threaded compression.

rdoeffinger · 2023-08-30T17:30:35Z

> Yeah, one might want to use the auto-detection capability of the library

My understanding is that the library does not have this capability, only the binary. But if there is a library API to select this mode it would indeed be helpful.
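
If the library really only accepts an explicit number, a caller could approximate the CLI's `-T0` behaviour itself. The following is a minimal sketch under that assumption; `pick_worker_count` and its `cap` parameter are hypothetical names, not part of zstd or Boost.Iostreams. Note that `std::thread::hardware_concurrency()` reports logical cores (and may return 0), so it is closer to `--auto-threads=logical` than to the CLI default.

```cpp
#include <algorithm>
#include <thread>

// Hypothetical helper, not part of zstd or Boost.Iostreams.
// Maps a requested thread count to an explicit worker count,
// treating 0 as "auto" in the spirit of the CLI's -T0.
inline unsigned pick_worker_count(unsigned requested, unsigned cap = 16) {
    if (requested != 0) {
        return std::min(requested, cap);  // explicit request, bounded by cap
    }
    // hardware_concurrency() counts logical cores and may return 0
    // when the value cannot be determined.
    unsigned detected = std::thread::hardware_concurrency();
    if (detected == 0) {
        return 1;  // unknown hardware: stay conservative
    }
    return std::min(detected, cap);
}
```

The cap is an arbitrary choice for illustration; a filter would probably want it to be configurable.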

> Heh. I think every time you compress something bigger you want to have it done as fast as possible.

Not if the thing producing the data is producing it at a fairly leisurely pace. It just runs for long enough that you need compression. Also all cases where you have a lot of different things to compress are better handled by parallelism across the items to compress, not by parallelizing the compression.
But anyway, not the main point and not an argument against the feature, just that this complicates developing and testing such a feature for me (plus, reduces motivation to do so).

marxin · 2023-08-31T09:45:52Z

Sure, I understand it's not an exciting use case for you. In my case, I stream data to a single file that I would like to compress. That's where Boost.Iostreams comes in handy, and it would save me a lot of time.

rdoeffinger · 2023-09-11T17:55:48Z

I had a bit of a look.
The good news: it's easy to add support for extra parameters via the zstd_params struct.
The bad news: the library functions to set parallel compression are only available from zstd version 1.4 onwards.
So I would need to figure out how to handle that.
While it's possible to automatically use these new functions when the header is 1.4 or newer, that would mean that a binary built against 1.4 will no longer run on a system that has 1.3, whereas it would have worked fine before.
Not sure if that is really a problem, but it's a risk.
Then there is also the question of what to do if multithreading was requested but turns out not to be available.
There is the uglier, more hard-core option of just giving access to the cstream_ and dstream_ variables and leaving it to the user to figure it out and set these options, but that seems like bad design.
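
To illustrate the "extra parameters via the params struct" idea, here is a rough sketch. It assumes the existing zstd_params carries just a compression level with a non-explicit constructor; `zstd_params_sketch`, the `workers` member, and the default level of 3 are placeholders for illustration, not the actual Boost.Iostreams definition.

```cpp
#include <cstdint>

// Hypothetical stand-in for an extended params struct; the real
// zstd_params in Boost.Iostreams may look different.
struct zstd_params_sketch {
    // Non-explicit, so code that passes a bare integer level keeps compiling.
    zstd_params_sketch(std::uint32_t level_ = 3, std::uint32_t workers_ = 0)
        : level(level_), workers(workers_) {}

    std::uint32_t level;    // compression level
    std::uint32_t workers;  // 0 = single-threaded, the behaviour before this change
};
```

Defaulting `workers` to 0 means existing callers that only pass a level would see no change in behaviour.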

marxin · 2023-09-13T10:16:58Z

Well, zstd 1.4 is more than 4 years old, so I would not spend much time on older zstd releases.
And yes, if MT is not available (or disabled in libzstd), then single-threaded mode seems to me like a reasonable fallback.
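
The fallback could look roughly like this. It is only a sketch of the decision logic: `set_workers_or_fall_back` and `try_set` are hypothetical names, and the real setter would presumably wrap `ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, n)` and check the result with `ZSTD_isError`. It does not address the separate problem of a binary built against 1.4 being loaded on a system with 1.3.

```cpp
#include <functional>

// Hypothetical helper, not part of Boost.Iostreams or zstd.
// try_set(n) should attempt to configure n workers and return true
// on success. In real use it would presumably call
//   ZSTD_CCtx_setParameter(cctx, ZSTD_c_nbWorkers, n)
// and return !ZSTD_isError(result).
inline unsigned set_workers_or_fall_back(
        unsigned requested,
        const std::function<bool(unsigned)>& try_set) {
    if (requested > 0 && try_set(requested)) {
        return requested;  // multithreading accepted by the library
    }
    // MT unavailable or not requested: make sure we are in
    // single-threaded mode (0 workers) and report that to the caller.
    try_set(0);
    return 0;
}
```

Returning the effective worker count lets the caller log or expose whether multithreading is actually in use.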
