Why do .solb files have lower compression ratios than .meshb files? #36
Answered by LoicMarechal. CsatiZoltan asked this question in Q&A.
Hi Zoltan.
Floating-point values never compress well with lossless compression methods, whether in ASCII or binary.
ASCII compresses slightly better than binary because it is more redundant.
In the end, a compressed ASCII file and a compressed binary file are about the same size.
Your only option to really save space with floating-point values is to use a lossy algorithm.
But there is no general purpose solution, it has to be adapted to the physics.
You can check SZ if you are interested: https://szcompressor.org/
Mesh connectivity compresses well in ASCII or binary because the Huffman and Lempel-Ziv algorithms work well with topological information.
The Lempel-Ziv compression can be greatly enhanced by renumbering the mesh along an SFC (space-filling curve).
A tet mesh renumbered through a Hilbert curve is compressed by a factor of four.
Regards,
Loïc
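To illustrate the point about floating-point values versus connectivity, here is a minimal Python sketch on synthetic data (not a real .meshb or .solb file). It packs random doubles and a strip-like integer connectivity table into binary and compares their zlib compression ratios. The data generators and the `ratio` helper are made up for this illustration, and the exact numbers depend on the data.

```python
import random
import struct
import zlib


def ratio(raw: bytes) -> float:
    """Compression ratio = original size / compressed size (zlib, level 9)."""
    return len(raw) / len(zlib.compress(raw, 9))


rng = random.Random(0)
n = 20000

# Stand-in for solution values: uniformly random doubles in [0, 1).
# Their mantissa bits look like noise to a lossless compressor.
floats = struct.pack(f"<{n}d", *(rng.random() for _ in range(n)))

# Stand-in for connectivity: triangle i references vertices (i, i+1, i+2),
# so consecutive elements share indices and many bytes repeat.
indices = [i + k for i in range(n) for k in range(3)]
ints = struct.pack(f"<{3 * n}i", *indices)

float_ratio = ratio(floats)
int_ratio = ratio(ints)

print(f"random doubles : ratio {float_ratio:.2f}")
print(f"connectivity   : ratio {int_ratio:.2f}")
```

The random doubles stay close to a ratio of 1, while the integer table shrinks several times over, which is the same qualitative gap as between a .solb and a .meshb file.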
Answer selected by CsatiZoltan
Thank you for the remarks.
I've read good things about ALP as well, though it is a lossless compressor.
Is mesh renumbering supported in libMeshb, or is it out of scope?
Mesh renumbering is handled by one of my other libraries, the LPlib (https://github.com/LoicMarechal/LPlib).
In the utilities directory there is a command-line tool (hilbert) that renumbers a mesh along a Hilbert curve.
The LPlib also offers this renumbering scheme as a procedure you can call right after reading a mesh file.
Let's take an example with a 10-million-tet mesh file in binary format.
With random numbering:
Initial size = 243 MB
Compression with xz on 64 cores = 10.8 seconds
Compressed size = 143 MB
With Hilbert renumbering:
Initial size = 243 MB
Renumbering with hilbert on 64 cores = 0.6 seconds
Compression with xz on 64 cores = 8.1 seconds
Compressed size = 50 MB
The renumbering time is more than offset by the speedup in compression: 0.6 + 8.1 = 8.7 seconds in total versus 10.8 seconds.
Loïc
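The effect of numbering on compression can be reproduced at small scale. The sketch below is an illustration only: it does not call LPlib or the hilbert tool, and it uses the natural row-by-row numbering of a structured quad grid as a stand-in for a locality-preserving numbering such as a Hilbert curve. It compares the xz (LZMA) compressed size of the same connectivity with that numbering and with a random one. The function names and grid size are assumptions made for the example.

```python
import lzma
import random
import struct


def grid_quads(nx: int, ny: int) -> list:
    """Quad connectivity of an nx-by-ny structured grid, numbered row by row."""
    quads = []
    for j in range(ny):
        for i in range(nx):
            v = j * (nx + 1) + i
            quads.append((v, v + 1, v + nx + 2, v + nx + 1))
    return quads


def sizes(quads: list) -> tuple:
    """Return (raw size, xz-compressed size) of the binary connectivity table."""
    raw = b"".join(struct.pack("<4i", *q) for q in quads)
    return len(raw), len(lzma.compress(raw))


nx = ny = 100
ordered = grid_quads(nx, ny)

# Random numbering: shuffle the vertex indices, then shuffle the element order.
nb_vertices = (nx + 1) * (ny + 1)
perm = list(range(nb_vertices))
random.Random(1).shuffle(perm)
shuffled = [tuple(perm[v] for v in q) for q in ordered]
random.Random(2).shuffle(shuffled)

raw_size, ordered_size = sizes(ordered)
_, shuffled_size = sizes(shuffled)

print(f"raw size              : {raw_size} bytes")
print(f"xz, local numbering   : {ordered_size} bytes")
print(f"xz, random numbering  : {shuffled_size} bytes")
```

Both tables describe the same mesh and have the same raw size; only the numbering differs, yet the locally numbered one compresses to a smaller file.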
Do you mean that I can do in-memory renumbering on the mesh I read with libMeshb (so no file I/O)? If so, is the renumbering done in place, or is a new mesh data structure created (essentially doubling the memory usage at a given time)?
The mesh renumbering is done in-memory.
You can choose between two ways.
Do it yourself:
HilbertRenumbering()
Input: your mesh coordinates
Output: a renumbering table that works both ways, old -> new and new -> old indices for each vertex.
It is up to you to permute the elements.
All in one:
MeshRenumbering()
Input: pointers to your data
Output: none
The whole mesh is renumbered via memory duplication.
The mesh fields are renumbered and copied one by one, so the peak extra memory is not a second copy of the whole mesh but only a copy of the biggest field (usually the tets).
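For the "do it yourself" route, the permutation step could look like the following Python sketch. It is not LPlib code: `apply_renumbering` and `invert_table` are hypothetical helpers, indices are 0-based, and the old -> new table is assumed to be already available (for example from a Hilbert renumbering). Sorting the elements by their smallest vertex index is just one possible choice of element order.

```python
def invert_table(old_to_new: list) -> list:
    """Build the new -> old table from the old -> new table."""
    new_to_old = [0] * len(old_to_new)
    for old, new in enumerate(old_to_new):
        new_to_old[new] = old
    return new_to_old


def apply_renumbering(coords: list, tets: list, old_to_new: list) -> tuple:
    """Permute the vertices and rewrite the tet connectivity accordingly."""
    # Move each vertex to its new slot.
    new_coords = [None] * len(coords)
    for old, new in enumerate(old_to_new):
        new_coords[new] = coords[old]
    # Rewrite every vertex reference with its new index.
    new_tets = [tuple(old_to_new[v] for v in tet) for tet in tets]
    # Assumed element ordering: sort by smallest vertex index so that
    # neighbouring elements end up close together in the table.
    new_tets.sort(key=min)
    return new_coords, new_tets


# Tiny example: 5 vertices, 2 tets, and an arbitrary renumbering table.
coords = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 1)]
tets = [(0, 1, 2, 3), (1, 2, 3, 4)]
old_to_new = [4, 2, 0, 3, 1]

new_to_old = invert_table(old_to_new)
new_coords, new_tets = apply_renumbering(coords, tets, old_to_new)
print(new_to_old)
print(new_tets)
```

This version builds new lists, so it duplicates the data while it runs; an in-place permutation is possible but takes more care.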
Hi Loïc,
I ran some lossless compression tests with gzip (which relies on zlib) and zstandard. A .meshb file compressed well (compression ratio up to 1.95), while a .solb file hardly compressed at all (compression ratio at most 1.15).
I used transmesh to convert the binary files to ASCII so that I could inspect the values. Since my .solb file was obtained at the end of a simulation, the values in it differ significantly, which may explain why the compression ratio is low. As for the .mesh(b) files, they contain two types of tables: floating-point values for the vertex coordinates and integer values for the mesh connectivity. If zlib and zstandard compress integers better than floats, that explains the relatively higher compression ratio. But this reasoning applies to the ASCII format; I don't know whether it remains valid for the binary formats.
As you know the internal structure of the binary formats, do you have an idea why this is the case?
Thank you