Commit
lib/x86/adler32: refactor and improve implementations
- Refactor the x86 implementations of Adler-32 to be organized like the x86 implementations of CRC-32, where there's an x86-specific template that expands into the different implementations.
- Add an AVX512VNNI implementation using 256-bit vectors.
- Increase the number of vectors processed per iteration of the inner loop of the AVX512VNNI implementations from 2 to 4.
- Handle small amounts of data more efficiently. If the length is small, don't bother aligning the pointer at the beginning. Also optimize the handling of any bytes left over after the inner loop. Also avoid doing redundant reductions mod 65521.
- Make the AVX-VNNI implementation dot with 1's so that all VNNI implementations use the same strategy.
- Put "_x86" in the name of the functions, like what is done for CRC-32.
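
For readers unfamiliar with the "redundant reductions mod 65521" item, the scalar sketch below shows the general idea of deferring the modulo. It is not code from this commit: the function name `adler32_sketch` and the constant names are hypothetical, and the chunk size of 5552 bytes is the bound zlib uses, assumed here rather than taken from these x86 implementations. The two running sums are reduced once per chunk instead of once per byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define ADLER32_DIVISOR   65521u  /* largest prime below 2^16 */
#define ADLER32_MAX_CHUNK 5552u   /* zlib's bound: s2 cannot overflow 32 bits
                                     within this many bytes */

/*
 * Scalar Adler-32 with deferred reduction. 'adler' is the running checksum
 * (1 for a fresh computation); the updated checksum is returned.
 */
uint32_t adler32_sketch(uint32_t adler, const uint8_t *buf, size_t len)
{
    uint32_t s1 = adler & 0xFFFF;  /* low half: 1 + sum of bytes */
    uint32_t s2 = adler >> 16;     /* high half: sum of the s1 values */

    assert(buf != NULL || len == 0);

    while (len > 0) {
        /* Take as many bytes as can be summed without overflowing s2. */
        size_t n = len < ADLER32_MAX_CHUNK ? len : ADLER32_MAX_CHUNK;

        len -= n;
        while (n--) {
            s1 += *buf++;
            s2 += s1;
        }
        /* One reduction per chunk rather than one per byte. */
        s1 %= ADLER32_DIVISOR;
        s2 %= ADLER32_DIVISOR;
    }
    return (s2 << 16) | s1;
}
```

Calling `adler32_sketch(1, data, len)` starts a fresh checksum, and feeding the return value back in as `adler` continues it across buffers. A vectorized implementation follows the same principle with wider accumulators, though the safe chunk size then depends on the accumulator layout.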