Releases · ridiculousfish/libdivide

30 Jul 17:59

kimwalisch

v5.1

0152603

libdivide-5.1 Latest

This is a maintenance release.

This release mainly fixes a C++ compilation failure with the upcoming GCC 15 compiler (#113).

ChangeLog

Simplify & clean up the AVR constant div test code by @adbancroft in #85
Constant division templates by @adbancroft in #89
Tester program: enable vector tests by @adbancroft in #92
Fix GCC vector alignment and aliasing issues by @adbancroft in #93
Implement 16-bit SSE2 & AVX2 vector division by @adbancroft in #94
Fix compilation of primitive types by @ridiculousfish in #98
Fix minor issue during porting (https://github.com/apache/incubator-nuttx) by @xiaoxiang781216 in #99
Replace typeid(T).name() with type_tag::get_tag() by @xiaoxiang781216 in #100
Try to fix the MSVC build by @ridiculousfish in #101
Increase minimum CMake version to 3.5 by @qak in #106
Add prefixes to CMake option names by @qak in #107
Fix LIBDIVIDE_VERSION CMake variable by @qak in #108
Include missing CTest module by @qak in #109
Only build tests when libdivide is the main project by @qak in #110
Fix a typo (division/divsion) in README.md by @musicinmybrain in #114
Compile fix for divider::operator== by @masbug in #113
Add a constexpr zero-initializing constructor for divider by @sharkautarch in #115

New Contributors

@xiaoxiang781216 made their first contribution in #99
@qak made their first contribution in #106
@musicinmybrain made their first contribution in #114
@masbug made their first contribution in #113
@sharkautarch made their first contribution in #115

Full Changelog: 5.0...v5.1

Contributors

ridiculousfish, musicinmybrain, and 5 other contributors

17 Jul 18:56

ridiculousfish

5.0

b322221

v5.0.0

Reference code for narrowing division has been added.
The C and C++ APIs have been extended to support 16-bit scalar integer division.
Multiple enhancements add support for 8-bit microcontrollers:
- Compiles cleanly with avr-gcc, the compiler used for the Atmel AVR microcontroller family (popular on Arduino boards)
  - The code base includes ATmega2560 test and benchmarking programs
- Adds predefined macros to speed up division by 16-bit constants, because avr-gcc does not optimize division by a 16-bit constant on 8-bit systems.
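
As a rough illustration of why division by a 16-bit constant can be replaced by cheaper operations, the sketch below divides a `uint16_t` by 10 with one widening multiply and one shift. The helper names `div10_u16` and `div10_u16_matches_builtin` are hypothetical and are not libdivide's predefined macros; the constant 52429 is simply ceil(2^19 / 10), chosen here for the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of libdivide: divide a 16-bit value by the
// constant 10 using a multiply and a shift instead of a division.
// 52429 == ceil(2^19 / 10), and 65535 * 52429 still fits in 32 bits.
static inline uint16_t div10_u16(uint16_t n) {
    return static_cast<uint16_t>((static_cast<uint32_t>(n) * 52429u) >> 19);
}

// Exhaustively compare the helper against the built-in operator for every
// possible 16-bit input. Returns true if all 65536 results agree.
static inline bool div10_u16_matches_builtin() {
    for (uint32_t n = 0; n <= 0xFFFFu; ++n) {
        if (div10_u16(static_cast<uint16_t>(n)) != n / 10u) {
            return false;
        }
    }
    return true;
}
```

Because the input range is only 16 bits wide, the whole domain can be checked exhaustively, which is a convenient way to validate any multiply-and-shift constant before relying on it.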

09 Mar 07:20

ridiculousfish

v4.0.0

fdbafd4

v4.0.0

All SIMD types may now be used simultaneously, instead of selecting one at compile time. For example, you may define all of LIBDIVIDE_SSE2, LIBDIVIDE_AVX2, and LIBDIVIDE_AVX512 and use them together.
ARM NEON types are now supported. New functions take uint32x4_t, int32x4_t, uint64x2_t, and int64x2_t. Note: while libdivide is tested on both ARM32 and AArch64, NEON intrinsics have only been tested on AArch64.
Breaking: To support multiple vector types, vector functions have been renamed according to their width (#52). Instead of libdivide_u32_do_vector, now use libdivide_u32_do_vec128 for SSE2 or NEON, libdivide_u32_do_vec256 for AVX2, and libdivide_u32_do_vec512 for AVX512.
On non-x86 CPUs, generating 64-bit dividers is now faster than before. Previously libdivide used __uint128_t when available; however, libdivide's fallback code was shown to be several times faster, so the __uint128_t path has been removed. x86 and x86-64 CPUs are unaffected.
Certain code sourced from StackOverflow has been reimplemented because it had an ambiguous license. All code in libdivide is now covered under the zlib or Boost license (at your option).
libdivide.h no longer requires C++11 or later. The minimum language standards are C99 or C++98.
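
The note above about generating 64-bit dividers without `__uint128_t` concerns dividing a 128-bit value by a 64-bit one. A minimal sketch of that operation, using only 64-bit arithmetic, is shown below. The function name `div_128_by_64` is hypothetical, and this plain shift-and-subtract loop is not libdivide's fallback code; it only shows what such a routine computes, assuming the caller guarantees `d != 0` and `hi < d` so that the quotient fits in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference routine, not libdivide's implementation.
// Divides the 128-bit value (hi * 2^64 + lo) by d, one bit at a time.
// Preconditions (assumed, not checked): d != 0 and hi < d.
// Returns the 64-bit quotient and stores the remainder in *rem.
static inline uint64_t div_128_by_64(uint64_t hi, uint64_t lo, uint64_t d,
                                     uint64_t* rem) {
    uint64_t q = 0;
    uint64_t r = hi;  // running remainder, always < d between iterations
    for (int i = 63; i >= 0; --i) {
        // Remember the bit that falls off when the remainder is doubled.
        uint64_t carry = r >> 63;
        // Shift the remainder left and bring in the next bit of lo.
        r = (r << 1) | ((lo >> i) & 1u);
        // If the true (65-bit) remainder is >= d, subtract d and set the
        // quotient bit. Wrap-around subtraction gives the right result
        // when carry is set, because the true value minus d is below 2^64.
        if (carry || r >= d) {
            r -= d;
            q |= static_cast<uint64_t>(1) << i;
        }
    }
    *rem = r;
    return q;
}
```

For example, calling `div_128_by_64(1, 0, 3, &r)` divides 2^64 by 3. The loop runs 64 iterations regardless of the inputs, so it is useful as a correctness reference rather than as a fast path.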

16 Oct 09:32

kimwalisch

v3.0

08da2f6

libdivide-3.0

This release adds C++ support for all 32-bit and 64-bit integer types (#58). Unfortunately, this code change required C++11 instead of C++98, so the major version had to be increased even though this is a small release. This version also improves libdivide's CMake build system, which should make it easier to package libdivide.

BREAKING
- libdivide.h now requires C++11 or later
BUG FIXES
- Support all 32-bit and 64-bit integer types in C++ (#58)
- Fix cross compilation (#59)
ENHANCEMENT
- Add support for CMake find_package(libdivide)

04 Jul 14:45

kimwalisch

v2.0

4963536

libdivide-2.0

I am happy to announce the release of libdivide-2.0 🎉

Libdivide finally supports AVX2 and AVX512 vector division on x86 CPUs. Libdivide now also works with the clang-cl compiler and the Intel C++ compiler on Windows. There have been many small incremental improvements which should provide minor speedups for many use cases.

Since libdivide is now nearly 10 years old and many features have been added over the years, it has become necessary to remove some rarely used functionality. I have removed the unswitch functionality since it was a large amount of code that, as far as I am aware, has never been used by anybody. Overall, even with the added support for AVX2 and AVX512, libdivide.h now contains fewer lines of code than the previous release and compiles faster using both C and C++.

BREAKING
- Removed unswitch functionality (#46)
- Renamed macro LIBDIVIDE_USE_SSE2 to LIBDIVIDE_SSE2
- Renamed divider::recover_divisor() to divider::recover()
BUG FIXES
- Remove _udiv128() as not yet supported by clang-cl and icl compilers
- Fix C++ linker issue caused by anonymous namespace (#54)
- Fix clang-cl (Windows) linker issue (#56)
ENHANCEMENT
- Add AVX2 & AVX512 vector division
- Speed up SSE2 libdivide_mullhi_u64_vector()
- Support +1 & -1 signed branchfree dividers (4a1d5a7)
- Speed up unsigned branchfull power of 2 dividers (2422199)
- Simplify C++ templates
- Simplify more bit flags of the libdivide_*_t structs
- Get rid of MAYBE_VECTOR() hack
TESTING
- tester.cpp: Convert to modern C++
- tester.cpp: Add more test cases
- benchmark_branchfree.cpp: Convert to modern C++
- benchmark.c: Prevent compilers from optimizing too much
BUILD
- Automatically detect SSE2/AVX2/AVX512
DOCS
- doc/C-API.md: Add C API reference
- doc/CPP-API.md: Add C++ API reference
- README.md: Add vector division and performance tips sections

29 May 16:51

kimwalisch

v1.1

61a1037

libdivide-1.1

This release fixes two non-critical bugs and silences a few compiler warnings. The generation of libdivide divisors has been sped up for MSVC on x64 and for GCC/Clang on 64-bit CPU architectures other than x64. I have also done some general code cleanup. Below is the complete changelog:

BUG FIXES
- Fix bug in libdivide_128_div_64_to_64() (#45)
- Fix MSVC ARM 64-bit bug (07931e9)
- Fix -Wshift-count-overflow warning on avr CPU architecture (#41)
- Fix -Wshadow warning in libdivide_s32_do()
- Fix -Wignored-attributes warnings when compiling SSE2 code using GCC 9
ENHANCEMENT
- libdivide_128_div_64_to_64(): optimize using _udiv128() for MSVC 2019 or later
- libdivide_128_div_64_to_64(): optimize using __uint128_t for GCC/Clang on 64-bit CPU architectures
- Add LIBDIVIDE_VERSION macro to libdivide.h
- Clean up SSE2 code in libdivide.h
- Increase runtime of test cases in primes_benchmark.cpp
BUILD
- Remove windows directory with legacy Visual Studio project files
- Move test programs to test directory

21 Jan 16:38

kimwalisch

v1.0

1562876

libdivide-1.0

I am happy to announce the 1.0 release of libdivide 🎉

A lot of effort has been spent to polish libdivide for the 1.0 release. It has also been tested extensively using a plethora of different compilers (GCC, Clang, MSVC, ICC, MinGW, Cygwin), OSes and CPU architectures (i386, x86-64, ARM, ARM64, PowerPC, PPC64) to ensure it passes all tests and compiles without warnings at a high warning level.

Have a look at the ChangeLog to see what's new.
