| 1 | 1 | OpenBLAS ChangeLog |
| 2 | +==================================================================== |
| 3 | +Version 0.3.4 |
| 4 | +02-Dec-2018 |
| 5 | + |
| 6 | +common: |
| 7 | + * The new, experimental thread-local memory allocation had |
| 8 | + inadvertently been left enabled for gmake builds in 0.3.3 |
| 9 | + despite the announcement. It is now disabled by default, and |
| 10 | + single-threaded builds will keep using the old allocator even |
| 11 | + if the USE_TLS option is turned on. |
| 12 | + * OpenBLAS will now provide enough buffer space for at least 50 |
| 13 | + threads by default. |
| 14 | + * The output of openblas_get_config() now contains the version |
| 15 | + number. |
| 16 | + * A serious thread safety bug in GEMV operation with small M and |
| 17 | + large N size has been fixed. |
| 18 | + * The code will now automatically call blas_thread_init after a |
| 19 | + fork if needed before handling a call to openblas_set_num_threads. |
| 20 | + * Accesses to parallelized level3 functions from multiple callers |
| 21 | + are now serialized to avoid thread races (unless using OpenMP). |
| 22 | + This should provide better performance than the known-threadsafe |
| 23 | + (but non-default) USE_SIMPLE_THREADED_LEVEL3 option. |
| 24 | + * When building LAPACK with gfortran, -frecursive is now (again) |
| 25 | + enabled by default to ensure correct behaviour. |
| 26 | + * The OpenBLAS version of cblas.h now supports both CBLAS_ORDER and |
| 27 | + CBLAS_LAYOUT as the name of the matrix row/column order option. |
| 28 | + * Externally set LDFLAGS are now passed through to the final compile/link |
| 29 | + steps to facilitate setting platform-specific linker flags. |
| 30 | + * A potential race condition during the build of LAPACK (that would |
| 31 | + usually manifest itself as a failure to build TESTING/MATGEN) has been |
| 32 | + fixed. |
| 33 | + * xHEMV has been changed to stay single-threaded for small input sizes |
| 34 | + where the overhead of multithreading exceeds any possible gains. |
| 35 | + * CSWAP and ZSWAP have been limited to a single thread except on ARMV8 or |
| 36 | + ThunderX hardware with sizable input. |
| 37 | + * Linker flags for the PGI compiler have been updated. |
| 38 | + * Behaviour of AXPY with zero increments is now handled in the C interface, |
| 39 | + correcting the result on at least Intel Atom. |
| 40 | + * The result matrix from calling SGELSS with an all-zero input matrix is |
| 41 | + now zeroed completely. |
| 42 | + |
| 43 | +x86_64: |
| 44 | + * Autodetection of AMD Ryzen2 has been fixed (again). |
| 45 | + * CMAKE builds now support labeling of an INTERFACE64=1 build of |
| 46 | + the library with the _64 suffix. |
| 47 | + * An AVX512 version of DGEMM has been added and the AVX512 SGEMM kernel |
| 48 | + has been sped up by rewriting it with C intrinsics. |
| 49 | + * Fixed compilation on RHEL5/CENTOS5 (issue with typename __WAIT_STATUS) |
| 50 | + |
| 51 | +POWER: |
| 52 | + * Added support for building on AIX (with gcc and GNU tools from AIX Toolbox). |
| 53 | + * CPU type detection has been implemented for AIX. |
| 54 | + * CPU type detection has been fixed for NETBSD. |
| 55 | + |
| 56 | +MIPS64: |
| 57 | + * AXPY on LOONGSON3A has been corrected to pass "zero increment" utest. |
| 58 | + * DSDOT on LOONGSON3A has been fixed. |
| 59 | + * The SGEMM microkernel has been hardened against potential data loss. |
| 60 | + |
| 61 | +ARMV8: |
| 62 | + * DYNAMIC_ARCH support is now available for 64bit ARM. |
| 63 | + * Cross-compiling for ARMV8 under iOS now works. |
| 64 | + * CPU-specific code has been rearranged to make better use of both |
| 65 | + hardware commonalities and model-specific compiler optimizations. |
| 66 | + * XGENE1 has been removed as a TARGET, superseded by the improved generic |
| 67 | + ARMV8 support. |
| 68 | + |
| 69 | +ARMV7: |
| 70 | + * Older assembly mnemonics have been converted to UAL form to allow |
| 71 | + building with clang 7.0. |
| 72 | + * Cross compiling LAPACKE for Android has been fixed again (broken by |
| 73 | + update to LAPACK 3.7.0 some while ago). |
| 74 | + |
| 2 | 75 | ==================================================================== |
| 3 | 76 | Version 0.3.3 |
| 4 | 77 | 31-Aug-2018 |
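The 0.3.4 entry on AXPY with zero increments is easier to follow with the reference semantics written out. The following is a minimal sketch, assuming the element-by-element loop of the reference BLAS AXPY; `ref_daxpy` is a hypothetical helper written for illustration and is not OpenBLAS code. With `incy = 0`, every iteration updates the same element of `y`, so the contributions accumulate there, which is a case an optimized vector kernel can easily get wrong.

```c
#include <stddef.h>

/* Hypothetical illustration, not the OpenBLAS implementation:
 * computes y := alpha * x + y one element at a time, following the
 * loop structure of the reference BLAS. Zero increments are legal and
 * simply revisit the same element on every iteration. */
void ref_daxpy(int n, double alpha, const double *x, int incx,
               double *y, int incy)
{
    if (n <= 0 || alpha == 0.0)
        return;

    /* Negative increments walk the vector backwards, as in reference BLAS. */
    ptrdiff_t ix = incx < 0 ? (ptrdiff_t)(1 - n) * incx : 0;
    ptrdiff_t iy = incy < 0 ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int i = 0; i < n; i++) {
        y[iy] += alpha * x[ix];
        ix += incx;
        iy += incy;
    }
}
```

Under these assumptions, calling `ref_daxpy(3, 0.5, x, 0, y, 0)` with `x[0] = 2.0` and `y[0] = 1.0` adds `0.5 * 2.0` to `y[0]` three times, leaving `y[0] = 4.0`.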