DLA-Future 0.4.0
Changes:
- Modified `CommunicatorGrid` to avoid blocking calls to `MPI_Comm_dup`. It now returns communicator pipelines. (#993)
- Added support for Intel oneMKL and the `intel-oneapi-mkl` spack package. (#1073) (*)
Performance improvements:
- Reduced the size of the matrix-matrix multiplications in the tridiagonal eigensolver to cover only the non-deflated part of the eigenvectors. (#951 #967 #996 #997 #998)
- Introduced stackless threads where appropriate. (#1037)
Bug fixes:
- Use `drop_operation_state` to avoid stack overflows. (#1004)
Notes:
(*) At the time of the release, the spack spec `blaspp~openmp ^intel-oneapi-mkl threads=openmp` doesn't build. If you rely on multithreaded BLAS, we suggest using `blaspp+openmp ^intel-oneapi-mkl threads=openmp` until spack/spack#42087 gets merged.
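
As an illustration of the workaround above, a spack environment file could pin the suggested variants. This is a minimal sketch, not an official recipe: it assumes the release is available in spack as `dla-future@0.4.0` and that `blaspp` appears in its dependency tree. Only the `blaspp+openmp ^intel-oneapi-mkl threads=openmp` part comes from the note itself.

```yaml
# spack.yaml (hypothetical environment file)
spack:
  specs:
  # Assumption: package name and version as published in spack.
  # blaspp+openmp works around the failing blaspp~openmp build
  # until spack/spack#42087 is merged.
  - dla-future@0.4.0 ^blaspp+openmp ^intel-oneapi-mkl threads=openmp
```

Running `spack spec` inside the environment before installing may help confirm that the concretized spec uses the intended `blaspp` and `intel-oneapi-mkl` variants.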