Releases: openucx/ucx
Releases · openucx/ucx
v1.3.0 RC4
Changelog:
- Fixes for gcc8 compilation
- Fix missing initialization of rndv_send_nbr thresholds
- Fix mlx5 srq cleanup
- Fix ep info print when there is no wireup lane
- Optimize ugni locking
v1.3.0 RC3
- Fix compilation issue with mlx5 on ARM
- Disable GDR-copy when ODP is used
v1.3.0 RC2
Bugfixes:
- Fix flow control for DC transport
1.3.0 - RC1
Features:
- Added stream-based communication API to UCP
- Added support for GPU platforms: Nvidia CUDA and AMD ROCM software stacks
- Added API for client/server based connection establishment
- Added support for TCP transport
- Support for InfiniBand tag-matching offload for DC and accelerated transports
- Multi-rail support for eager and rendezvous protocols
- Added support for tag-matching communications with CUDA buffers
- Added ucp_rkey_ptr() to obtain pointer for shared memory region
- Avoid progress overhead on unused transports
- Improved scalability of software tag-matching by using a hash table
- Added transparent huge-pages allocator
- Added non-blocking flush and disconnect for UCP
- Support fixed-address memory allocation via ucp_mem_map()
- Added ucp_tag_send_nbr() API to avoid send request allocation
- Support global addressing in all IB transports
- Add support for external epoll fd and edge-triggered events
- Added registration cache for knem
- Initial support for Java bindings
Bugfixes:
- Multiple bugfixes (full list on githib)
Tested configurations: - InfiniBand: MLNX_OFED 4.2, inbox OFED drivers.
- CUDA: gdrcopy 1.2, cuda 9.1.85
- XPMEM: 2.6.2
- KNEM: 1.1.2
Known issues:
#2047 - UCP: ucp_do_am_bcopy_multi drops data on UCS_ERROR_NO_RESOURCE
#2047 - failure in ud/uct_flush_test.am_zcopy_flush_ep_nb/1
#1977 - failure in shm/test_ucp_rma.blocking_small/0
#1926 - Timeout in mpi_test_suite with HW TM
#1920 - transport retry count exceeded in many-to-one tests
#1689 - Segmentation fault on memory hooks test in jenkins
v1.2.2
Main:
- Support including UCX API headers from C++ code
- UD transport to handle unicast flood on RoCE fabric
- Compilation fixes for gcc 7.1.1, clang 3.6, clang 5
Details:
- When UD transport is used with RoCE, packets intended for other peers may
arrive on different adapters (as a result of unicast flooding). - This change adds packet filtering based on destination GIDs. Now the packet
is silently dropped, if its destination GID does not match the local GID. - Added a new device ID for InfiniBand HCA
- [packaging] Move
examples/
andperftest/
into doc - [packaging] Update spec to work on old distros while complaint with Fedora
guidelines - [cleanup] Removed unused ptmalloc version (2.83)
- [cleanup] Fixup license headers
v1.2.2 RC1
Main:
- Support including UCX API headers from C++ code
- UD transport to handle unicast flood on RoCE fabric
- Compilation fixes for gcc 7.1.1 and clang 3.6
Details:
- When UD transport is used with RoCE, packets intended for other peers may
arrive on different adapters (as a result of unicast flooding). - This change adds packet filtering based on destination GIDs. Now the packet
is silently dropped, if its destination GID does not match the local GID. - [packaging] Move
examples/
andperftest/
into doc - [packaging] Update spec to work on old distros while complaint with Fedora
guidelines - [cleanup] Removed unused ptmalloc version (2.83)
- [cleanup] Fixup license headers
v1.2.1
v1.2.0
1.2.0 (June 15, 2017)
Supported platforms
- Shared memory: KNEM, CMA, XPMEM, SYSV, Posix
- VERBs over InfiniBand and RoCE.
VERBS over other RDMA interconnects (iWarp, OmniPath, etc.) is available
for community evaluation and has not been tested in context of this release - Cray Gemini and Aries
- Architectures: x86_64, ARMv8 (64bit), Power64
Features:
- Added support for InfiniBand DC and UD transports, including accelerated verbs for Mellanox devices
- Full support for PGAS/SHMEM interfaces, blocking and non-blocking APIs
- Support for MPI tag matching, both in software and offload mode
- Zero copy protocols and rendezvous, registration cache
- Handling transport errors
- Flow control for DC/RC
- Dataypes support: contiguous, IOV, generic
- Multi-threading support
- Support for ARMv8 64bit architecture
- A new API for efficient memory polling
- Support for malloc-hooks and memory registration caching
Bugfixes:
- Multiple bugfixes improving overall stability of the library
Known issues:
- #1604 - Failure in ud/test_ud_slow_timer.retransmit1/1 with valgrind bug
- #1588 - Fix reading cpuinfo timebase for ppc bug portability training
- #1579 - Ud/test_ud.ca_md test takes too long too complete bug
- #1576 - Failure in ud/test_ud_slow_timer.retransmit1/0 with valgrind bug
- #1569 - Send completion with error with dc_verbs bug
- #1566 - Segfault in malloc_hook.fork on arm bug
- #1565 - Hang in udrc/test_ucp_rma.nonblocking_stream_get_nbi_flush_worker bug
- #1534 - Wireup.c:473 Fatal: endpoint reconfiguration not supported yet bug
- #1533 - Stack overflow under Valgrind 'rc_mlx5/uct_p2p_err_test.local_access_error/0' bug
- #1513 - Hang in MPI_Finalize with UCX_TLS=rc[_x],sm on the bsend2 test bug
- #1504 - Failure in cm/uct_p2p_am_test.am_bcopy/1 bug
- #1492 - Hang when using polling fd bug
- #1489 - Hang on the osu_fop_latency test with RoCE bug
- #1005 - ROcE problem with OMPI direct modex - UD assertion
1.1.0 (September 1, 2015)
Workarounds:
Features:
- Added support for AM based on FIFO in
mm
shared memory transport - Added support for UCT
knem
shared memory transport (http://knem.gforge.inria.fr) - Added support for UCT
mm/xpmem
shared memory transport (https://github.com/hjelmn/xpmem)
Bugfixes:
Known issues:
1.0.0 (July 22, 2015)
Features:
- Added support for UCT
cma
shared memory transport (Cross-Memory Attatch) - Added support for UCT
mm
shared memory transport with mmap/sysv APIs - Added support for UCT
rc
transport based on Infiniband/RC with verbs - Added support for UCT
mlx5_rc
transport based on Infiniband/RC with accelerated verbs - Added support for UCT
cm
transport based on Infiniband/SIDR (Service ID Resolution) - Added support for UCT
ugni
transport based on Cray/UGNI - Added support for Doxygen based documentation generation
- Added support for UCP basic protocol layer to fit PGAS paradigm (RMA, AMO)
- Added ucx_perftest utility to exercise major UCX flows and provide performance metrics
- Added test script for jenkins (contrib/test_jenkins.sh)
- Added packaging for RPM/DEB based linux distributions (see contrib/buildrpm.sh)
- Added Unit-tests infractucture for UCX functionality based on Google Test framework (see test/gtest/)
- Added initial integration for OpenMPI with UCX for PGAS/SHMEM API
(see: https://github.com/openucx/ompi-mirror/pull/1) - Added end-to-end testing infrastructure based on MTT (see contrib/mtt/README_MTT)