Testing and Benchmarking
Basic unit tests can be executed via the following command:
$ make check
This runs the unit test suite in the same environment where the make command was invoked and prints a summary when done:
PASS: deque
PASS: freelist
PASS: msgbuff
PASS: show_tuner_decisions
PASS: scheduler
PASS: idpool
PASS: ep_addr_list
PASS: mr
============================================================================
Testsuite summary for aws-ofi-nccl GitHub-dev
============================================================================
# TOTAL: 8
# PASS: 8
# SKIP: 0
# XFAIL: 0
# FAIL: 0
# XPASS: 0
# ERROR: 0
============================================================================
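When `make check` runs from a script or CI job, it can be useful to inspect the summary programmatically. The following is a minimal sketch, assuming the output was captured to a file. The function name `summary_ok` and the file name `check.log` are hypothetical and not part of the plugin. An Automake-style `make check` normally exits non-zero on failure already, so treat this as a secondary check only.

```shell
# Hypothetical helper: read a `make check` summary on stdin and succeed
# only if both the FAIL and ERROR counters are present and equal to zero.
summary_ok() {
    awk '
        /^# FAIL:/  { fail = $3; seen_fail = 1 }
        /^# ERROR:/ { error = $3; seen_error = 1 }
        END { exit !(seen_fail && seen_error && fail == 0 && error == 0) }
    '
}

# Usage (check.log is an assumed file name):
#   make check 2>&1 | tee check.log
#   summary_ok < check.log && echo "unit tests passed"
```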
Running the plugin functional tests requires a working MPI installation and an MPI setup between the communicating hosts. To install MPI, you can use the standard packages provided for your Linux distribution. Once MPI is set up, you can run any test of your choice with a command like the one below.
mpirun -n 2 --host <host-1>,<host-2> $INSTALL_PREFIX/bin/nccl_message_transfer
Note: All tests except ring.c require exactly 2 MPI ranks to run.
To run collective benchmark tests with the aws-ofi-nccl plugin, you can follow the instructions below.
- Clone the repository
git clone https://github.com/NVIDIA/nccl-tests.git
- Build the tests
cd nccl-tests/
make MPI=1 MPI_HOME=/path/to/mpi CUDA_HOME=/path/to/cuda NCCL_HOME=/path/to/nccl
- Run perf tests
NCCL_DEBUG=INFO mpirun -np 2 --bind-to none build/all_reduce_perf -b 8 -f 2 -e 32M -c 1 -g 1
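In the command above, going by the nccl-tests option descriptions, `-b` and `-e` give the minimum and maximum message sizes and `-f` gives the multiplication factor between sizes, so the run sweeps from 8 bytes up to 32M, doubling at each step. As a rough illustration of that sweep (the function `sweep_sizes` is hypothetical and not part of nccl-tests), the sizes can be enumerated like this:

```shell
# Hypothetical helper: print each size from MIN to MAX (in bytes),
# multiplying by FACTOR at every step. FACTOR must be greater than 1.
sweep_sizes() {
    size=$1; max=$2; factor=$3
    while [ "$size" -le "$max" ]; do
        echo "$size"
        size=$((size * factor))
    done
}

# Assumption: 32M is taken to mean 32 * 1024 * 1024 bytes.
sweep_sizes 8 $((32 * 1024 * 1024)) 2
```

Under that assumption the sweep covers 23 message sizes, from 8 to 33554432 bytes.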
If you installed the AWS libfabric plugin in a custom prefix, ensure LD_LIBRARY_PATH is set to include that prefix so the perf test binaries can find the plugin.
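For example, here is a minimal sketch that assumes the plugin's shared library was installed under `$INSTALL_PREFIX/lib`. The `lib` subdirectory is an assumption, so adjust it to wherever your build placed the library. The helper name `prepend_lib_path` is hypothetical.

```shell
# Hypothetical helper: prepend a directory to LD_LIBRARY_PATH without
# leaving a stray ':' when the variable is empty or unset.
prepend_lib_path() {
    LD_LIBRARY_PATH="$1${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
    export LD_LIBRARY_PATH
}

# Assumption: the plugin's shared library lives in $INSTALL_PREFIX/lib.
prepend_lib_path "${INSTALL_PREFIX:-/path/to/prefix}/lib"
```

If you launch with Open MPI, passing `-x LD_LIBRARY_PATH` to `mpirun` forwards the variable to the remote ranks. Other MPI implementations may use a different mechanism.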