TODO: Also see the all_reduce benchmark at https://github.com/stas00/ml-engineering/tree/master/network/benchmarks#all_reduce-benchmark. Re-run it and update the plot and table, since the benchmark switched to binary units (KiB/MiB/etc.).