Improve document (#126)

**Description** Improve the documentation by:

* Adding a Docker option to the installation page.
* Telling users how long compiling MSCCL takes.
Azure · Nov 7, 2023 · 7b346f4
1 parent e3e6885
commit 7b346f4
Showing 2 changed files with 30 additions and 6 deletions.
diff --git a/docs/getting-started/installation.mdx b/docs/getting-started/installation.mdx
@@ -18,23 +18,47 @@ Here're the system requirements for MS-AMP.
 * CUDA version 11 or later (which can be checked by running `nvcc --version`).
 * PyTorch version 1.14 or later (which can be checked by running `python -c "import torch; print(torch.__version__)"`).
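
Reviewer note: the two requirement checks above can be combined into one script. The following is a minimal sketch, not part of this change; the helper names `numeric_version`, `version_ge`, and `report` are hypothetical, and the comparison assumes GNU `sort -V` is available. Missing tools are reported as skipped rather than treated as errors.

```bash
#!/usr/bin/env bash

# Hypothetical helper: strip pre-release/build suffixes, e.g. "2.1.0a0+fe05266" -> "2.1.0".
numeric_version() {
  printf '%s\n' "${1%%[!0-9.]*}"
}

# Hypothetical helper: succeed when version $1 is at least version $2.
# Assumes GNU coreutils `sort -V` (version sort).
version_ge() {
  lowest=$(printf '%s\n%s\n' "$(numeric_version "$1")" "$2" | sort -V | head -n1)
  [ "$lowest" = "$2" ]
}

# Hypothetical helper: report NAME FOUND_VERSION MINIMUM_VERSION.
report() {
  if [ -z "$2" ]; then
    echo "$1: not found, skipped"
  elif version_ge "$2" "$3"; then
    echo "$1 $2: OK (>= $3)"
  else
    echo "$1 $2: older than required $3"
  fi
}

# Same commands as in the requirements list; empty output means the tool is missing.
cuda_version=$(nvcc --version 2>/dev/null | sed -n 's/.*release \([0-9.]*\).*/\1/p')
torch_version=$(python -c 'import torch; print(torch.__version__)' 2>/dev/null)

report CUDA "$cuda_version" 11
report PyTorch "$torch_version" 1.14
```

The script only prints a status line per requirement, so it can be run safely inside or outside the container.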
 
-We strongly recommend using [PyTorch NGC Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch). For example, to start PyTorch 2.1 container, run the following command:
+You can try MS-AMP in two ways: using Docker or installing from source.
+
+* Using Docker is a convenient way to get started: the pre-built image gives you a ready-to-run MS-AMP environment.
+* Installing from source gives you more control over the installation process and lets you customize it to your needs.
+
+## Use Docker
+
+You can try the latest MS-AMP Docker container with the following commands:
+
+```bash
+sudo docker run -it -d --name=msampcu121 --privileged --net=host --ipc=host --gpus=all -v /:/hostroot ghcr.io/azure/msamp:main-cuda12.1 bash
+sudo docker exec -it msampcu121 bash
+```
+
+MS-AMP is pre-installed in the Docker container. You can verify the installation by running:
+
+```bash
+python -c 'import msamp;print(msamp.__version__)'
+```
+
+We also provide stable Docker images [here](../user-tutorial/container-images.mdx). 
+
+## Install from source
+
+We strongly recommend using the [PyTorch NGC Container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/pytorch) to avoid disturbing your local environment.
+For example, to start a PyTorch 2.1 container, run the following commands:
 
 ```bash
 sudo docker run -it -d --name=msamp --privileged --net=host --ipc=host --gpus=all nvcr.io/nvidia/pytorch:23.04-py3 bash
 sudo docker exec -it msamp bash
 ```
 
-## Install MS-AMP
-You can clone the source from GitHub.
+Then clone the source from GitHub:
 
 ```bash
 git clone https://github.com/Azure/MS-AMP.git
 cd MS-AMP
 git submodule update --init --recursive
 ```
 
-If you want to train model with multiple GPU, you need to install MSCCL to support FP8.
+If you want to train models with multiple GPUs, you need to install MSCCL to support FP8. Please note that compiling MSCCL may take ~40 minutes on A100 nodes and ~7 minutes on H100 nodes.
 
 ```bash
 cd third_party/msccl

diff --git a/docs/getting-started/run-msamp.md b/docs/getting-started/run-msamp.md
@@ -14,10 +14,10 @@ After installing MS-AMP, you can run several simple examples using MS-AMP. Pleas
 python mnist.py --enable-msamp --opt-level=O2
 ```
 
-### 2. Run mnist using multi GPUS in single node
+### 2. Run mnist using multiple GPUs on a single node
 
 ```bash
-torchrun --nproc_per_node=$GPUS mnist_ddp.py --enable-msamp --opt-level=O2
+torchrun --nproc_per_node=8 mnist_ddp.py --enable-msamp --opt-level=O2
 ```
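
Reviewer note: the new command hardcodes 8 processes, which only matches nodes with exactly 8 GPUs. A minimal sketch of deriving the count at run time is shown below; it is not part of this change. The helpers `count_gpus` and `launch_cmd` are hypothetical, and the sketch assumes `nvidia-smi -L` prints one line starting with `GPU ` per device. It only prints the command so it can be inspected first.

```bash
#!/usr/bin/env bash

# Hypothetical helper: count visible GPUs; falls back to 1 when nvidia-smi
# is missing or reports no devices.
count_gpus() {
  n=0
  if command -v nvidia-smi >/dev/null 2>&1; then
    n=$(nvidia-smi -L 2>/dev/null | grep -c '^GPU ')
  fi
  [ "$n" -gt 0 ] || n=1
  echo "$n"
}

# Hypothetical helper: build the launch command for a given process count.
launch_cmd() {
  echo "torchrun --nproc_per_node=$1 mnist_ddp.py --enable-msamp --opt-level=O2"
}

# An explicitly exported GPUS still wins, as in the command before this change.
GPUS="${GPUS:-$(count_gpus)}"
launch_cmd "$GPUS"
```

To actually launch training, run the printed command (for example `eval "$(launch_cmd "$GPUS")"`). Recent `torchrun` versions also accept `--nproc_per_node=gpu`, which may make the helper unnecessary.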
 
 ## CIFAR10