1 file changed, +5 -6 lines changed

@@ -10,11 +10,11 @@ This documentation shows some reference performance numbers and the steps to rep
 
 It includes:
 
- - ROCm™ 6.2.2
+ - ROCm™ 6.3
 
 - vLLM 0.6.3
 
- - PyTorch 2.5dev (nightly)
+ - PyTorch 2.6dev (nightly)
 
 ## System configuration
 
@@ -23,7 +23,7 @@ The performance data below was measured on a server with MI300X accelerators wit
 | System | MI300X with 8 GPUs |
 | ---| ---|
 | BKC | 24.13 |
-| ROCm | version ROCm 6.2.2 |
+| ROCm | version ROCm 6.3 |
 | amdgpu | build 2009461 |
 | OS | Ubuntu 22.04 |
 | Linux Kernel | 5.15.0-117-generic |
@@ -45,9 +45,8 @@ You can pull the image with `docker pull rocm/vllm-dev:main`
 
 ### What is New
 
- - MoE optimizations for Mixtral 8x22B, FP16
- - Llama 3.2 stability improvements
- - Llama 3.3 support
+ - ROCm 6.3 support
+ - Potential bug with Tunable Ops not saving due to a PyTorch issue
 
 
 Gemms are tuned using PyTorch's Tunable Ops feature (https://github.com/pytorch/pytorch/blob/main/aten/src/ATen/cuda/tunable/README.md)
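
A minimal sketch of how Tunable Ops is typically switched on, using the environment variables documented in the PyTorch README linked above; the workload itself is illustrative and not taken from this diff:

```bash
# Turn on TunableOp so eligible GEMMs are tuned at runtime.
export PYTORCH_TUNABLEOP_ENABLED=1
# File where tuning results are written out; this save step is what the
# "Potential bug with Tunable Ops not saving" note above refers to.
export PYTORCH_TUNABLEOP_FILENAME=/tmp/tunableop_results.csv

# Any GEMM-heavy PyTorch run triggers tuning; a toy matmul as an example
# (on ROCm builds of PyTorch, the device string is still 'cuda'):
python3 -c "import torch; a = torch.randn(1024, 1024, device='cuda'); print((a @ a).sum().item())"
```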
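
Separately, the last hunk's header references pulling the development image. A short launch sketch follows; the `docker pull` command is quoted from the diff context, while the `docker run` device flags are standard ROCm container options assumed here, not part of this diff:

```bash
# Pull the image named in the hunk header above.
docker pull rocm/vllm-dev:main

# Standard ROCm GPU passthrough (an assumption, not from this diff):
# /dev/kfd and /dev/dri expose the accelerators to the container.
docker run -it --device=/dev/kfd --device=/dev/dri --group-add video \
    rocm/vllm-dev:main
```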