ThunderFX is slower than torch.compile for vicuna-33b-v1.3, Platypus-30B and falcon-40b with FSDP & zero3. #1366

mpatel31415 · 2024-10-30T08:54:54Z

🐛 Bug

To Reproduce

Steps to reproduce the behavior:

Please use:
2 nodes, each with 8 GPUs.
Image "INTERNAL_IMAGE:20241025"
Training script:

python /opt/pytorch/lightning-thunder/thunder/benchmarks/benchmark_litgpt.py \
    --model_name vicuna-33b-v1.3 \
    --distributed_mode fsdp \
    --shard_mode zero3 \
    --compile dynamo_thunder \
    --checkpoint_activations False \
    --low_precision_mode none \
    --micro_batch_size 2
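
The report gives only the ThunderFX command, so the torch.compile side of the comparison has to be reconstructed. Below is a minimal sketch that builds both command lines from one set of flags so that they differ only in `--compile`. The value `inductor` for the torch.compile baseline is an assumption and is not stated in this report. The helper name `build_command` is hypothetical. The multi-node launcher is left out because the report does not give it.

```python
import shlex

# Script path copied from the report.
BENCHMARK = "/opt/pytorch/lightning-thunder/thunder/benchmarks/benchmark_litgpt.py"

# Flags copied from the report; only --compile is varied below.
COMMON_FLAGS = {
    "--model_name": "vicuna-33b-v1.3",
    "--distributed_mode": "fsdp",
    "--shard_mode": "zero3",
    "--checkpoint_activations": "False",
    "--low_precision_mode": "none",
    "--micro_batch_size": "2",
}


def build_command(compile_mode):
    """Return the benchmark command as an argument list for one compile mode."""
    cmd = ["python", BENCHMARK]
    for flag, value in COMMON_FLAGS.items():
        cmd += [flag, value]
    cmd += ["--compile", compile_mode]
    return cmd


# "dynamo_thunder" is the value used in the report (ThunderFX).
thunderfx_cmd = build_command("dynamo_thunder")
# ASSUMPTION: "inductor" selects the torch.compile baseline; not confirmed here.
baseline_cmd = build_command("inductor")

if __name__ == "__main__":
    print(shlex.join(thunderfx_cmd))
    print(shlex.join(baseline_cmd))
```

Keeping every other flag identical makes it easier to attribute a throughput difference to the compile backend rather than to a configuration mismatch.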

Environment

system.device_product_name DGXH100
system.gpu_driver_version 535.129.03
libraries.cuda 12.6.2.004
libraries.pip.lightning 2.4.0.dev20240728
libraries.pip.lightning-thunder 0.2.0.dev0
libraries.pip.lightning-utilities 0.11.8
libraries.pip.litgpt 0.4.11
libraries.pip.nvfuser 0.2.20+git85c22a2
libraries.pip.pytorch-lightning 2.4.0
libraries.pip.torch 2.6.0a0+git96b30dc
libraries.pip.torchmetrics 1.5.1
libraries.pip.torchvision 0.19.0a0+d23a6e1

mpatel31415 · 2024-11-12T16:06:12Z

Recent comparison to torch.compile. There are some new models:

nvMelissa added the "mixology" label (Issues that the mixology team has surfaced) Nov 4, 2024

nvMelissa assigned IvanYashchuk Nov 4, 2024

wprazuch mentioned this issue Dec 10, 2024

[Regressions] ThunderFX is slower than 2 weeks ago for 3 models #1534

Open
