Ingest FP8 attn scales and use them in ROCm FlashAttention #1185
Annotations
2 errors and 1 warning
vllm/model_executor/models/llama.py#L235
vllm/model_executor/models/llama.py:235:81: E501 Line too long (81 > 80)
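The E501 annotation reports only the length violation; the offending source line itself is not included in this log. As a hedged illustration of the usual fix, assuming the flagged line is an ordinary over-long expression (all names below are placeholders, not the actual llama.py code), wrapping it in parentheses satisfies the 80-column limit:

```python
# Hypothetical sketch -- the actual source at llama.py:235 is not shown in
# this log, so the names below are placeholders, not vLLM code.
quant_config = None  # stand-in object so the snippet runs on its own

# Before (a single 81+ character line would trip E501):
# attn_scales = getattr(quant_config, "fp8_attn_scales", None) if quant_config else None

# After: parenthesized continuation keeps every line under 80 columns.
attn_scales = (
    getattr(quant_config, "fp8_attn_scales", None)
    if quant_config
    else None
)
print(attn_scales)  # prints None with the stand-in config
```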
ubuntu-latest pipelines will use ubuntu-24.04 soon. For more details, see https://github.com/actions/runner-images/issues/10636
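The runner notice is informational: jobs targeting ubuntu-latest will be migrated to ubuntu-24.04. If this job should stay on a fixed image rather than track ubuntu-latest, the workflow can pin the version explicitly. A minimal sketch, assuming a hypothetical workflow file such as .github/workflows/ci.yml (the repo's actual workflow name and job layout are not shown here):

```yaml
# Hypothetical excerpt -- the real workflow for this job is not shown in
# the annotation. Pinning runs-on avoids behavior changes when GitHub
# remaps ubuntu-latest to ubuntu-24.04.
jobs:
  lint:
    runs-on: ubuntu-22.04  # pinned instead of ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: pip install ruff && ruff check .
```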
This job failed