Issues: vllm-project/vllm
- [Bug]: vllm 0.8.4 start with using ray, and ray's dashboard fails to start (bug) #16779, opened Apr 17, 2025 by ying2025
- [Bug]: Could't deploy c4ai-command-a-03-2025 with VLLM docker (bug) #16777, opened Apr 17, 2025 by mru4913
- [Usage]: [offline inference] How to get a stream response with tools and how to implement the "parallel_tool_calls" parameter (usage) #16775, opened Apr 17, 2025 by konglykly
- [Bug]: Invalid Mistral ChatCompletionRequest Body Exception (bug) #16774, opened Apr 17, 2025 by JasmondL
- [Bug]: vllm stopped at vLLM is using nccl==2.21.5 (bug) #16772, opened Apr 17, 2025 by WanianXO
- [Usage]: How to configure the server parameters for THUDM/GLM-4-32B-0414 to support Function call using vllm-0.8.4? (usage) #16771, opened Apr 17, 2025 by jifa513
- [Bug]: vllm-v0.7.3 V0 engine TP=16 serve DeepSeek-R1 Crash while inference (bug) #16766, opened Apr 17, 2025 by handsome-chips
- [Feature]: VLLM does not support inference for the dora-fine-tuned R1-distill-qwen large model. (feature request) #16764, opened Apr 17, 2025 by HelloWorldMan-git
- [Bug]: qwen2.5-vl inference truncated (bug) #16763, opened Apr 17, 2025 by vivian-chen010
- [Usage]: vLLM fails with NCCL invalid usage error when serving model on multi-GPU (usage) #16761, opened Apr 17, 2025 by whfeLingYu
- [Usage]: Wrong context length for Qwen2.5-7B-Instruct? (usage) #16757, opened Apr 17, 2025 by tjoymeed
- [Bug]: InternVL3-78B OOM on 4 A100 40G in 0.8.4 (bug) #16749, opened Apr 17, 2025 by hanggun
- [Feature]: AMD Ryzen AI NPU support (feature request) #16742, opened Apr 16, 2025 by InspiringCode
- [Bug]: GuidedDecodingParams choice - Request-level structured output backend must match engine-level backend (bug) #16738, opened Apr 16, 2025 by nrv
- [Feature]: return graceful inference text input validation errors as part of output (without throwing an exception) - to enable skipping / handling bad examples after the processing of good ones (feature request) #16732, opened Apr 16, 2025 by vadimkantorov
- [Bug]: With --cpu-offload-gb, deepseek-moe-16b-chat got different response, even if the temperature is zero. (bug) #16731, opened Apr 16, 2025 by YenFuLin
- [Usage]: VLLM>0.8 also met No platform detected, vLLM is running on UnspecifiedPlatform (usage) #16724, opened Apr 16, 2025 by rainays
- [Bug]: Remove fallback to outlines for int/number range and pattern constraints in guided_json (bug) #16723, opened Apr 16, 2025 by csy1204
- [Usage]: I want to learn a hack method to offload and load vllm internal weights between CPU and GPU (usage) #16722, opened Apr 16, 2025 by vagitablebirdcode
- [Bug]: Vllm serve's results is not equal to offline inference. (bug) #16718, opened Apr 16, 2025 by tzjtatata
- [Bug]: vllm 0.8.3 v1 engine CUDA Graph Capturing time is too long (bug) #16716, opened Apr 16, 2025 by sjtu-zwh
- [Installation]: Kimi-VL-A3B failed to be deployed using vllm mirroring (installation) #16715, opened Apr 16, 2025 by nigthDust
- [Bug]: Loading bnb-community/Llama-4-Scout-17B-16E-Instruct-bnb-4bit error FusedMoE quant_method is None (bug) #16713, opened Apr 16, 2025 by fahadh4ilyas
- [Bug]: meta-llama/Llama-4-Scout-17B-16E-Instruct not supported on VLLM, 2xH100 NVL 196GB (bug) #16712, opened Apr 16, 2025 by bernardgut
- [Usage]: "How can I register Logits Processors in the args?" (usage) #16709, opened Apr 16, 2025 by hjlee1995