Issues: vllm-project/vllm
- [Bug]: vllm 0.8.4 start with using ray, and ray's dashboard fails to start (bug) #16779, opened Apr 17, 2025 by ying2025
- [Bug]: Could't deploy c4ai-command-a-03-2025 with VLLM docker (bug) #16777, opened Apr 17, 2025 by mru4913
- [Usage]: [offline inference] How to get a stream response with tools and how to implement the "parallel_tool_calls" parameter (usage) #16775, opened Apr 17, 2025 by konglykly
- [Bug]: Invalid Mistral ChatCompletionRequest Body Exception (bug) #16774, opened Apr 17, 2025 by JasmondL
- [Bug]: vllm stopped at vLLM is using nccl==2.21.5 (bug) #16772, opened Apr 17, 2025 by WanianXO
- [Usage]: How to configure the server parameters for THUDM/GLM-4-32B-0414 to support Function call using vllm-0.8.4? (usage) #16771, opened Apr 17, 2025 by jifa513
- [Bug]: vllm-v0.7.3 V0 engine TP=16 serve DeepSeek-R1 Crash while inference (bug) #16766, opened Apr 17, 2025 by handsome-chips
- [Feature]: VLLM does not support inference for the dora-fine-tuned R1-distill-qwen large model. (feature request) #16764, opened Apr 17, 2025 by HelloWorldMan-git
- [Bug]: qwen2.5-vl inference truncated (bug) #16763, opened Apr 17, 2025 by vivian-chen010
- [Usage]: vLLM fails with NCCL invalid usage error when serving model on multi-GPU (usage) #16761, opened Apr 17, 2025 by whfeLingYu
- [Usage]: Wrong context length for Qwen2.5-7B-Instruct? (usage) #16757, opened Apr 17, 2025 by tjoymeed
- [Bug]: InternVL3-78B OOM on 4 A100 40G in 0.8.4 (bug) #16749, opened Apr 17, 2025 by hanggun
- [Feature]: AMD Ryzen AI NPU support (feature request) #16742, opened Apr 16, 2025 by InspiringCode
- [Bug]: GuidedDecodingParams choice - Request-level structured output backend must match engine-level backend (bug) #16738, opened Apr 16, 2025 by nrv
- [Feature]: return graceful inference text input validation errors as part of output (without throwing an exception) - to enable skipping / handling bad examples after the processing of good ones (feature request) #16732, opened Apr 16, 2025 by vadimkantorov
- [Bug]: With --cpu-offload-gb, deepseek-moe-16b-chat got different response, even if the temperature is zero. (bug) #16731, opened Apr 16, 2025 by YenFuLin
- [Usage]: VLLM>0.8 also met No platform detected, vLLM is running on UnspecifiedPlatform (usage) #16724, opened Apr 16, 2025 by rainays
- [Bug]: Remove fallback to outlines for int/number range and pattern constraints in guided_json (bug) #16723, opened Apr 16, 2025 by csy1204
- [Usage]: I want to learn a hack method to offload and load vllm internal weights between CPU and GPU (usage) #16722, opened Apr 16, 2025 by vagitablebirdcode
- [Bug]: Vllm serve's results is not equal to offline inference. (bug) #16718, opened Apr 16, 2025 by tzjtatata
- [Bug]: vllm 0.8.3 v1 engine CUDA Graph Capturing time is too long (bug) #16716, opened Apr 16, 2025 by sjtu-zwh
- [Installation]: Kimi-VL-A3B failed to be deployed using vllm mirroring (installation) #16715, opened Apr 16, 2025 by nigthDust
- [Bug]: Loading bnb-community/Llama-4-Scout-17B-16E-Instruct-bnb-4bit error FusedMoE quant_method is None (bug) #16713, opened Apr 16, 2025 by fahadh4ilyas
- [Bug]: meta-llama/Llama-4-Scout-17B-16E-Instruct not supported on VLLM, 2xH100 NVL 196GB (bug) #16712, opened Apr 16, 2025 by bernardgut
- [Usage]: "How can I register Logits Processors in the args?" (usage) #16709, opened Apr 16, 2025 by hjlee1995