dependabot bot commented on behalf of GitHub on Jan 28, 2026

Bumps vllm from 0.6.1.post1 to 0.14.1.
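
In a requirements.txt-style pin (the file name and pin syntax in this repo are assumptions; the versions come from this PR), the change amounts to:

  Before: vllm==0.6.1.post1
  After:  vllm==0.14.1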

Release notes

Sourced from vllm's releases.

v0.14.1

This is a patch release on top of v0.14.0 that addresses a few security issues and memory leaks.

v0.14.0

Highlights

This release features approximately 660 commits from 251 contributors (86 new contributors).

Breaking Changes:

  • Async scheduling is now enabled by default - Users who experience issues can disable it with --no-async-scheduling (see the Python sketch after this list).
    • Excludes some not-yet-supported configurations: pipeline parallel, CPU backend, non-MTP/Eagle spec decoding.
  • PyTorch 2.9.1 is now required, and the default wheel is compiled against CUDA 12.9 (cu129).
  • Deprecated quantization schemes have been removed (#31688, #31285).
  • When using speculative decoding, unsupported sampling parameters will fail rather than being silently ignored (#31982).
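
The new async-scheduling default can be reverted at engine construction. A minimal sketch, assuming the Python-side engine argument is named async_scheduling (the release notes only confirm the --no-async-scheduling CLI flag; the keyword below is an assumption):

  # Sketch: opting out of the new async-scheduling default (v0.14.0+).
  # --no-async-scheduling is the documented CLI switch; the keyword
  # argument here is an assumed Python-side equivalent.
  from vllm import LLM, SamplingParams

  llm = LLM(
      model="facebook/opt-125m",   # placeholder model
      async_scheduling=False,      # assumption: mirrors --no-async-scheduling
  )
  outputs = llm.generate(["Hello, my name is"], SamplingParams(max_tokens=16))
  print(outputs[0].outputs[0].text)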

Key Improvements:

  • Async scheduling enabled by default (#27614): Overlaps engine core scheduling with GPU execution, improving throughput without user configuration. Now also works with speculative decoding (#31998) and structured outputs (#29821).
  • gRPC server entrypoint (#30190): An alternative to the REST API with a binary protocol and HTTP/2 multiplexing.
  • --max-model-len auto (#29431): Automatically fits context length to available GPU memory, eliminating OOM startup failures.
  • Model inspection view (#29450): View your model's modules, attention backends, and quantization in vLLM by setting VLLM_LOG_MODEL_INSPECTION=1 or by simply printing the LLM object (a short example follows this list).
  • Model Runner V2 enhancements: UVA block tables (#31965), M-RoPE (#32143), logit_bias/allowed_token_ids/min_tokens support (#32163).
    • Please note that Model Runner V2 is still experimental and disabled by default.
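
The inspection view called out above can be triggered from a few lines of Python; the environment variable and the print behavior are taken from the release notes, while the model name is a placeholder:

  import os
  # Set before engine construction so inspection logging takes effect.
  os.environ["VLLM_LOG_MODEL_INSPECTION"] = "1"

  from vllm import LLM

  llm = LLM(model="facebook/opt-125m")  # placeholder model
  # Per the release notes, printing the LLM object renders the inspection
  # view: modules, attention backends, and quantization configuration.
  print(llm)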

Model Support

New Model Architectures:

LoRA Support Expansion:

Model Enhancements:

  • Qwen3-VL as reranker (#31890)
  • DeepSeek v3.2 chat prefix completion (#31147)
  • GLM-4.5/GLM-4.7 enable_thinking: false (#31788)
  • Ernie4.5-VL video timestamps (#31274)
  • Score template expansion (#31335)
  • LLaMa4 vision encoder compilation (#30709)

... (truncated)

Commits
  • d7de043 [CI] fix version comparsion and exclusion patterns in upload-release-wheels.s...
  • 4dc11b0 [Bugfix] Fix Whisper/encoder-decoder GPU memory leak (#32789)
  • 2bd95d8 [Misc] Bump opencv-python dependecy version to 4.13 (#32668)
  • f46d576 [Misc] Replace urllib's urlparse with urllib3's parse_url (#32746)
  • d682094 [build] fix cu130 related release pipeline steps and publish as nightly image...
  • b17039b [CI] Implement uploading to PyPI and GitHub in the release pipeline, enable r...
  • 48b67ba [Frontend] Standardize use of create_error_response (#32319)
  • 09f4264 [Bugfix] Fix ROCm dockerfiles (#32447)
  • 7f42dc2 [CI] Fix LM Eval Large Models (H100) (#32423)
  • c2a37a3 Cherry pick [ROCm] [CI] [Release] Rocm wheel pipeline with sccache #32264
  • Additional commits viewable in compare view

Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot merge will merge this PR after your CI passes on it
  • @dependabot squash and merge will squash and merge this PR after your CI passes on it
  • @dependabot cancel merge will cancel a previously requested merge and block automerging
  • @dependabot reopen will reopen this PR if it is closed
  • @dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

You can disable automated security fix PRs for this repo from the Security Alerts page.

Bumps [vllm](https://github.com/vllm-project/vllm) from 0.6.1.post1 to 0.14.1.
- [Release notes](https://github.com/vllm-project/vllm/releases)
- [Changelog](https://github.com/vllm-project/vllm/blob/main/RELEASE.md)
- [Commits](vllm-project/vllm@v0.6.1.post1...v0.14.1)

---
updated-dependencies:
- dependency-name: vllm
  dependency-version: 0.14.1
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
dependabot bot added the dependencies and python labels on Jan 28, 2026