185 commits
f5e2b80
Update README.md
yudi3ooo Nov 14, 2025
dd646f0
Update 01-gpu.md
yudi3ooo Nov 17, 2025
631b2c4
Update 02-cpu.md
yudi3ooo Nov 17, 2025
940f142
Update 03-ai-accelerator.md
yudi3ooo Nov 17, 2025
2bca565
Update README.md
yudi3ooo Nov 17, 2025
7af772f
Update 02-quickstart.md
yudi3ooo Nov 17, 2025
4c74b20
Update 04-troubleshooting.md
yudi3ooo Nov 17, 2025
672d14a
Update 05-faq.md
yudi3ooo Nov 17, 2025
370bd60
Update 06-v1-user-guide.md
yudi3ooo Nov 17, 2025
00f2f3c
Update tutorial link for audio language example
kqyang-77 Nov 27, 2025
554aace
Update 02-basic.md
kqyang-77 Nov 27, 2025
216457d
Update 03-chat_with_tools.md
kqyang-77 Nov 27, 2025
9d1fd8d
Update 04-cpu_offload_lmcache.md
kqyang-77 Nov 27, 2025
5ad7efc
Update 05-data_parallel.md
kqyang-77 Nov 27, 2025
9c694a7
Update 06-disaggregated_prefill_lmcache.md
kqyang-77 Nov 27, 2025
82eedad
Update 07-disaggregated_prefill.md
kqyang-77 Nov 27, 2025
2efa086
Update 08-distributed.md
kqyang-77 Nov 27, 2025
1411d47
Update 09-eagle.md
kqyang-77 Nov 27, 2025
9759e55
Update tutorial link for Encoder Decoder Multimodal
kqyang-77 Nov 27, 2025
1afe7c7
Update 11-encoder_decoder.md
kqyang-77 Nov 27, 2025
da0c010
Update 12-llm_engine_example.md
kqyang-77 Nov 27, 2025
91218c2
Update 13-load_sharded_state.md
kqyang-77 Nov 27, 2025
7d36b59
Update 14-lora_with_quantization_inference.md
kqyang-77 Nov 27, 2025
e66f43c
Update 15-mistral-small.md
kqyang-77 Nov 27, 2025
eb91366
Update 16-mlpspeculator.md
kqyang-77 Nov 27, 2025
8f69061
Update 17-multilora_inference.md
kqyang-77 Nov 27, 2025
816ccbc
Update 18-neuron_int8_quantization.md
kqyang-77 Nov 27, 2025
305f508
Update 19-neuron.md
kqyang-77 Nov 27, 2025
bafb3f2
Update 20-openai_batch.md
kqyang-77 Nov 27, 2025
2e5b05f
Update 21-prefix_caching.md
kqyang-77 Nov 27, 2025
a3424e5
Update 21-prefix_caching.md
kqyang-77 Nov 27, 2025
463c4fa
Update 23-profiling_tpu.md
kqyang-77 Nov 27, 2025
f051b70
Update 24-profiling.md
kqyang-77 Nov 27, 2025
350903c
Update 25-reproduciblity.md
kqyang-77 Nov 27, 2025
6e5d140
Update 26-rlhf.md
kqyang-77 Nov 27, 2025
c40c125
Update 27-rlhf_colocate.md
kqyang-77 Nov 27, 2025
f8c6a53
Update tutorial link for Rlhf Utils
kqyang-77 Nov 27, 2025
fb6779c
Update link for online vLLM tutorial
kqyang-77 Nov 27, 2025
595953e
Update tutorial link for Simple Profiling
kqyang-77 Nov 27, 2025
52c7efc
Update 31-structured_outputs.md
kqyang-77 Nov 27, 2025
3c14191
Update link for online vLLM tutorial
kqyang-77 Nov 27, 2025
d5bbbcf
Update tutorial link for TPU example
kqyang-77 Nov 27, 2025
0da2581
Update 34-vision_language.md
kqyang-77 Nov 27, 2025
d5ad571
Update 35-vision_language_embedding.md
kqyang-77 Nov 27, 2025
52d96df
Update 36-vision_language_multi_image.md
kqyang-77 Nov 27, 2025
f0162a6
Update README.md
kqyang-77 Nov 27, 2025
9d063dd
Update 01-api_client.md
kqyang-77 Nov 27, 2025
31ee94c
Update 02-chart-helm.md
kqyang-77 Nov 27, 2025
ea5cba5
Update 03-cohere_rerank_client.md
kqyang-77 Nov 27, 2025
b252e5c
Update 04-disaggregated_prefill.md
kqyang-77 Nov 27, 2025
2362af8
Update tutorial link for Gradio OpenAI chatbot
kqyang-77 Nov 27, 2025
9a13b4a
Update 06-gradio_webserver.md
kqyang-77 Nov 27, 2025
13911df
Update 07-jinaai_rerank_client.md
kqyang-77 Nov 27, 2025
4aa0ce1
Update tutorial link for multi-node serving
kqyang-77 Nov 27, 2025
f3e7a02
Update tutorial link for OpenAI Chat Completion Client
kqyang-77 Nov 27, 2025
1830189
Update 10-openai_chat_completion_client_for_multimodal.md
kqyang-77 Nov 27, 2025
647c40f
Update 11-openai_chat_completion_client_with_tools.md
kqyang-77 Nov 27, 2025
c544cf2
Update tutorial link for OpenAI Chat Completion Client
kqyang-77 Nov 27, 2025
4a7c752
Update 13-openai_chat_completion_structured_outputs.md
kqyang-77 Nov 27, 2025
9db7a84
Update 14-openai_chat_completion_structured_outputs_with_reasoning.md
kqyang-77 Nov 27, 2025
60310f3
Update 15-openai_chat_completion_tool_calls_with_reasoning.md
kqyang-77 Nov 27, 2025
0a8d89c
Update 16-openai_chat_completion_with_reasoning.md
kqyang-77 Nov 27, 2025
318e2db
Update 17-openai_chat_completion_with_reasoning_streaming.md
kqyang-77 Nov 27, 2025
2cf93d6
Update 18-openai_chat_embedding_client_for_multimodal.md
kqyang-77 Nov 27, 2025
89a3f11
Update 19-openai_completion_client.md
kqyang-77 Nov 27, 2025
075b612
Update 20-openai_cross_encoder_score.md
kqyang-77 Nov 27, 2025
61d074a
Update 21-openai_embedding_client.md
kqyang-77 Nov 27, 2025
fa30c11
Update 22-openai_pooling_client.md
kqyang-77 Nov 27, 2025
194bcb3
Update 23-openai_transcription_client.md
kqyang-77 Nov 27, 2025
8568f8b
Update 24-opentelemetry.md
kqyang-77 Nov 27, 2025
3718f5b
Update 25-prometheus_grafana.md
kqyang-77 Nov 27, 2025
06e50ad
Update 26-run_cluster.md
kqyang-77 Nov 27, 2025
e68b744
Update 27-sagemaker-entrypoint.md
kqyang-77 Nov 27, 2025
6c35652
Update README.md
kqyang-77 Nov 27, 2025
8a0bd07
Update 01-logging_configuration.md
kqyang-77 Nov 27, 2025
e6ef38f
Update 02-tensorize_vllm_model.md
kqyang-77 Nov 27, 2025
8940bf9
Update README.md
kqyang-77 Nov 27, 2025
46d9bd5
Update 01-runai_model_streamer.md
kqyang-77 Nov 27, 2025
b4ae0f7
Update 02-tensorizer.md
kqyang-77 Nov 27, 2025
4523895
Update 03-fastsafetensor.md
kqyang-77 Nov 27, 2025
b2447dc
Update README.md
kqyang-77 Nov 27, 2025
27448b8
Update 01-supported_models.md
kqyang-77 Nov 27, 2025
488c543
Update 02-generative_models.md
kqyang-77 Nov 27, 2025
95af61a
Update 03-Pooling Models.md
kqyang-77 Nov 27, 2025
c5f376a
Update 03-bnb.md
kqyang-77 Nov 27, 2025
8cd290a
Update 04-gguf.md
kqyang-77 Nov 27, 2025
cdc753e
Update 05-gptqmodel.md
kqyang-77 Nov 27, 2025
6da2c28
Update 06-int4.md
kqyang-77 Nov 27, 2025
955233f
Update 07-int8.md
kqyang-77 Nov 27, 2025
b1a0442
Update 08-fp8.md
kqyang-77 Nov 27, 2025
89c72f1
Update 09-quantized_kvcache.md
kqyang-77 Nov 27, 2025
e9f1bcd
Update 10-TorchAO.md
kqyang-77 Nov 27, 2025
2e21be9
Update README.md
kqyang-77 Nov 27, 2025
aef48a4
Update 02-lora.md
kqyang-77 Nov 27, 2025
693f493
Update 03-tool_calling.md
kqyang-77 Nov 27, 2025
2d75129
Update 04-reasoning_outputs.md
kqyang-77 Nov 27, 2025
87e47f8
Update 05-structured_outputs.md
kqyang-77 Nov 27, 2025
4e20f3b
Update 06-automatic_prefix_caching.md
kqyang-77 Nov 27, 2025
19bd6ff
Update 07-disagg_prefill.md
kqyang-77 Nov 27, 2025
d6b42fc
Update 08-spec_decode.md
kqyang-77 Nov 27, 2025
13d7ba9
Update 09-compatibility_matrix.md
kqyang-77 Nov 27, 2025
6f05777
Update 01-trl.md
kqyang-77 Nov 28, 2025
7f1878e
Update 02-rlhf.md
kqyang-77 Nov 28, 2025
1d1c5e2
Update 01-offline_inference.md
kqyang-77 Nov 28, 2025
b4b67d9
Update 02-openai_compatible_server.md
kqyang-77 Nov 28, 2025
2649ed3
Update 03-multimodal_inputs.md
kqyang-77 Nov 28, 2025
319a451
Update 04-distributed_serving_new.md
kqyang-77 Nov 28, 2025
385ddc1
Update tutorial link in metrics documentation
kqyang-77 Nov 28, 2025
e80730d
Update 06-engine_args.md
kqyang-77 Nov 28, 2025
768c531
Update tutorial link for vLLM introduction
kqyang-77 Nov 28, 2025
860e60f
Update usage stats tutorial link
kqyang-77 Nov 28, 2025
0c2517c
Update tutorial link for LangChain integration
kqyang-77 Nov 28, 2025
82b341e
Update tutorial link for LlamaIndex
kqyang-77 Nov 28, 2025
c1a0158
Update 01-docker.md
kqyang-77 Nov 28, 2025
6b1f567
Update tutorial link for vLLM deployment guide
kqyang-77 Nov 28, 2025
3abf4c4
Update 03-nginx.md
kqyang-77 Nov 28, 2025
42c8197
Update 01-bentoml.md
kqyang-77 Nov 28, 2025
4271bb1
Update 02-cerebrium.md
kqyang-77 Nov 28, 2025
077b034
Update 03-dstack.md
kqyang-77 Nov 28, 2025
4798d6f
Update 04-helm.md
kqyang-77 Nov 28, 2025
1d04e2e
Update 05-lws.md
kqyang-77 Nov 28, 2025
2a9247a
Update 06-modal.md
kqyang-77 Nov 28, 2025
cb6d785
Update 07-skypilot.md
kqyang-77 Nov 28, 2025
d44ea7c
Update 08-triton.md
kqyang-77 Nov 28, 2025
87ec48f
Update README.md
kqyang-77 Nov 28, 2025
b20ceee
Update 01-kserve.md
kqyang-77 Nov 28, 2025
5a82d96
Update 02-kubeai.md
kqyang-77 Nov 28, 2025
f679f31
Update 03-llamastack.md
kqyang-77 Nov 28, 2025
e18f33f
Update 04-llmaz.md
kqyang-77 Nov 28, 2025
3b7daaa
Update 05-production-stack.md
kqyang-77 Nov 28, 2025
8b689a6
Update README.md
kqyang-77 Nov 28, 2025
443a8fd
Update 01-optimization.md
kqyang-77 Nov 28, 2025
bc957b2
Update tutorial link in benchmarks documentation
kqyang-77 Nov 28, 2025
c95f17e
Update 01-arch_overview.md
kqyang-77 Nov 28, 2025
95ff1b0
Update 02-huggingface_integration.md
kqyang-77 Nov 28, 2025
a6d6f76
Update 03-plugin_system.md
kqyang-77 Nov 28, 2025
314ad3c
Update 04-paged_attention.md
kqyang-77 Nov 28, 2025
be25eb8
Update 05-mm_processing.md
kqyang-77 Nov 28, 2025
04ca058
Update 06-automatic_prefix_caching.md
kqyang-77 Nov 28, 2025
dbf8dcc
Update 07-multiprocessing.md
kqyang-77 Nov 28, 2025
e579501
Update 01-torch_compile.md
kqyang-77 Nov 28, 2025
8003224
Update 02-prefix_caching.md
kqyang-77 Nov 28, 2025
9ce9e07
Update 03-metrics.md
kqyang-77 Nov 28, 2025
39c7532
Update 01-overview.md
kqyang-77 Nov 28, 2025
a1de1d5
Update 02-profiling_index.md
kqyang-77 Nov 28, 2025
483bc46
Update 03-dockerfile.md
kqyang-77 Nov 28, 2025
e031e1f
Update 01-basic.md
kqyang-77 Nov 28, 2025
6691311
Update 02-registration.md
kqyang-77 Nov 28, 2025
208c5f8
Update 03-tests.md
kqyang-77 Nov 28, 2025
a06adf1
Update tutorial link for vLLM introduction
kqyang-77 Nov 28, 2025
31c78e3
Update README.md
kqyang-77 Nov 28, 2025
5601baf
Update 03-inference_params.md
kqyang-77 Nov 28, 2025
46311aa
Update 01-llm.md
kqyang-77 Nov 28, 2025
1ad3d79
Update 02-llm_inputs.md
kqyang-77 Nov 28, 2025
7e1bf82
Update README.md
kqyang-77 Nov 28, 2025
a2dc357
Update 01-llm_engine.md
kqyang-77 Nov 28, 2025
e431d8b
Update 02-async_llm_engine.md
kqyang-77 Nov 28, 2025
076dc2d
Update README.md
kqyang-77 Nov 28, 2025
5b82cc1
Update 01-inputs.md
kqyang-77 Nov 28, 2025
f42adf6
Update 02-parse.md
kqyang-77 Nov 28, 2025
df2896a
Update 01-inputs.md
kqyang-77 Nov 28, 2025
ddc0d3b
Update 03-processing.md
kqyang-77 Nov 28, 2025
fca9713
Update 04-profiling.md
kqyang-77 Nov 28, 2025
c278e97
Update 05-registry.md
kqyang-77 Nov 28, 2025
ea40033
Update README.md
kqyang-77 Nov 28, 2025
f429cd2
Update 01-interfaces_base.md
kqyang-77 Nov 28, 2025
80fbb61
Update 02-interfaces.md
kqyang-77 Nov 28, 2025
0569782
Update 03-adapters.md
kqyang-77 Nov 28, 2025
eebbc20
Update README.md
kqyang-77 Dec 1, 2025
4def7ee
Update links
kqyang-77 Dec 5, 2025
065de0a
Update README.md
kqyang-77 Dec 5, 2025
f64f5b0
Update README.md
kqyang-77 Dec 5, 2025
e34864b
Update README.md
kqyang-77 Dec 5, 2025
7420417
Update README.md
kqyang-77 Dec 5, 2025
253c930
Update README.md
kqyang-77 Dec 5, 2025
c92ad88
Update README.md
kqyang-77 Dec 5, 2025
bb33a55
Update README.md
kqyang-77 Dec 5, 2025
5e49482
Update 03-nginx.md
kqyang-77 Dec 5, 2025
e445b47
Update README.md
kqyang-77 Dec 5, 2025
ebd338c
Update README.md
kqyang-77 Dec 5, 2025
f5db294
Update README.md
kqyang-77 Dec 5, 2025
bce2d16
Update README.md
kqyang-77 Dec 5, 2025
1b5eb01
Update README.md
kqyang-77 Dec 5, 2025
e2664ae
Update README.md
kqyang-77 Dec 5, 2025
baabf7c
Merge pull request #1 from kqyang-77/master
yudi3ooo Dec 5, 2025
2 changes: 1 addition & 1 deletion docs/01-getting-started/01-installation/01-gpu.md
Original file line number Diff line number Diff line change
@@ -2,7 +2,7 @@
title: GPU
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

vLLM 是一个支持如下 GPU 类型的 Python 库,根据您的 GPU 型号查看相应的说明。

25 changes: 11 additions & 14 deletions docs/01-getting-started/01-installation/02-cpu.md
@@ -2,7 +2,7 @@
title: CPU
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

vLLM 是一个支持以下 CPU 变体的 Python 库。根据您的 CPU 类型查看厂商特定的说明:

@@ -11,7 +11,6 @@ vLLM 是一个支持以下 CPU 变体的 Python 库。根据您的 CPU 类型查
vLLM 初步支持在 x86 CPU 平台进行基础模型推理和服务,支持 FP32、FP16 和 BF16 数据类型。

> **注意**
>
> 此设备没有预编译的 wheel 包或镜像,您必须从源码构建 vLLM。

#### ARM AArch64
@@ -21,7 +20,6 @@ vLLM 已适配支持具备 NEON 指令集的 ARM64 CPU,基于最初为 x86 平
ARM CPU 后端当前支持 Float32、FP16 和 BFloat16 数据类型。

> **注意**
>
> 此设备没有预编译的 wheel 包或镜像,您必须从源码构建 vLLM。

#### Apple silicon
@@ -31,7 +29,6 @@ vLLM 对 macOS 上的 Apple 芯片提供实验性支持。目前用户需从源
macOS 的 CPU 实现当前支持 FP32 和 FP16 数据类型。

> **注意**
>
> 此设备没有预编译的 wheel 包或镜像,您必须从源码构建 vLLM。

#### IBM Z (S390X)
@@ -41,7 +38,6 @@ vLLM 对 IBM Z 平台上的 s390x 架构提供实验性支持。目前用户需
s390x 架构的 CPU 实现当前仅支持 FP32 数据类型。

> **注意**
>
> 此设备没有预编译的 wheel 包或镜像,您必须从源码构建 vLLM。

## 系统要求
@@ -54,7 +50,8 @@ s390x 架构的 CPU 实现当前仅支持 FP32 数据类型。
- 编译器:`gcc/g++ >= 12.3.0`(可选,推荐)
- 指令集架构 (ISA):AVX512(可选,推荐)

> **提示** >[Intel Extension for PyTorch (IPEX)](https://github.com/intel/intel-extension-for-pytorch)  通过最新特性优化扩展 PyTorch,可在 Intel 硬件上获得额外性能提升。
> **提示**
>[Intel Extension for PyTorch (IPEX)](https://github.com/intel/intel-extension-for-pytorch)  通过最新特性优化扩展 PyTorch,可在 Intel 硬件上获得额外性能提升。

#### ARM AArch64

@@ -270,10 +267,10 @@
$ docker run -it \

vLLM CPU 后端支持以下特性:

- 张量并行 (Tensor Parallel)
- 模型量化 (`INT8 W8A8`、`AWQ`、`GPTQ`)
- 分块预填充 (Chunked-prefill
- 前缀缓存 (Prefix-caching)
- 张量并行(Tensor Parallel)
- 模型量化(`INT8 W8A8`、`AWQ`、`GPTQ`)
- 分块预填充(Chunked-prefill)
- 前缀缓存(Prefix-caching)
- FP8-E5M2 KV 缓存

## 相关运行时环境变量
@@ -285,7 +282,7 @@

`VLLM_CPU_OMP_THREADS_BIND=0-31|32-63` 表示启用 2 个张量并行进程,rank0 的 32 个 OpenMP 线程绑定到 0-31 号核心,rank1 的线程绑定到 32-63 号核心。

- `VLLM_CPU_MOE_PREPACK` : 是否为 MoE 层使用预打包功能。该参数会传递给  `ipex.llm.modules.GatedMLPMOE` 。默认值为  `1` (启用)。在不支持的 CPU 上可能需要设置为  `0` (禁用)。
- `VLLM_CPU_MOE_PREPACK` : 是否为 MoE 层使用预打包功能。该参数会传递给 `ipex.llm.modules.GatedMLPMOE`。默认值为 `1`(启用)。在不支持的 CPU 上可能需要设置为 `0`(禁用)。
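
上述环境变量可以组合使用。下面是一个示意性的单实例配置(核心数量、缓存大小与模型名均为演示用的假设值,请按实际硬件调整):

```shell
# 示意性配置:假设机器有 32 个物理核心
export VLLM_CPU_KVCACHE_SPACE=40        # 为 KV 缓存预留 40 GiB(假设值)
export VLLM_CPU_OMP_THREADS_BIND=0-31   # OpenMP 线程绑定到 0-31 号核心
export VLLM_CPU_MOE_PREPACK=0           # 在不支持预打包的 CPU 上禁用 MoE 预打包
vllm serve facebook/opt-125m
```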

## 性能优化建议

@@ -306,7 +303,7 @@
export VLLM_CPU_OMP_THREADS_BIND=0-29
vllm serve facebook/opt-125m
```

- 在支持超线程的机器上使用 vLLM CPU 后端时,建议通过  `VLLM_CPU_OMP_THREADS_BIND`  将每个物理 CPU 核心只绑定一个 OpenMP 线程。在 16 逻辑核心 / 8 物理核心的超线程平台上:
- 在支持超线程的机器上使用 vLLM CPU 后端时,建议通过 `VLLM_CPU_OMP_THREADS_BIND` 将每个物理 CPU 核心只绑定一个 OpenMP 线程。在 16 逻辑核心/8 物理核心的超线程平台上:

```plain
$ lscpu -e # check the mapping between logical CPU cores and physical CPU cores
@@ -337,13 +334,13 @@
$ export VLLM_CPU_OMP_THREADS_BIND=0-7
$ python examples/offline_inference/basic/basic.py
```

- 在多插槽 NUMA 机器上使用 vLLM CPU 后端时,应注意通过  `VLLM_CPU_OMP_THREADS_BIND`  设置 CPU 核心,避免跨 NUMA 节点的内存访问。
- 在多插槽 NUMA 机器上使用 vLLM CPU 后端时,应注意通过 `VLLM_CPU_OMP_THREADS_BIND` 设置 CPU 核心,避免跨 NUMA 节点的内存访问。

## 其他注意事项

- CPU 后端与 GPU 后端有显著差异,因为 vLLM 架构最初是为 GPU 优化的。需要多项优化来提升其性能。
- 建议将 HTTP 服务组件与推理组件解耦。在 GPU 后端配置中,HTTP 服务和分词任务运行在 CPU 上,而推理运行在 GPU 上,这通常不会造成问题。但在基于 CPU 的环境中,HTTP 服务和分词可能导致显著的上下文切换和缓存效率降低。因此强烈建议分离这两个组件以获得更好的性能。
- 在启用 NUMA 的 CPU 环境中,内存访问性能可能受  [拓扑结构](https://github.com/intel/intel-extension-for-pytorch/blob/main/docs/tutorials/performance_tuning/tuning_guide.inc.md#non-uniform-memory-access-numa)  影响较大。对于 NUMA 架构,推荐两种优化方案:张量并行或数据并行。
- 在启用 NUMA 的 CPU 环境中,内存访问性能可能受[拓扑结构](https://github.com/intel/intel-extension-for-pytorch/blob/main/docs/tutorials/performance_tuning/tuning_guide.inc.md#non-uniform-memory-access-numa)影响较大。对于 NUMA 架构,推荐两种优化方案:张量并行或数据并行。

- 延迟敏感场景使用张量并行:遵循 GPU 后端设计,基于 NUMA 节点数量(例如双 NUMA 节点系统 TP=2)使用 Megatron-LM 的并行算法切分模型。随着  [CPU 上的 TP 功能](https://github.com/vllm-project/vllm/pull/6125#)  合并,张量并行已支持服务和离线推理。通常每个 NUMA 节点被视为一个 GPU 卡。以下是启用张量并行度为 2 的服务示例:
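
上述张量并行方案可以写成如下示意性命令(核心编号与模型名为演示用的假设值;注意绑定串中的 `|` 需要加引号,否则会被 shell 解释为管道):

```shell
# 示意性示例:双 NUMA 节点、每节点 32 个物理核心,TP=2
export VLLM_CPU_KVCACHE_SPACE=40
export VLLM_CPU_OMP_THREADS_BIND="0-31|32-63"   # rank0 与 rank1 各绑定一个 NUMA 节点
vllm serve facebook/opt-125m --tensor-parallel-size 2
```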

6 changes: 1 addition & 5 deletions docs/01-getting-started/01-installation/03-ai-accelerator.md
@@ -2,7 +2,7 @@
title: 其他 AI 加速器
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

vLLM 是一个 Python 库,支持以下 AI 加速器。根据您的 AI 加速器类型查看供应商特定说明:

@@ -29,23 +29,20 @@ vLLM 是一个 Python 库,支持以下 AI 加速
您可能需要为 TPU 虚拟机提供额外的持久存储。更多信息请参阅 [Cloud TPU 数据存储选项](https://cloud.google.com/tpu/docs/storage-options)。

> **注意**
>
> 此设备没有预构建的 wheels,因此您必须使用预构建的 Docker 镜像或从源代码构建 vLLM。

#### Intel Gaudi

此节提供了在 Intel Gaudi 设备上运行 vLLM 的说明。

> **注意**
>
> 此设备没有预构建的 wheels 或镜像,因此您必须从源代码构建 vLLM。

#### AWS Neuron

vLLM 0.3.3 及以上版本支持通过 Neuron SDK 在 AWS Trainium/Inferentia 上进行模型推理和服务,并支持连续批处理。分页注意力 (Paged Attention) 和分块预填充 (Chunked Prefill) 功能目前正在开发中,即将推出。Neuron SDK 当前支持的数据类型为 FP16 和 BF16。

> **注意**
>
> 此设备没有预构建的 wheels 或镜像,因此您必须从源代码构建 vLLM。

## 环境要求
@@ -61,7 +58,6 @@ vLLM 0.3.3 及以上版本支持通过 Neuron SDK 在 AWS Trainium/Inferentia
您可以使用 [Cloud TPU API](https://cloud.google.com/tpu/docs/reference/rest) 或 [队列资源](https://cloud.google.com/tpu/docs/queued-resources) API 配置 Cloud TPU。本节展示如何使用队列资源 API 创建 TPU。有关使用 Cloud TPU API 的更多信息,请参阅 [使用 Create Node API 创建 Cloud TPU](https://cloud.google.com/tpu/docs/managing-tpus-tpu-vm#create-node-api)。队列资源允许您以队列方式请求 Cloud TPU 资源。当您请求队列资源时,请求会被添加到 Cloud TPU 服务维护的队列中。当请求的资源可用时,它将分配给您的 Google Cloud 项目供您独占使用。

> **注意**
>
> 在以下所有命令中,请将全大写的参数名称替换为适当的值。有关参数描述,请参阅参数描述表。
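
作为参考,一个通过队列资源 API 创建 TPU 的示意性命令如下(加速器类型与运行时版本为演示用的假设值,全大写参数需替换为实际值):

```shell
# 示意性命令:使用队列资源 API 申请 Cloud TPU
gcloud compute tpus queued-resources create QUEUED_RESOURCE_ID \
  --node-id TPU_NAME \
  --project PROJECT_ID \
  --zone ZONE \
  --accelerator-type v5litepod-4 \
  --runtime-version v2-alpha-tpuv5-lite
```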

#### 使用 GKE 配置 Cloud TPU
26 changes: 13 additions & 13 deletions docs/01-getting-started/01-installation/README.md
@@ -2,25 +2,25 @@
title: 安装
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

vLLM 支持以下硬件平台:

## GPU
## [GPU](/docs/getting-started/installation/gpu)

- NVIDIA CUDA
- AMD ROCm
- Intel XPU
- [NVIDIA CUDA](/docs/getting-started/installation/gpu#nvidia-cuda)
- [AMD ROCm](/docs/getting-started/installation/gpu#amd-rocm)
- [Intel XPU](/docs/getting-started/installation/gpu#inter-xpu-1)

## CPU
## [CPU](/docs/getting-started/installation/cpu)

- Intel/AMD x86
- ARM AArch64
- Apple silicon
- [Intel/AMD x86](/docs/getting-started/installation/cpu#intelamd-x86)
- [ARM AArch64](/docs/getting-started/installation/cpu#arm-aarch64)
- [Apple silicon](/docs/getting-started/installation/cpu#apple-silicon)

## 其他 AI 加速器
## [其他 AI 加速器](/docs/getting-started/installation/ai-accelerator)

- Google TPU
- Intel Gaudi
- AWS Neuron
- [Google TPU](/docs/getting-started/installation/ai-accelerator#google-tpu-1)
- [Intel Gaudi](/docs/getting-started/installation/ai-accelerator#intel-gaudi-1)
- [AWS Neuron](/docs/getting-started/installation/ai-accelerator#aws-neuron-1)
- OpenVINO
2 changes: 1 addition & 1 deletion docs/01-getting-started/02-quickstart.md
@@ -2,7 +2,7 @@
title: 快速开始
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

本指南将帮助您快速开始使用 vLLM 进行以下操作:

@@ -2,7 +2,7 @@
title: Audio Language
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/audio_language.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/audio_language.py)

@@ -2,7 +2,7 @@
title: 基础指南
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/basic](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/basic)

@@ -2,7 +2,7 @@
title: Chat With Tools
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/chat_with_tools.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/chat_with_tools.py)

@@ -2,7 +2,7 @@
title: Cpu Offload Lmcache
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/cpu_offload_lmcache.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/cpu_offload_lmcache.py)

@@ -2,7 +2,7 @@
title: Data Parallel
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/data_parallel.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/data_parallel.py)

@@ -2,7 +2,7 @@
title: Disaggregated Prefill Lmcache
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/disaggregated_prefill_lmcache.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/disaggregated_prefill_lmcache.py)

@@ -2,7 +2,7 @@
title: Disaggregated Prefill
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/disaggregated_prefill.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/disaggregated_prefill.py)

@@ -2,7 +2,7 @@
title: Distributed
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/distributed.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/distributed.py)

@@ -2,7 +2,7 @@
title: Eagle
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/eagle.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/eagle.py)

@@ -2,7 +2,7 @@
title: Encoder Decoder Multimodal
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/encoder_decoder_multimodal.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/encoder_decoder_multimodal.py)

@@ -2,7 +2,7 @@
title: Encoder Decoder
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/encoder_decoder.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/encoder_decoder.py)

@@ -2,7 +2,7 @@
title: Llm Engine Example
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/llm_engine_example.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/llm_engine_example.py)

@@ -2,7 +2,7 @@
title: Load Sharded State
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/load_sharded_state.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/load_sharded_state.py)

@@ -2,7 +2,7 @@
title: Lora With Quantization Inference
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/lora_with_quantization_inference.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/lora_with_quantization_inference.py)

@@ -2,7 +2,7 @@
title: Mistral-small
---

[\*在线运行 vLLM 入门教程:零基础分步指南](https://openbayes.com/console/public/tutorials/rXxb5fZFr29?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)
[\*在线运行 vLLM 入门教程:零基础分步指南](https://app.hyper.ai/console/public/tutorials/rUwYsyhAIt3?utm_source=vLLM-CNdoc&utm_medium=vLLM-CNdoc-V1&utm_campaign=vLLM-CNdoc-V1-25ap)

源码 [examples/offline_inference/mistral-small.py](https://github.com/vllm-project/vllm/blob/main/examples/offline_inference/mistral-small.py)
