Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
31 changes: 31 additions & 0 deletions docs/features/reasoning_output.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,9 @@ Reasoning models return an additional `reasoning_content` field in their output,
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | ernie-45-vl | ✅ | ❌ |"chat_template_kwargs":{"enable_thinking": true/false}|
| baidu/ERNIE-4.5-21B-A3B-Thinking | ernie-x1 | ✅ Not supported for turning off | ✅|❌|
| baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie-45-vl-thinking | ✅ Not recommended to turn off | ✅|"chat_template_kwargs": {"options": {"thinking_mode": "open/close"}}|
| unsloth/DeepSeek-V3.1-BF16 | deepseek | ❌ (thinking mode off by default) | ✅|❌|
| unsloth/DeepSeek-V3-0324-BF16 | deepseek | ✅ (thinking mode on by default) | ✅|❌|
| unsloth/DeepSeek-R1-BF16 | deepseek | ✅ (thinking mode on by default) | ✅|❌|

The reasoning model requires a specified parser to extract reasoning content. Referring to the `thinking switch parameters` of each model can turn off the model's thinking mode.

Expand Down Expand Up @@ -159,3 +162,31 @@ Model output example
}
```
More reference documentation related to tool calling usage: [Tool Calling](./tool_calling.md)

## Error Handling and Invalid Format

The DeepSeek reasoning parser handles various invalid or incomplete format scenarios gracefully:

### Missing Start Tag
If the model output contains only the end tag without the start tag:
- **Input**: `abc</think>xyz`
- **Output**: `reasoning_content="abc"`, `content="xyz"`
- The parser extracts content before the end tag as reasoning, and content after as reply.

### Missing End Tag
If the model output contains only the start tag without the end tag:
- **Input**: `<think>abc`
- **Output**: `reasoning_content="abc"`, `content=None`
- The parser treats all content as reasoning when the end tag is missing.

### No Reasoning Tags (Thinking Mode Off)
When thinking mode is disabled (e.g., DeepSeek-V3.1 by default):
- **Input**: `direct response`
- **Output**: `reasoning_content=None`, `content="direct response"`
- The parser treats the entire output as reply content.

### Protocol Violation with Tool Calls
If there is non-whitespace content between the reasoning end tag and tool calls:
- **Input**: `<think>thinking</think>\n\nABC\n<|tool▁calls▁begin|>...`
- **Output**: Tool calls are not parsed, entire output is returned as `content`
- This ensures tool calls are only parsed when they immediately follow reasoning content.
28 changes: 28 additions & 0 deletions docs/features/tool_calling.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,13 @@ This document describes how to configure the server in FastDeploy to use the too
| baidu/ERNIE-4.5-21B-A3B-Thinking | ernie-x1 |
| baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie-45-vl-thinking |

## Tool Call parser for DeepSeek series models
| Model Name | Parser Name |
|---------------|-------------|
| unsloth/DeepSeek-V3.1-BF16 | deepseek |
| unsloth/DeepSeek-V3-0324-BF16 | deepseek |
| unsloth/DeepSeek-R1-BF16 | deepseek |

## Quickstart

### Starting FastDeploy with Tool Calling Enabled.
Expand Down Expand Up @@ -90,6 +97,27 @@ The example output is as follows. It shows that the model's output of the though
}
```

## Error Handling and Invalid Format

The DeepSeek tool parser handles various invalid or incomplete format scenarios:

### Protocol Violation
If there is non-whitespace content between the reasoning end tag (`</think>`) and tool calls:
- **Input**: `<think>thinking</think>\n\nABC\n<|tool▁calls▁begin|>...`
- **Output**: `tools_called=False`, `tool_calls=None`, `content=<entire_output>`
- Tool calls are not parsed when protocol is violated. The entire output is returned as content.

### Malformed JSON Arguments
If the tool call arguments contain invalid JSON:
- **Input**: `<|tool▁call▁begin|>get_weather<|tool▁sep|>{"location": "北京", "unit":}<|tool▁call▁end|>`
- **Output**: The parser attempts to use `partial_json_parser` to recover valid JSON. If recovery fails, it returns an empty object `{}` or the raw text.
- This ensures graceful handling of incomplete JSON during streaming.

### Missing Tool Call End Tag
If a tool call is incomplete (missing end tag):
- **Input**: `<|tool▁call▁begin|>get_weather<|tool▁sep|>{"location": "北京"`
- **Output**: In streaming mode, the parser waits for more data. In non-streaming mode, it attempts to extract what's available.

## Parallel Tool Calls
If the model can generate parallel tool calls, FastDeploy will return a list:
```bash
Expand Down
31 changes: 31 additions & 0 deletions docs/zh/features/reasoning_output.md
Original file line number Diff line number Diff line change
Expand Up @@ -11,6 +11,9 @@
| baidu/ERNIE-4.5-VL-28B-A3B-Paddle | ernie-45-vl | ✅ | ❌ |"chat_template_kwargs":{"enable_thinking": true/false}|
| baidu/ERNIE-4.5-21B-A3B-Thinking | ernie-x1 | ✅不支持关思考 | ✅|❌|
| baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie-45-vl-thinking | ✅不推荐关闭 | ✅|"chat_template_kwargs": {"options": {"thinking_mode": "open/close"}}|
| unsloth/DeepSeek-V3.1-BF16 | deepseek | ❌ (默认关闭思考模式) | ✅|❌|
| unsloth/DeepSeek-V3-0324-BF16 | deepseek | ✅ (默认开启思考模式) | ✅|❌|
| unsloth/DeepSeek-R1-BF16 | deepseek | ✅ (默认开启思考模式) | ✅|❌|

思考模型需要指定解析器,以便于对思考内容进行解析. 参考各个模型的 `思考开关控制参数` 可以关闭模型思考模式.

Expand Down Expand Up @@ -158,3 +161,31 @@ curl -X POST "http://0.0.0.0:8390/v1/chat/completions" \
}
```
更多工具调用相关的使用参考文档 [Tool Calling](./tool_calling.md)

## 错误处理和格式不合法情况

DeepSeek 推理解析器能够优雅地处理各种格式不合法或不完整的情况:

### 缺少起始标签
如果模型输出只包含结束标签而没有起始标签:
- **输入**: `abc</think>xyz`
- **输出**: `reasoning_content="abc"`, `content="xyz"`
- 解析器会将结束标签之前的内容提取为思考内容,之后的内容提取为回复内容。

### 缺少结束标签
如果模型输出只包含起始标签而没有结束标签:
- **输入**: `<think>abc`
- **输出**: `reasoning_content="abc"`, `content=None`
- 解析器会将所有内容视为思考内容。

### 无思考标签(思考模式关闭)
当思考模式被关闭时(例如 DeepSeek-V3.1 默认关闭):
- **输入**: `direct response`
- **输出**: `reasoning_content=None`, `content="direct response"`
- 解析器会将整个输出视为回复内容。

### 协议不规范(工具调用前有非空白字符)
如果思考结束标签和工具调用之间存在非空白字符:
- **输入**: `<think>thinking</think>\n\nABC\n<|tool▁calls▁begin|>...`
- **输出**: 工具调用不会被解析,整个输出作为 `content` 返回
- 这确保了只有在工具调用紧跟在思考内容之后时才会被解析。
28 changes: 28 additions & 0 deletions docs/zh/features/tool_calling.md
Original file line number Diff line number Diff line change
Expand Up @@ -8,6 +8,13 @@
| baidu/ERNIE-4.5-21B-A3B-Thinking | ernie-x1 |
| baidu/ERNIE-4.5-VL-28B-A3B-Thinking | ernie-45-vl-thinking |

## DeepSeek系列模型配套工具解释器
| 模型名称 | 解析器名称 |
|---------------|-------------|
| unsloth/DeepSeek-V3.1-BF16 | deepseek |
| unsloth/DeepSeek-V3-0324-BF16 | deepseek |
| unsloth/DeepSeek-R1-BF16 | deepseek |

## 快速开始

### 启动包含解析器的FastDeploy
Expand Down Expand Up @@ -92,6 +99,27 @@ curl -X POST http://0.0.0.0:8000/v1/chat/completions \
]
}
```
## 错误处理和格式不合法情况

DeepSeek 工具解析器能够处理各种格式不合法或不完整的情况:

### 协议不规范
如果思考结束标签(`</think>`)和工具调用之间存在非空白字符:
- **输入**: `<think>thinking</think>\n\nABC\n<|tool▁calls▁begin|>...`
- **输出**: `tools_called=False`, `tool_calls=None`, `content=<完整输出>`
- 当协议不规范时,工具调用不会被解析,整个输出作为 content 返回。

### JSON 参数格式错误
如果工具调用的参数包含无效的 JSON:
- **输入**: `<|tool▁call▁begin|>get_weather<|tool▁sep|>{"location": "北京", "unit":}<|tool▁call▁end|>`
- **输出**: 解析器会尝试使用 `partial_json_parser` 来恢复有效的 JSON。如果恢复失败,会返回空对象 `{}` 或原始文本。
- 这确保了在流式输出过程中能够优雅地处理不完整的 JSON。

### 缺少工具调用结束标签
如果工具调用不完整(缺少结束标签):
- **输入**: `<|tool▁call▁begin|>get_weather<|tool▁sep|>{"location": "北京"`
- **输出**: 在流式模式下,解析器会等待更多数据。在非流式模式下,会尝试提取可用的内容。

## 并行工具调用

如果模型能够生成多个并行的工具调用,FastDeploy 会返回一个列表:
Expand Down
Loading
Loading