Eval bug: [Autoparser] GLM-4.7 tool parsing is often broken #5

@cinu

Name and Version

$ ./llama-server --version
ggml_cuda_init: found 8 CUDA devices:
Device 0: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 1: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 2: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 3: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 4: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 5: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 6: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
Device 7: NVIDIA GeForce RTX 3090, compute capability 8.6, VMM: yes
version: 7743 (6a023e8)
built with GNU 13.3.0 for Linux x86_64

Operating systems

Linux

GGML backends

CUDA

Hardware

8 x 3090

Models

GLM-4.7-UD-Q3_K_XL

Problem description & steps to reproduce

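Tool-call parsing for GLM-4.7 often fails with a 500 error: {"error":{"code":500,"message":"Failed to parse input at pos 184: ... }}

In the reproduction below, the input quoted in the error message starts at the <tool_call> block that follows </think>, and the model output ends with a stray " >" after the JSON arguments.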

Curl command reproduction:

curl 'http://127.0.0.1:8082/v1/chat/completions' \
  --data-raw '{"stream":true,"return_progress":true,"temperature":0.7,"min_p":0,"cache_prompt":true,"model":"GLM-4.7-UD-Q3_K_XL","top_k":40,"top_p":1,"messages":[{"role":"user","content":"list files in current directory"}],"tools":[{"type":"function","function":{"name":"bash","description":"Execute a bash command and return the output","parameters":{"type":"object","properties":{"explanation":{"type":"string","description":"One sentence explanation as to why this tool is being used, and how it contributes to the goal."},"command":{"type":"string","description":"A bash command to execute"}},"required":["command","explanation"]}}}]}'
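For anyone who prefers a script to the one-line curl command, here is a rough Python equivalent that builds the same request body and prints the streamed lines. It is only a sketch: the function names build_payload and stream_lines are made up for this example, and the explicit Content-Type header is an assumption (the curl command above does not set one).

```python
import json
import urllib.request

# Endpoint copied from the curl command above.
URL = "http://127.0.0.1:8082/v1/chat/completions"


def build_payload() -> dict:
    """Return the same request body as the curl reproduction (hypothetical helper)."""
    bash_tool = {
        "type": "function",
        "function": {
            "name": "bash",
            "description": "Execute a bash command and return the output",
            "parameters": {
                "type": "object",
                "properties": {
                    "explanation": {
                        "type": "string",
                        "description": (
                            "One sentence explanation as to why this tool is "
                            "being used, and how it contributes to the goal."
                        ),
                    },
                    "command": {
                        "type": "string",
                        "description": "A bash command to execute",
                    },
                },
                "required": ["command", "explanation"],
            },
        },
    }
    return {
        "stream": True,
        "return_progress": True,
        "temperature": 0.7,
        "min_p": 0,
        "cache_prompt": True,
        "model": "GLM-4.7-UD-Q3_K_XL",
        "top_k": 40,
        "top_p": 1,
        "messages": [
            {"role": "user", "content": "list files in current directory"}
        ],
        "tools": [bash_tool],
    }


def stream_lines(payload: dict, url: str = URL):
    """POST the payload and yield each non-empty line of the streamed response."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the curl command sets no Content-Type; JSON is used here.
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        for raw in response:
            line = raw.decode("utf-8").strip()
            if line:
                yield line
```

With the server from this report running, looping over stream_lines(build_payload()) and printing each line should show the same kind of "data: ..." chunks as in the output below.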

Output:

// [...]

data: {"choices":[{"finish_reason":null,"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":" requested"}}]}}],"created":1767932874,"id":"chatcmpl-WSkobex5fe5dO1IJtQqrZlpzEcJZD3Xm","model":"GLM-4.7-UD-Q3_K_XL-00001-of-00004.gguf","system_fingerprint":"b7743-6a023e898","object":"chat.completion.chunk"}

data: {"choices":[{"finish_reason":null,"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":" by"}}]}}],"created":1767932874,"id":"chatcmpl-WSkobex5fe5dO1IJtQqrZlpzEcJZD3Xm","model":"GLM-4.7-UD-Q3_K_XL-00001-of-00004.gguf","system_fingerprint":"b7743-6a023e898","object":"chat.completion.chunk"}

data: {"choices":[{"finish_reason":null,"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":" the"}}]}}],"created":1767932874,"id":"chatcmpl-WSkobex5fe5dO1IJtQqrZlpzEcJZD3Xm","model":"GLM-4.7-UD-Q3_K_XL-00001-of-00004.gguf","system_fingerprint":"b7743-6a023e898","object":"chat.completion.chunk"}

data: {"choices":[{"finish_reason":null,"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":" user"}}]}}],"created":1767932874,"id":"chatcmpl-WSkobex5fe5dO1IJtQqrZlpzEcJZD3Xm","model":"GLM-4.7-UD-Q3_K_XL-00001-of-00004.gguf","system_fingerprint":"b7743-6a023e898","object":"chat.completion.chunk"}

data: {"choices":[{"finish_reason":null,"index":0,"delta":{"tool_calls":[{"index":0,"function":{"arguments":"\"}"}}]}}],"created":1767932874,"id":"chatcmpl-WSkobex5fe5dO1IJtQqrZlpzEcJZD3Xm","model":"GLM-4.7-UD-Q3_K_XL-00001-of-00004.gguf","system_fingerprint":"b7743-6a023e898","object":"chat.completion.chunk"}

data: {"error":{"code":500,"message":"Failed to parse input at pos 184: <tool_call>bash</tool_call>{\"command\": \"ls\", \"explanation\": \"List files in the current directory as requested by the user\"} >","type":"server_error"}}
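To see what the client has actually received by the time the error arrives, the streamed argument fragments can be joined back together. The sketch below does that for lines shaped like the chunks above; collect_stream is a hypothetical helper name, and the handling of a "[DONE]" terminator is an assumption, since the stream in this report ends on the error event.

```python
import json


def collect_stream(lines):
    """Join streamed tool-call argument fragments and capture any error event.

    Returns (arguments_by_tool_call_index, error_or_None).
    """
    fragments = {}  # tool-call index -> list of argument fragments
    error = None
    for line in lines:
        if not line.startswith("data: "):
            continue  # skip blank lines and anything that is not an SSE data line
        body = line[len("data: "):]
        if body == "[DONE]":  # assumption: not seen in this report's output
            break
        event = json.loads(body)
        if "error" in event:
            error = event["error"]
            break
        for choice in event.get("choices", []):
            delta = choice.get("delta") or {}
            for call in delta.get("tool_calls") or []:
                piece = (call.get("function") or {}).get("arguments", "")
                fragments.setdefault(call["index"], []).append(piece)
    joined = {index: "".join(parts) for index, parts in fragments.items()}
    return joined, error
```

Fed the five chunks and the error event shown above, this returns the tail of the arguments string for tool call 0 together with the error object. So the client has already received the tool-call arguments as deltas before the server reports the parse failure.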

llama.cpp server command:

CUDA_VISIBLE_DEVICES="0,1,2,3,4,5,6,7" ./llama-server -m /modele/GLM-4.7-UD-Q3_K_XL-00001-of-00004.gguf -ngl 9999 --threads 4 -c 110000 -fa on --host localhost --port 8082 --jinja --verbose -ub 128 -b 128 -ctk q8_0 -ctv q8_0

Tail of the server log:

// [...]

que    start_loop: processing new tasks
que    start_loop: processing task, id = 1438
que    start_loop: update slots
srv  update_slots: all slots are idle
que    start_loop: waiting for new tasks
srv  update_chat_: Parsing chat message: The user wants to list files in the current directory. This is a straightforward request that can be done with the `ls` command. I'll use the bash tool to execute this command.</think><tool_call>bash</tool_call>{"command": "ls", "explanation": "List files in the current directory as requested by the user"} >
Parsing PEG input with format peg-native: The user wants to list files in the current directory. This is a straightforward request that can be done with the `ls` command. I'll use the bash tool to execute this command.</think><tool_call>bash</tool_call>{"command": "ls", "explanation": "List files in the current directory as requested by the user"} >
srv  update_chat_: Parsing chat message: The user wants to list files in the current directory. This is a straightforward request that can be done with the `ls` command. I'll use the bash tool to execute this command.</think><tool_call>bash</tool_call>{"command": "ls", "explanation": "List files in the current directory as requested by the user"} >
Parsing PEG input with format peg-native: The user wants to list files in the current directory. This is a straightforward request that can be done with the `ls` command. I'll use the bash tool to execute this command.</think><tool_call>bash</tool_call>{"command": "ls", "explanation": "List files in the current directory as requested by the user"} >
srv    operator(): http: streamed chunk: data: {"error":{"code":500,"message":"Failed to parse input at pos 184: <tool_call>bash</tool_call>{\"command\": \"ls\", \"explanation\": \"List files in the current directory as requested by the user\"} >","type":"server_error"}}
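The position in the error message can be checked against the logged text. The sketch below assumes that "pos 184" is a zero-based offset into the string printed on the "Parsing chat message" line (the text is plain ASCII, so byte and character offsets coincide). The names LOGGED_MESSAGE and remaining_input are made up for this example.

```python
# Message text copied from the "Parsing chat message" log line above.
LOGGED_MESSAGE = (
    "The user wants to list files in the current directory. "
    "This is a straightforward request that can be done with the `ls` command. "
    "I'll use the bash tool to execute this command."
    "</think><tool_call>bash</tool_call>"
    '{"command": "ls", "explanation": '
    '"List files in the current directory as requested by the user"} >'
)


def remaining_input(message: str, pos: int) -> str:
    """Return the part of the message from the given offset onward."""
    return message[pos:]


# Offset of the tool-call block within the logged message.
tool_call_offset = LOGGED_MESSAGE.index("<tool_call>")

# Text that the parser had not consumed, under the offset assumption above.
unparsed = remaining_input(LOGGED_MESSAGE, 184)
```

Under that assumption, tool_call_offset comes out as 184 and unparsed matches the text quoted in the error message. That suggests the reasoning text and </think> were consumed and the failure happens at the tool-call block itself. It shows where parsing stops, not why.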

First Bad Commit

No response
