Description
My app makes queries to Elasticsearch. Even though I tell the LLM to query for only 1 record, most of the time it returns 10,000. The tool output is then passed back to the LLM and exceeds the max_tokens limit, which is 128k.
Steps to Reproduce
Use a tool whose output exceeds the max_tokens capability of the LLM.
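A self-contained sketch of the failure mode (hypothetical `fake_search_tool` and record shape, not my real Elasticsearch tool): the tool is asked for 1 record but returns 10,000, and a rough 4-chars-per-token estimate shows the serialized output alone blows past the 128k window.

```python
import json

def fake_search_tool(size: int = 1) -> str:
    # The LLM requests `size` records, but the backend ignores it and
    # returns 10,000 hits anyway (the behavior described above).
    hits = [{"_id": str(i), "_source": {"field": "value " * 20}}
            for i in range(10_000)]
    return json.dumps(hits)

output = fake_search_tool(size=1)
# Rough heuristic: ~4 characters per token for English/JSON text.
approx_tokens = len(output) // 4
print(f"approx tokens in tool output: {approx_tokens}")
```

Even with this crude estimate, the single tool result is several times larger than the model's 128k context.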
Expected behavior
Not sure what to expect, but I would expect the agent to observe the configured max_tokens. Should it summarize? Should it just fail the tool call?
Screenshots/Code snippets
In the evidence field.
Operating System
macOS Sonoma
Python Version
3.12
crewAI Version
1.3.0
crewAI Tools Version
1.3.0
Virtual Environment
Venv
Evidence
ERROR:root:OpenAI API call failed: Error code: 400 - {'status': 'failure', 'message': 'custom error: max_tokens must be at least 1, got -804559.'}
ERROR:root:OpenAI API call failed: Error code: 400 - {'status': 'failure', 'message': 'custom error: max_tokens must be at least 1, got -804559.'}
An unknown error occurred. Please check the details below.
Error details: Error code: 400 - {'status': 'failure', 'message': 'custom error: max_tokens must be at least 1, got -804559.'}
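One way such a negative value can arise (a back-of-the-envelope guess, not taken from crewAI's code): if the client computes the completion budget as `max_tokens = context_window - prompt_tokens`, a prompt far larger than the window drives the result negative. The numbers below are reverse-engineered from the error message.

```python
# Values taken from this report's error message and config:
context_window = 128_000          # configured max_tokens / model limit
reported_max_tokens = -804_559    # value rejected by the 400 error

# Implied prompt size that would produce that negative budget:
implied_prompt_tokens = context_window - reported_max_tokens
print(implied_prompt_tokens)      # 932559 tokens, roughly 7x the window
```

If that guess is right, the oversized tool output made the prompt ~932k tokens before the request was even sent.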
Possible Solution
Summarize, or fail the tool call and tell the LLM the output is too big to process and needs to be narrowed down. Not sure if that would prompt the LLM to rethink its plan.
I'm not sure how to inspect the raw requests/responses to the LLM, but that would help diagnose the problem.
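As a sketch of the "fail or truncate" idea (a hypothetical `clamp_tool_output` helper, not crewAI's actual behavior; it uses a rough chars-per-token heuristic instead of a real tokenizer), a wrapper could cut the tool result to a budget and append a note telling the LLM to narrow the query:

```python
def clamp_tool_output(output: str,
                      max_tokens: int = 16_000,
                      chars_per_token: int = 4) -> str:
    """Truncate an oversized tool result and tell the LLM it was cut.

    Hypothetical mitigation sketch; the budget and the 4-chars-per-token
    estimate are assumptions, not values from crewAI.
    """
    budget = max_tokens * chars_per_token
    if len(output) <= budget:
        return output
    return (output[:budget]
            + f"\n[truncated: output was {len(output)} chars, over the "
              f"~{max_tokens}-token budget; narrow down the query]")

# Oversized output gets cut and annotated; small output passes through.
print(clamp_tool_output("x" * 1_000_000)[-80:])
print(clamp_tool_output("small result"))
```

Something like this between the tool and the LLM would at least keep the request valid, and the appended note gives the LLM a chance to re-plan with a narrower query.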
Additional context
This is mostly due to the LLM hallucinating on the task it needs to complete. Sometimes it does exactly as told; most of the time it comes up with an incorrect plan or doesn't execute the tool at all.