How can I adjust the token limit for input? #257
Comments
Transcription and response generation are separate pieces of functionality, independent of each other.
Transcription is done on audio files. While a prompt can be provided for transcription as well, this isn't something Transcribe does. As you correctly noted, the input token limit for response generation depends on the model.
@mang0sw33t I am interested in the token limit for response generation. How does Transcribe know the token limit of the particular model I chose? If I go with Gemini 1.5 Pro and its 1M token limit, will response generation take into account that the model I use has a 1M token limit? Does MAX_TRANSCRIPTION_PHRASES_FOR_LLM control it? It is set to 12, so it takes the last 12 dialog parts? That seems very little; I would estimate maybe 1k tokens in total. UPD: I set MAX_TRANSCRIPTION_PHRASES_FOR_LLM to 100 and now the model receives much more context.
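To sanity-check that "maybe 1k tokens" estimate, here is a small sketch using the real `tiktoken` library to count the tokens a batch of transcript phrases would consume. The `phrases` list and the helper name `count_tokens` are made up for illustration; only the 12 vs. 100 limits come from the discussion above.

```python
import tiktoken

def count_tokens(phrases: list[str], model: str = "gpt-4") -> int:
    """Return the total token count of the given phrases for `model`."""
    enc = tiktoken.encoding_for_model(model)
    return sum(len(enc.encode(p)) for p in phrases)

# Hypothetical sample data standing in for transcribed dialog parts.
phrases = ["Speaker: some transcribed dialog part."] * 100

print(count_tokens(phrases[-12:]))   # context sent with the default of 12
print(count_tokens(phrases[-100:]))  # context sent after raising it to 100
```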
For LLM responses, Transcribe does not apply any automatic limit on input tokens based on the model. MAX_TRANSCRIPTION_PHRASES_FOR_LLM is how many previous conversation phrases, as visible in the UI, are sent to the LLM. You are correct: setting it to a higher number will result in more tokens being sent to the LLM. Please consider a PR making this parameter configurable so other users can change it easily.
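A minimal sketch of the behavior described above: only the last MAX_TRANSCRIPTION_PHRASES_FOR_LLM phrases of the conversation are kept and sent to the LLM. The function name `build_llm_messages` and the message formatting are hypothetical, not taken from the Transcribe codebase.

```python
MAX_TRANSCRIPTION_PHRASES_FOR_LLM = 12  # raise this to send more context

def build_llm_messages(conversation: list[str], system_prompt: str) -> list[dict]:
    """Keep only the most recent phrases, then format them as chat messages."""
    recent = conversation[-MAX_TRANSCRIPTION_PHRASES_FOR_LLM:]
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "\n".join(recent)},
    ]
```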
What is the maximum input the model can handle for transcription? GPT-4 has a 128k token limit, but it is not clear how much of this is used for generating the response.
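Since Transcribe itself does not enforce per-model input limits (per the maintainer's comment above), a caller could check the fit manually. A hedged sketch follows; the context-window figures are the publicly documented ones for these models, and the lookup table and function are illustrative, not part of Transcribe.

```python
# Documented context windows (input + output share the same window).
CONTEXT_WINDOWS = {
    "gpt-4-turbo": 128_000,
    "gemini-1.5-pro": 1_000_000,
}

def fits_context(model: str, input_tokens: int, reserve_for_output: int = 1_024) -> bool:
    """True if the input plus reserved output tokens fit the model's window."""
    return input_tokens + reserve_for_output <= CONTEXT_WINDOWS[model]
```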