How can I adjust the token limit for input? #257

Open

al-yakubovich opened this issue Aug 29, 2024 · 3 comments

Comments

@al-yakubovich

What is the maximum input the model can handle for transcription? GPT-4 has a 128k token limit, but it is not clear how much of this is used for generating the response.

@mang0sw33t
Collaborator

mang0sw33t commented Aug 30, 2024

Transcription and response generation are separate pieces of functionality, independent of each other.

What is the maximum input the model can handle for transcription?

Transcription is done on audio files. While a prompt can be provided for transcription as well, this isn't something Transcribe does.

As you correctly noted, the input token limit for response generation depends on the model.

@al-yakubovich
Author

al-yakubovich commented Aug 31, 2024

@mang0sw33t I am interested in the token limit for response generation. How does Transcribe know the token limit of the particular model I chose? If I go with Gemini 1.5 Pro, which has a 1M token limit, will response generation take into account that the model has a 1M token limit?

Does MAX_TRANSCRIPTION_PHRASES_FOR_LLM control it? It is set to 12, so it takes the last 12 dialog parts? That seems very little; I would estimate maybe 1k tokens in total.

UPD: I set MAX_TRANSCRIPTION_PHRASES_FOR_LLM to 100 and now the model takes much more context.

@mang0sw33t
Collaborator

For LLM responses, Transcribe does not apply any automatic limit on input tokens based on the model.

MAX_TRANSCRIPTION_PHRASES_FOR_LLM is the number of previous conversation phrases, as visible in the UI, that are sent to the LLM. You are correct: setting it to a higher number will result in more tokens being sent to the LLM.
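
In other words, the limit is a phrase count, not a token count. A minimal sketch of the idea (illustrative only, not the actual Transcribe code; the conversation_phrases list and build_llm_context function are assumptions):

```python
# Illustrative sketch only, not the actual Transcribe implementation.
# Assumption: conversation phrases are stored in chronological order and the
# most recent MAX_TRANSCRIPTION_PHRASES_FOR_LLM of them are sent to the LLM.

MAX_TRANSCRIPTION_PHRASES_FOR_LLM = 12  # default mentioned in this thread

def build_llm_context(conversation_phrases: list[str],
                      max_phrases: int = MAX_TRANSCRIPTION_PHRASES_FOR_LLM) -> str:
    """Join the most recent phrases into the context sent to the LLM.

    No token counting happens here: raising max_phrases simply sends more
    conversation history, and therefore more tokens, to the LLM.
    """
    recent = conversation_phrases[-max_phrases:]
    return "\n".join(recent)
```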

Please consider a PR that makes this parameter configurable, so other users can change it easily in the parameters.yaml or override.yaml file.
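
For example (hypothetical only; the key name max_transcription_phrases_for_llm and the file layout are assumptions, not existing Transcribe options), an entry such as max_transcription_phrases_for_llm: 100 in override.yaml could be read along these lines:

```python
# Hypothetical sketch of reading the setting from override.yaml.
# The key name "max_transcription_phrases_for_llm" is an assumption,
# not an existing Transcribe configuration option.
import yaml  # PyYAML

def load_max_phrases(path: str = "override.yaml", default: int = 12) -> int:
    """Return the configured phrase limit, falling back to the current default."""
    try:
        with open(path, encoding="utf-8") as f:
            config = yaml.safe_load(f) or {}
    except FileNotFoundError:
        return default
    return int(config.get("max_transcription_phrases_for_llm", default))
```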
