
Support more than one model #347

Closed

lingster opened this issue Dec 18, 2024 · 3 comments

Comments

@lingster

Given the range of models that ollama can run, and given that we now have smaller models that are great for specific tasks, how hard would it be for specific agents to run with a specific model? For example: Qwen-Coder for coding tasks, Qwen-VL for image-related tasks, or perhaps Llama 3.3 for overall task management.

@ErikBjare
Owner

It would be pretty easy to do such routing here:

```python
def reply(
    messages: list[Message],
    model: str,
    stream: bool = False,
    tools: list[ToolSpec] | None = None,
) -> Message:
    if stream:
        return _reply_stream(messages, model, tools)
    else:
        print(f"{PROMPT_ASSISTANT}: Thinking...", end="\r")
        response = _chat_complete(messages, model, tools)
        print(" " * shutil.get_terminal_size().columns, end="\r")
        print(f"{PROMPT_ASSISTANT}: {response}")
        return Message("assistant", response)
```

I'm not super interested in it myself, but it should be easy for gptme to modify itself to do!
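
A minimal sketch of what such routing could look like, wrapping the `reply()` above. The keyword heuristic and the ollama model identifiers are illustrative assumptions, not anything gptme actually ships:

```python
# Hypothetical routing layer around reply(). The model names and the
# keyword heuristic are illustrative assumptions, not gptme behavior.
TASK_MODELS = {
    "code": "local/qwen2.5-coder",
    "vision": "local/qwen2-vl",
    "default": "local/llama3.3",
}

def pick_model(messages: list[Message]) -> str:
    """Naive routing: inspect the last user message for task hints."""
    last = messages[-1].content.lower()  # assumes Message exposes .content
    if any(kw in last for kw in ("code", "function", "bug", "refactor")):
        return TASK_MODELS["code"]
    if any(kw in last for kw in ("image", "screenshot", "diagram")):
        return TASK_MODELS["vision"]
    return TASK_MODELS["default"]

def reply_routed(
    messages: list[Message],
    stream: bool = False,
    tools: list[ToolSpec] | None = None,
) -> Message:
    """Drop-in wrapper: choose a model per task, then defer to reply()."""
    return reply(messages, pick_model(messages), stream, tools)
```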

@0xbrayo
Collaborator

0xbrayo commented Jan 24, 2025

@ErikBjare This was implemented in the last release using /model, wasn't it?
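
For context, switching models mid-conversation would presumably look something like this (the provider/model identifier below is illustrative):

```
/model local/qwen2.5-coder
```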

@ErikBjare
Owner

Yes. It could also be done by adding tools to outsource such reasoning: #416

We are not doing any auto-routing, and I don't think we will be, at least not for now.
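
As a rough illustration of the tool-based approach, something like the following could delegate a subtask to a specialist model. It reuses the Message and ToolSpec names from the snippet above, but the ToolSpec fields shown (name, desc, execute) are assumptions for the sketch, not gptme's actual tool API:

```python
# Hypothetical "delegate" tool: outsource a subtask to a specialist model.
# ToolSpec's real fields and execute signature in gptme may differ.
def _delegate(prompt: str, model: str = "local/qwen2.5-coder") -> str:
    """Run a one-off completion on a specialist model, return its answer."""
    answer = reply([Message("user", prompt)], model=model)
    return answer.content  # assumes Message exposes .content

delegate_tool = ToolSpec(
    name="delegate",
    desc="Outsource a subtask (e.g. coding) to a specialist model.",
    execute=_delegate,
)
```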
