Document how to reuse LLM (without having it restart on every run) #29

@ahuang11

Description
Goal: the llama.cpp startup log (`llama_model_loader: loaded meta data with 20 key-value pairs and 291 tensors from .models/mistral-7b-instruct-v0.1.Q4_K_M.gguf...`) should appear only once, instead of the model being reloaded on every call:

```python
from funcchain import chain, settings
from funcchain.model.defaults import ChatLlamaCpp, get_gguf_model
from pydantic import BaseModel

model = "Mistral-7B-Instruct-v0.1-GGUF"
model_file = "mistral-7b-instruct-v0.1.Q4_K_M.gguf"

# Construct the model once and register it globally,
# so later chain() calls reuse the same instance.
settings.llm = ChatLlamaCpp(
    model_path=get_gguf_model(model, "Q4_K_M", settings).as_posix(),
)


class Translated(BaseModel):
    chinese: str
    english: str
    french: str


def hello(text: str) -> Translated:
    """
    Translate text into three languages.
    """
    return chain()


hello("Hello!")
hello("one")
hello("two")
```
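The snippet above avoids reloading by constructing `ChatLlamaCpp` a single time at module level. The same one-time-load guarantee can be sketched generically with `functools.lru_cache`; the `get_llm` function and its dict return value below are hypothetical stand-ins for the real llama.cpp load, not funcchain API:

```python
from functools import lru_cache

load_count = 0  # tracks how many times the expensive load actually ran


@lru_cache(maxsize=1)
def get_llm(model_path: str) -> dict:
    """Stand-in for constructing the model; the expensive llama.cpp load
    (which prints the llama_model_loader metadata) would happen here."""
    global load_count
    load_count += 1
    return {"model_path": model_path}  # placeholder for the model object


llm = get_llm("mistral-7b-instruct-v0.1.Q4_K_M.gguf")
again = get_llm("mistral-7b-instruct-v0.1.Q4_K_M.gguf")
assert llm is again      # same cached instance is returned
assert load_count == 1   # the load ran only once
```

Any caching scheme keyed on the model path (a module-level singleton, `lru_cache`, or a settings object as in funcchain) gives the same effect: the startup log prints once, and every subsequent call reuses the loaded weights.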
