Provides a `LlamaAI` class with a Python interface for generating text using Llama models.
- Load Llama models and tokenizers automatically from a GGUF file
- Generate text completions for prompts
- Automatically adjust the model's context size to fit longer prompts, up to a configurable limit
- Convenience methods for tokenizing and untokenizing text
- Fix text formatting issues before generating
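The automatic context-size adjustment described above could work along these lines. This is only an illustrative sketch; the function name `required_context`, the `limit` default, and the sizing policy are assumptions for explanation, not gguf_llama's actual internals:

```python
def required_context(prompt_token_count, max_tokens, limit=2048):
    """Illustrative sketch (not gguf_llama internals): choose a context
    size large enough for the prompt plus the requested completion,
    capped at a hard limit."""
    needed = prompt_token_count + max_tokens
    # Grow to fit the prompt and completion, but never exceed the cap.
    return min(needed, limit)
```

For example, a 100-token prompt with `max_tokens=500` would need a context of 600 tokens, while a 3000-token prompt would be capped at the 2048-token limit.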
Create a `LlamaAI` instance by pointing it at a GGUF model file:

```python
from llama_ai import LlamaAI

ai = LlamaAI("my_model.gguf", max_tokens=500, max_input_tokens=100)
```
Generate text by calling `infer()`:

```python
text = ai.infer("Once upon a time")
print(text)
```
Install from PyPI:

```shell
pip install gguf_llama
```
See the API documentation for full details on classes and methods.
Contributions are welcome! Open an issue or PR to improve gguf_llama.