🧑‍🔬 Tabby Registry

How can I convert my own model for use with Tabby?

https://tabby.tabbyml.com/docs/faq

Since version 0.5.0, Tabby's inference runs entirely on llama.cpp, so any model available in the GGUF format can be used with Tabby. To make this more accessible, we have curated a set of models that we benchmarked, available at registry-tabby.

Users are free to fork the repository to create their own registry. If a user's registry is located at https://github.com/USERNAME/registry-tabby, the model ID will be USERNAME/model.
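The naming rule above can be sketched in a few lines of Python. This is only an illustration of the convention described here, not code from Tabby itself; the function name `model_id_from_registry` is hypothetical, and the sketch assumes the fork keeps the repository name `registry-tabby`.

```python
from urllib.parse import urlparse


def model_id_from_registry(registry_url: str, model_name: str) -> str:
    """Build a model ID of the form USERNAME/model from a registry URL.

    Hypothetical helper: assumes the registry lives at
    https://github.com/USERNAME/registry-tabby, as described above.
    """
    parsed = urlparse(registry_url)
    # Split the path into its non-empty segments: [USERNAME, "registry-tabby"]
    parts = [p for p in parsed.path.split("/") if p]
    if parsed.netloc != "github.com" or len(parts) != 2 or parts[1] != "registry-tabby":
        raise ValueError(f"not a registry-tabby URL: {registry_url}")
    return f"{parts[0]}/{model_name}"


print(model_id_from_registry("https://github.com/USERNAME/registry-tabby", "model"))
# prints: USERNAME/model
```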

For details on the registry format, please refer to models.json.

Completion models (--model)

We recommend the following GPUs, depending on model size:

  • For 1B to 3B models, it's advisable to have at least NVIDIA T4, 10 Series, or 20 Series GPUs.
  • For 7B to 13B models, we recommend using NVIDIA V100, A100, 30 Series, or 40 Series GPUs.
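As a rough illustration, the two recommendations above can be written as a small lookup. The names `GPU_TIERS` and `recommended_gpus` are hypothetical, and the sketch deliberately returns `None` for sizes outside the two stated ranges (for example between 3B and 7B), since no recommendation is given for them here.

```python
from typing import List, Optional

# (min size, max size, GPUs), sizes in billions of parameters,
# copied from the two recommendations above.
GPU_TIERS = [
    (1.0, 3.0, ["NVIDIA T4", "10 Series", "20 Series"]),
    (7.0, 13.0, ["NVIDIA V100", "A100", "30 Series", "40 Series"]),
]


def recommended_gpus(params_billions: float) -> Optional[List[str]]:
    """Return the GPUs listed for a model size, or None if no range covers it."""
    for low, high, gpus in GPU_TIERS:
        if low <= params_billions <= high:
            return gpus
    return None


print(recommended_gpus(3.0))   # size in the 1B to 3B range
print(recommended_gpus(13.0))  # size in the 7B to 13B range
print(recommended_gpus(5.0))   # not covered by either range
```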

We have published benchmarks for these models at https://leaderboard.tabbyml.com to help Tabby users weigh the trade-offs between quality, licensing, and model size.

| Model ID | License |
| --- | --- |
| TabbyML/StarCoder-1B | BigCode-OpenRAIL-M |
| TabbyML/StarCoder-3B | BigCode-OpenRAIL-M |
| TabbyML/StarCoder-7B | BigCode-OpenRAIL-M |
| TabbyML/CodeLlama-7B | Llama 2 |
| TabbyML/CodeLlama-13B | Llama 2 |
| TabbyML/DeepseekCoder-1.3B | Deepseek License |
| TabbyML/DeepseekCoder-6.7B | Deepseek License |

Chat models (--chat-model)

To ensure optimal response quality, and given that latency requirements are not stringent in this scenario, we recommend using a model with at least 3B parameters.

| Model ID | License |
| --- | --- |
| TabbyML/WizardCoder-3B | BigCode-OpenRAIL-M |
| TabbyML/Mistral-7B | Apache 2.0 |
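To show how the two flags fit together, here is a minimal sketch that assembles a command line from a completion model and an optional chat model, and rejects chat models below the recommended 3B parameters. The `--model` and `--chat-model` flags come from the headings above; the `tabby serve` subcommand, the helper names, and the idea of reading the size from the `-<N>B` suffix of the model ID are assumptions made for illustration. The sketch only builds the argument list and does not run anything.

```python
import re
from typing import List, Optional

# Recommended minimum size for chat models, in billions of parameters.
MIN_CHAT_PARAMS_B = 3.0


def params_billions(model_id: str) -> Optional[float]:
    """Read the size from an ID ending in -<N>B, e.g. TabbyML/Mistral-7B."""
    match = re.search(r"-(\d+(?:\.\d+)?)B$", model_id)
    return float(match.group(1)) if match else None


def serve_args(model: str, chat_model: Optional[str] = None) -> List[str]:
    """Build an argument list; `tabby serve` is an assumed subcommand."""
    args = ["tabby", "serve", "--model", model]
    if chat_model is not None:
        size = params_billions(chat_model)
        # Only enforce the recommendation when the size can be parsed.
        if size is not None and size < MIN_CHAT_PARAMS_B:
            raise ValueError(f"{chat_model} is below {MIN_CHAT_PARAMS_B}B parameters")
        args += ["--chat-model", chat_model]
    return args


print(" ".join(serve_args("TabbyML/StarCoder-1B", "TabbyML/Mistral-7B")))
# prints: tabby serve --model TabbyML/StarCoder-1B --chat-model TabbyML/Mistral-7B
```

Treating the 3B threshold as a hard error is a design choice of this sketch; the recommendation itself is advisory, so a warning would be an equally reasonable reading.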