We train language models specialized in evaluating other language models and optimize evaluation pipelines!
Below are our key projects, with links to their repositories and related publications:
| Repository | Description | Paper |
|---|---|---|
| prometheus-eval | A repository for evaluating LLMs on generation tasks. Supports Prometheus 2, GPT-4, and other evaluators. | Link |
| prometheus | An open-source Evaluator LM that offers reproducible evaluation and is inexpensive to use. | Link |
| prometheus-vision | An open-source Evaluator VLM that offers reproducible evaluation and is inexpensive to use. | Link |