Commit
docs: update README to enhance clarity on features, installation, and usage instructions
Showing 2 changed files with 68 additions and 61 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,89 +1,96 @@ | ||
# Burro 🫏🌯 | ||
# Burro 🫏 | ||
|
||
Burro is an AI-powered burrito LLM evaluation CLI tool. It helps you evaluate | ||
the factuality of responses generated by language models. | ||
|
||
## Features | ||
Burro is a command-line interface (CLI) tool built with Deno for evaluating Large Language Model (LLM) outputs. It provides a straightforward way to run different types of evaluations with secure API key management. | ||
|
||
- Set and encrypt your OpenAI API key | ||
- Run evaluations based on JSON input files | ||
- View detailed evaluation results | ||
## 🚀 Features | ||
|
||
## Installation | ||
- Three specialized evaluation types: | ||
- Answer correctness evaluation with context | ||
- Close-ended QA matching | ||
- Simple output-expected comparison | ||
- Secure OpenAI API key management | ||
- JSON-based evaluation configurations | ||
- SQLite storage for results and settings | ||
|
||
1. Clone the repository: | ||
```sh | ||
git clone <repository-url> | ||
cd <repository-directory> | ||
``` | ||
|
||
2. Install dependencies: | ||
```sh | ||
deno task check | ||
``` | ||
## 📋 Prerequisites | ||
|
||
## Build Process | ||
- [Deno](https://deno.land/) installed on your system | ||
- OpenAI API key | ||
|
||
To build the project, follow these steps: | ||
## 🛠️ Installation | ||
|
||
1. Ensure you have Deno installed. If not, you can install it from | ||
[here](https://deno.land/#installation). | ||
1. Clone the repository: | ||
```bash | ||
git clone <your-repository-url> | ||
cd burro | ||
``` | ||
|
||
2. Run the following command to build the project: | ||
```sh | ||
deno task build | ||
``` | ||
2. Ensure Deno is installed: | ||
```bash | ||
deno --version | ||
``` | ||
|
||
3. The build output will be available in the `dist` directory. | ||
## 🔧 Usage | ||
|
||
## Usage | ||
### Setting up API Keys | ||
|
||
### Set OpenAI API Key | ||
```bash | ||
deno run --allow-read --allow-write --allow-env main.ts set-openai-key | ||
``` | ||
|
||
Before running evaluations, you need to set your OpenAI API key: | ||
### Running Evaluations | ||
|
||
```sh | ||
deno task run set-openai-key | ||
```bash | ||
deno run --allow-read --allow-write --allow-env main.ts run-eval <evaluation-file> | ||
``` | ||
|
||
### Run Evaluation | ||
## 📊 Evaluation Types | ||
|
||
To run an evaluation based on a JSON input file: | ||
### 1. Answer Correctness (answerCorrectness.json) | ||
Evaluates answers against provided context with specific criteria. | ||
|
||
```sh | ||
deno task run run-eval <path-to-json-file> | ||
Example format: | ||
```json | ||
{ | ||
"input": { | ||
"context": "Tesla's Model 3 was first unveiled on March 31, 2016...", | ||
"question": "When did Tesla start delivering the Model 3?" | ||
}, | ||
"output": "July 2017", | ||
"criteria": "Answer must be exactly 'July 2017' based on the provided context" | ||
} | ||
``` | ||
|
||
### Example JSON Input | ||
### 2. Close QA (closeqa.json) | ||
Evaluates exact matching responses for close-ended questions. | ||
|
||
The JSON input file should have the following structure: | ||
Example format: | ||
```json | ||
{ | ||
"input": "List the first three prime numbers in ascending order, separated by commas.", | ||
"output": "2,3,5", | ||
"criteria": "Numbers must be in correct order, separated by commas with no spaces" | ||
} | ||
``` | ||
|
||
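The Answer Correctness and Close QA examples share `output` and `criteria` and differ only in the shape of `input`: an object with `context` and `question` in the first case, a plain string in the second. The following is a minimal sketch of how a loader could tell the two apart after `JSON.parse`. The type and function names are hypothetical and are not taken from Burro's source.

```typescript
// Hypothetical types mirroring the README examples; not Burro's actual code.
interface AnswerCorrectnessCase {
  input: { context: string; question: string };
  output: string;
  criteria: string;
}

interface CloseQaCase {
  input: string;
  output: string;
  criteria: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

// Both formats require string `output` and `criteria` fields.
function hasOutputAndCriteria(value: Record<string, unknown>): boolean {
  return typeof value["output"] === "string" && typeof value["criteria"] === "string";
}

function isCloseQaCase(value: unknown): value is CloseQaCase {
  return isRecord(value) && hasOutputAndCriteria(value) && typeof value["input"] === "string";
}

function isAnswerCorrectnessCase(value: unknown): value is AnswerCorrectnessCase {
  if (!isRecord(value) || !hasOutputAndCriteria(value)) return false;
  const input = value["input"];
  return (
    isRecord(input) &&
    typeof input["context"] === "string" &&
    typeof input["question"] === "string"
  );
}

// Returns the matching format name, or null when the value fits neither shape.
function classifyCase(value: unknown): "answerCorrectness" | "closeQa" | null {
  if (isAnswerCorrectnessCase(value)) return "answerCorrectness";
  if (isCloseQaCase(value)) return "closeQa";
  return null;
}
```

Checking the shape up front lets a malformed file fail with a clear message before any evaluation runs.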
### 3. Simple Evals (evals.json) | ||
Compares model outputs against expected answers. | ||
|
||
Example format: | ||
```json | ||
[ | ||
{ | ||
"input": "Which country has the highest population?", | ||
"output": "People's Republic of China", | ||
"expected": "China" | ||
}, | ||
{ | ||
"input": "What is the capital of France?", | ||
"output": "The capital city of France is Paris", | ||
"expected": "Paris" | ||
}, | ||
{ | ||
"input": "Who wrote Romeo and Juliet?", | ||
"output": "The famous playwright William Shakespeare wrote Romeo and Juliet", | ||
"expected": "William Shakespeare" | ||
} | ||
] | ||
{ | ||
"input": "What is the capital of France?", | ||
"output": "The capital city of France is Paris", | ||
"expected": "Paris" | ||
} | ||
``` | ||
|
||
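In the Simple Evals examples, each `output` is a longer sentence that contains the shorter `expected` answer, so an exact string comparison would fail all three cases. The README does not say how Burro scores them. The sketch below is only a naive baseline, a case-insensitive containment check, useful as a sanity test of an evaluation file. `SimpleEvalCase`, `normalize`, and `containsExpected` are hypothetical names and this is not Burro's evaluation method.

```typescript
// Hypothetical shape of one entry in evals.json, based on the README example.
interface SimpleEvalCase {
  input: string;
  output: string;
  expected: string;
}

// Lowercase and collapse whitespace so trivial formatting differences are ignored.
function normalize(text: string): string {
  return text.toLowerCase().replace(/\s+/g, " ").trim();
}

// Naive baseline: passes when the normalized output contains the expected answer.
function containsExpected(evalCase: SimpleEvalCase): boolean {
  return normalize(evalCase.output).indexOf(normalize(evalCase.expected)) !== -1;
}

const cases: SimpleEvalCase[] = [
  {
    input: "Which country has the highest population?",
    output: "People's Republic of China",
    expected: "China",
  },
  {
    input: "What is the capital of France?",
    output: "The capital city of France is Paris",
    expected: "Paris",
  },
  {
    input: "Who wrote Romeo and Juliet?",
    output: "The famous playwright William Shakespeare wrote Romeo and Juliet",
    expected: "William Shakespeare",
  },
];

const passed = cases.filter(containsExpected).length;
console.log(`${passed}/${cases.length} cases contain the expected answer`);
```

A containment check cannot judge paraphrases or factual correctness, which is presumably why the tool relies on a language model for the actual scoring.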
## Development | ||
|
||
### Run Tests | ||
## 🔒 Security Features | ||
|
||
- AES encryption for API key storage | ||
- Secure key generation | ||
- Encrypted SQLite storage | ||
|
||
To run the tests: | ||
|
||
```sh | ||
deno test | ||
``` |
Binary file not shown.