kevinschaul/llm-evals

llm-evals

Because we should all have our own set of LLM evals. Blog post

Explore my leaderboard

Installation

Python stuff:

uv sync

Node stuff:

npm install

just (the command runner used below), via Homebrew:

brew install just
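
Before running anything, it can help to confirm that the three tools used above are on your PATH. The following is a minimal sketch, not part of the repository: `check_tools` is a hypothetical helper, and it assumes the executables are named `uv`, `npm`, and `just`, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper (not from the repo): report which tools are on PATH.
# Returns 0 if every tool is found, 1 if any is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}

# Check the tools this README relies on.
check_tools uv npm just || echo "Install the missing tools before running the evals."
```

If a tool is reported missing, install it first, then rerun the installation steps above.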

Running the evals

Run them all:

just eval-all

Run a specific one:

just eval CONFIG

where CONFIG is, for example, "social-media-insults".

To view the dashboard (the version published at https://kschaul.com/llm-evals/):

just dev
