- Set up experiments with a unique system prompt and multiple LLM models
- Evaluate the responses from the LLM models with metrics such as exact match, LLM-as-a-judge, cosine similarity, etc. (more to come!)
- Stream responses from the LLM models to the frontend
- Upload a JSON file with test cases to evaluate the overall performance of the LLM models and see which one performs best
- Compare the response times, time to first token, and tokens per second for each model using my own NPM library, llm-chain!
- Visualize the results with graphs
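Of the metrics above, cosine similarity scores how close a model's response is to a reference answer in embedding space. As a minimal sketch of the math (illustrative only; the function name and the embedding step are assumptions, not this repo's actual code):

```typescript
// Cosine similarity between two embedding vectors: dot(a, b) / (|a| * |b|).
// In practice both texts would first be embedded (e.g. via an embeddings API);
// a score near 1 means the response closely matches the reference answer.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("Vector length mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Parallel vectors score ~1, orthogonal vectors score 0.
console.log(cosineSimilarity([1, 2], [2, 4])); // ~1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```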
- First, clone the repository and install the dependencies:
```bash
git clone https://github.com/faizancodes/llm-eval-platform.git
cd llm-eval-platform
npm install
```

- Set up the environment variables in the `.env` file.
- Groq and Google are both free to use. OpenAI requires a paid account, so if you would like to use this app without it, you can remove `OPENAI_API_KEY` from the `env.ts` file and make other modifications to the codebase as necessary.
- The database URL is provided by Neon. You can sign up for a free account here.
```bash
OPENAI_API_KEY=""
GROQ_API_KEY=""
GOOGLE_API_KEY=""
DATABASE_URL=""
NODE_ENV="development"
NEXT_PUBLIC_APP_URL="http://localhost:3000"
```

- Then, run the development server:
```bash
npm run dev
```

- Open http://localhost:3000 with your browser to see the result.
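The timing comparisons mentioned above (time to first token, tokens per second) follow standard definitions. The sketch below shows those formulas; it is illustrative only and does not reflect llm-chain's actual API:

```typescript
// Standard definitions of the streaming metrics the app compares per model.
// These names and this interface are hypothetical, not llm-chain's real types.
interface StreamTimings {
  requestStart: number; // ms timestamp when the request was sent
  firstTokenAt: number; // ms timestamp when the first token arrived
  streamEnd: number;    // ms timestamp when the stream finished
  tokenCount: number;   // total tokens received
}

function computeMetrics(t: StreamTimings) {
  const timeToFirstTokenMs = t.firstTokenAt - t.requestStart;
  const totalTimeMs = t.streamEnd - t.requestStart;
  // Tokens/sec is measured over the generation phase, i.e. after the first token.
  const generationSeconds = (t.streamEnd - t.firstTokenAt) / 1000;
  const tokensPerSecond =
    generationSeconds > 0 ? t.tokenCount / generationSeconds : 0;
  return { timeToFirstTokenMs, totalTimeMs, tokensPerSecond };
}

// Example: first token after 200 ms, then 100 tokens over the next 2 s.
const m = computeMetrics({
  requestStart: 0,
  firstTokenAt: 200,
  streamEnd: 2200,
  tokenCount: 100,
});
console.log(m); // { timeToFirstTokenMs: 200, totalTimeMs: 2200, tokensPerSecond: 50 }
```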