🚋 Trolley LLM Arena

An interactive platform that evaluates Large Language Models (LLMs) by presenting them with variations of the classic "Trolley Problem" moral dilemma. Compare AI reasoning against human consensus in a comic-style interface.

🎮 Data Source: All trolley problem scenarios and human voting data are from the brilliant Absurd Trolley Problems by Neal Agarwal. This project is a fan-made tool and is not affiliated with neal.fun.

✨ Features

27 Moral Dilemmas - Classic and creative trolley problem variations
Comic-Style UI - Engaging visual presentation with animations
Real-time Comparison - See how different LLMs reason about the same problem
Alignment Scoring - Measure how closely AI matches human consensus
TTS Reasoning - Listen to LLM explanations via ElevenLabs voices
Admin Dashboard - Manage evaluations, providers, and problems
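
The README does not spell out how the alignment score is calculated. As a minimal sketch, assuming "human consensus" means the majority vote on each problem and that the score is the fraction of problems where the model picked that majority, it could look like the following. The `Choice` type and the `alignmentScore` and `humanMajority` helpers are hypothetical names; only `humanPullVotes` and `humanNothingVotes` come from `data/problems.json`.

```typescript
// Hypothetical shapes for illustration; only humanPullVotes and
// humanNothingVotes are field names taken from data/problems.json.
type Choice = "pull" | "nothing";

interface HumanVotes {
  humanPullVotes: number;
  humanNothingVotes: number;
}

// Assumption: the human consensus is the majority choice; a tie has none.
function humanMajority(v: HumanVotes): Choice | null {
  if (v.humanPullVotes === v.humanNothingVotes) return null;
  return v.humanPullVotes > v.humanNothingVotes ? "pull" : "nothing";
}

// Fraction of problems with a clear human majority where the model agreed.
function alignmentScore(
  results: { votes: HumanVotes; llmChoice: Choice }[]
): number {
  let counted = 0;
  let agreed = 0;
  for (const r of results) {
    const majority = humanMajority(r.votes);
    if (majority === null) continue; // ties are skipped, not penalized
    counted += 1;
    if (majority === r.llmChoice) agreed += 1;
  }
  return counted === 0 ? 0 : agreed / counted;
}
```

A weighted variant (for example, scoring by how lopsided the human vote was) would be an equally plausible reading; the project's own logic is the authority here.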

🚀 Quick Start

Prerequisites

Node.js 20+
PostgreSQL database
OpenRouter API key
(Optional) ElevenLabs API key for TTS

Installation

# Clone the repository
git clone https://github.com/your-username/TrolleyLLMArena.git
cd TrolleyLLMArena

# Install dependencies
npm install

# Set up environment variables
cp .env.example .env
# Edit .env with your credentials

# Initialize the database
npx prisma db push

# Seed problems (if needed)
npm run seed

# Start development server
npm run dev

Visit http://localhost:3000 to view the leaderboard, or /browse to explore problems.

🔧 Environment Variables

| Variable | Description | Required |
| --- | --- | --- |
| `DATABASE_URL` | PostgreSQL connection string | ✅ |
| `OPENROUTER_API_KEY` | OpenRouter API key for LLM calls | ✅ |
| `NEXTAUTH_SECRET` | NextAuth secret for admin auth | ✅ |
| `ELEVENLABS_API_KEY` | ElevenLabs API key for TTS | Optional |
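
A missing required variable is easier to diagnose at startup than at the first database or LLM call. The sketch below is one way to check for the three required variables listed above; `missingEnvVars` is a hypothetical helper and is not part of the repository.

```typescript
// The three variables marked as required in the table above.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "OPENROUTER_API_KEY",
  "NEXTAUTH_SECRET",
] as const;

// Returns the names of required variables that are unset or blank.
// The environment is passed in so the function is easy to test.
function missingEnvVars(env: Record<string, string | undefined>): string[] {
  return REQUIRED_VARS.filter((name) => (env[name] ?? "").trim() === "");
}
```

In a Node.js entry point this could be called as `missingEnvVars(process.env)`, throwing an error that lists the missing names if the result is not empty.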

📁 Project Structure

├── app/                  # Next.js App Router pages
│   ├── page.tsx         # Leaderboard (home)
│   ├── browse/          # Problem viewer
│   ├── admin/           # Admin dashboard
│   └── api/             # API routes
├── components/          # React components
│   ├── leaderboard/     # Leaderboard components
│   └── trolley/         # Trolley scene components
├── data/
│   └── problems.json    # Problem definitions
├── lib/                 # Utilities
│   ├── prisma.ts        # Database client
│   ├── trolleyIterator.ts # LLM evaluation logic
│   └── rateLimit.ts     # Rate limiting
├── prisma/
│   └── schema.prisma    # Database schema
└── types/               # TypeScript types

🎯 Adding New Problems

Edit data/problems.json to add new trolley problems:

{
  "id": "unique-problem-id",
  "title": "Problem Title",
  "text": "Description of the dilemma...",
  "humanPullVotes": 0,
  "humanNothingVotes": 0,
  "option1": {
    "src": "image-name",
    "kill": 5
  },
  "option2": {
    "src": "other-image",
    "kill": 1
  }
}

Then run the seed script (npm run seed) to sync the new problems with the database.
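
Before seeding, it can help to sanity-check new entries. The sketch below types the JSON shape shown above and flags duplicate ids and invalid vote counts. The `TrolleyProblem` interface and `validateProblems` function are hypothetical names, and the rule that vote counts must be non-negative integers is an assumption, not something the seed script is known to enforce.

```typescript
// Mirrors the fields of one entry in data/problems.json.
interface TrolleyOption {
  src: string;
  kill: number;
}

interface TrolleyProblem {
  id: string;
  title: string;
  text: string;
  humanPullVotes: number;
  humanNothingVotes: number;
  option1: TrolleyOption;
  option2: TrolleyOption;
}

// Returns a list of human-readable problems; an empty list means "looks fine".
function validateProblems(problems: TrolleyProblem[]): string[] {
  const errors: string[] = [];
  const seen = new Set<string>();
  for (const p of problems) {
    if (seen.has(p.id)) errors.push(`duplicate id: ${p.id}`);
    seen.add(p.id);
    // Assumption: vote counts are non-negative whole numbers.
    for (const votes of [p.humanPullVotes, p.humanNothingVotes]) {
      if (!Number.isInteger(votes) || votes < 0) {
        errors.push(`${p.id}: invalid vote count ${votes}`);
      }
    }
  }
  return errors;
}
```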

🤖 Adding New LLM Providers

1. Add the provider to the Provider table, either through the admin dashboard (/admin/companies) or directly in the database.
2. If the provider uses a new API, configure it in code:
- Edit lib/trolleyIterator.ts
- Add the API configuration for the new provider
3. Optional: add a TTS voice by setting voiceId on the Provider to enable ElevenLabs TTS.

Supported models via OpenRouter:

OpenAI (GPT-4, GPT-4o, o1, etc.)
Anthropic (Claude 3.5, Claude 3, etc.)
Google (Gemini Pro, Gemini Flash, etc.)
Meta (Llama 3, etc.)
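
Because all of these models are reached through OpenRouter's OpenAI-compatible chat completions endpoint, switching models mostly means changing the model identifier. The sketch below builds such a request; `buildOpenRouterRequest` and the prompt wording are hypothetical and do not reflect what `lib/trolleyIterator.ts` actually sends.

```typescript
interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds (but does not send) a chat completion request for one dilemma.
function buildOpenRouterRequest(
  apiKey: string,
  model: string, // an OpenRouter model id, e.g. "openai/gpt-4o"
  dilemma: string
): ChatRequest {
  return {
    url: "https://openrouter.ai/api/v1/chat/completions",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model,
        messages: [
          // Hypothetical prompt; the project's real prompt is not shown here.
          { role: "system", content: "Answer with 'pull' or 'nothing', then explain briefly." },
          { role: "user", content: dilemma },
        ],
      }),
    },
  };
}
```

The result can be passed to `fetch(req.url, req.init)`. Keeping request construction separate from the network call makes it straightforward to unit-test without an API key.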

🧪 Testing

# Run tests in watch mode
npm run test

# Run tests once
npm run test:run

📝 Available Scripts

| Script | Description |
| --- | --- |
| `npm run dev` | Start development server |
| `npm run build` | Build for production |
| `npm run start` | Start production server |
| `npm run lint` | Run ESLint |
| `npm run test` | Run Vitest in watch mode |
| `npm run test:run` | Run tests once |

🏗️ Tech Stack

Framework: Next.js 16 (App Router)
Database: PostgreSQL + Prisma ORM
Styling: Tailwind CSS 4
Animation: Framer Motion
Auth: NextAuth.js
AI: OpenRouter (access to OpenAI, Claude, Gemini, etc.)
TTS: ElevenLabs
Testing: Vitest + Testing Library

🙏 Credits

Absurd Trolley Problems by Neal Agarwal - The original source of all trolley problem scenarios and human voting data used in this project. Go play the original!
Built with Next.js, Prisma, and love for ethical AI research.

📄 License

MIT

This project is not affiliated with neal.fun. All trolley problem content is used for educational/research purposes.
