AI Character Engine

Drop-in AI NPCs for any game. The AI Character Engine turns LLM-powered agents into believable characters that make autonomous decisions via tool-calling, build fading memories across three tiers, and develop dynamic closeness to the player. Plug in any game — from tavern sims to space stations — through a single GamePlugin interface.

One-Click Setup

Windows

git clone https://github.com/rookiemann/ai-character-engine.git
cd ai-character-engine
setup.bat

Linux / Mac

git clone https://github.com/rookiemann/ai-character-engine.git
cd ai-character-engine
chmod +x setup.sh && ./setup.sh

Game developer? Start with QUICKSTART.md — add AI NPCs to your game in 15 minutes.

Features

35 subsystems organized into six categories:

Core

Subsystem	Description
Engine	Top-level orchestrator that wires all subsystems together
AgentRunner	Stateless LLM calls: context in, decision out
MemoryManager	3-tier memory: working ring buffer, episodic with fading, LLM-compressed summaries
InferenceService	Provider abstraction with batch concurrency
TickScheduler	Fast tick (active agents) + slow tick (background/dormant + maintenance)
ProximityManager	Closeness 0-100, drives activity tiers and capability unlocks

Social

Subsystem	Description
ChatService	Direct chat with characters (closeness 40+)
DelegationManager	Delegate authority to characters (closeness 60+)
ConversationManager	Multi-agent conversations with turn-taking
GossipManager	Information propagation via talk_to, credibility decay per hop
ReputationManager	Collective knowledge (-100 to +100), witness-based scoring
MoodContagionManager	Ephemeral emotion spreading at locations
HierarchyManager	Factions, ranks, chain-of-command orders, auto-succession

Agent Intelligence

Subsystem	Description
EmotionManager	Emotion tracking with mood state
GoalPlanner	Step-based goal tracking and pursuit
PlayerModeler	Learns player preferences over time
NeedsManager	5 needs (rest, social, sustenance, safety, purpose) that drive initiative
RoutineManager	Phase-based daily activities tied to game time
InitiativeChecker	Event-driven decision triggers
PerceptionManager	Location tracking, spatial event filtering, nearby awareness

Memory

Subsystem	Description
3-Tier Memory	Working (recent) -> Episodic (importance-scored, fading) -> Summary (compressed)
SemanticRetriever	Embedding-based memory search when SQL retrieval returns sparse results
MemoryConsolidator	Clusters similar memories (tag-based + optional semantic)
EmbeddingService	Optional embedding provider for semantic features

Infrastructure

Subsystem	Description
ToolRegistry	Game registers tools, engine routes to LLM function calling
ToolValidator	Type coercion, range clamping, size limits
FailoverChain	Circuit breaker with exponential cooldown (5s-120s)
PriorityQueue	Priority-based agent scheduling
Middleware	Pipeline for request/response processing
StatePersistence	Save/load state, snapshots, import/export
HttpServer	30 REST endpoints, native Node.js (no frameworks)
MetricsCollector	Sliding-window latency percentiles, tool/action distribution
PromptExperiment	A/B testing with weighted variant assignment
MultiPlayer	Per-player closeness and character relationships
ErrorRecovery	Tool hallucination recovery, context-size retry, graceful shutdown
StreamingChat	SSE-based streaming for chat responses

Lifecycle

Subsystem	Description
LifecycleManager	Character death, cleanup, and auto-respawn

Quick Start

1. Install

Use the setup script (recommended) or install manually:

git clone https://github.com/rookiemann/ai-character-engine.git
cd ai-character-engine
npm install
npm run build
cp engine.config.example.json engine.config.json

2. Configure a Provider

Edit engine.config.json to set your inference provider. The default is vLLM (highest throughput):

pip install vllm
python -m vllm.entrypoints.openai.api_server --model Salesforce/xLAM-2-1b-fc-r --port 8100

Or for the easiest start, use Ollama:

# Install Ollama from https://ollama.com
ollama pull qwen2.5:7b

Then change engine.config.json inference type to "ollama" (see Minimal Config Examples below).

3. Run the Starter Demo

npm run demo:starter

This starts "Quiet Village" — 2 characters (farmer + blacksmith) making autonomous decisions. The code in examples/my-first-plugin/ is heavily commented as a learning template.

4. Run the Full Sample

npm run demo:sample

"Tavern Tales" — a medieval tavern with 4 characters (barkeep, merchant, bard, guard) who chat with the player and react to events.

5. Try the HTTP API

npm run demo:api

Then in another terminal:

# Health check
curl http://localhost:3000/api/health

# List characters
curl http://localhost:3000/api/characters

# Chat with a character
curl -X POST http://localhost:3000/api/chat/char-0 \
  -H 'Content-Type: application/json' \
  -d '{"message":"Hello!"}'

# Inject a game event
curl -X POST http://localhost:3000/api/events \
  -H 'Content-Type: application/json' \
  -d '{"event":{"type":"combat","source":"bandits","data":{"description":"Bandits attack!"},"importance":8,"timestamp":0}}'

Provider Setup

The engine supports 6 LLM providers. Local providers are strongly recommended — the engine makes hundreds of LLM calls per minute.

Provider	Type	Best For	Setup
vLLM	Local	Recommended — highest throughput (11+ dec/s)	Python + CUDA GPU
Ollama	Local	Easy start — zero config, 2 minutes	`ollama pull qwen2.5:7b`
LM Studio	Local	GUI-friendly exploration	Load model in GUI
OpenRouter	Cloud	Multi-model access (no GPU needed)	API key
OpenAI	Cloud	GPT models	API key
Anthropic	Cloud	Claude models	API key

Cost warning: Cloud providers (OpenRouter, OpenAI, Anthropic) will incur significant costs because the engine makes continuous LLM calls for every active character.

See docs/provider-setup.md for detailed setup instructions for each provider.

Minimal Config Examples

vLLM (recommended — highest throughput):

{
  "inference": {
    "type": "vllm",
    "baseUrl": "http://127.0.0.1:8100/v1",
    "models": { "heavy": "default", "mid": "default", "light": "default" },
    "maxConcurrency": 64,
    "timeoutMs": 60000
  }
}

Ollama (easy start):

{
  "inference": {
    "type": "ollama",
    "models": { "heavy": "qwen2.5:7b", "mid": "qwen2.5:7b", "light": "qwen2.5:1.5b" }
  }
}

Architecture

                           +-----------------------+
                           |       GamePlugin      |
                           | (your game implements)|
                           +-----------+-----------+
                                       |
                                       v
+----------------------------------------------------------------------+
|                              ENGINE                                   |
|                                                                      |
|  +--------------+    +-----------------+    +-------------------+    |
|  | TickScheduler|--->| AgentRunner     |--->| InferenceService  |    |
|  | fast + slow  |    | context->decision|   | 6 providers       |    |
|  +--------------+    +-----------------+    +-------------------+    |
|         |                    |                                        |
|         v                    v                                        |
|  +--------------+    +-----------------+    +-------------------+    |
|  | Proximity    |    | ContextAssembler|    | FailoverChain     |    |
|  | Manager      |    | + PromptBuilder |    | circuit breaker   |    |
|  +--------------+    +-----------------+    +-------------------+    |
|         |                    |                                        |
|         v                    v                                        |
|  +--------------+    +-----------------+    +-------------------+    |
|  | Activity     |    | MemoryManager   |    | ToolRegistry      |    |
|  | Tiers        |    | 3-tier + fading |    | + ToolValidator   |    |
|  +--------------+    +-----------------+    +-------------------+    |
|                              |                                        |
|  Social:                     v              Intelligence:            |
|  ChatService          MemoryConsolidator    EmotionManager           |
|  DelegationMgr        SemanticRetriever     GoalPlanner              |
|  ConversationMgr      EmbeddingService      PlayerModeler            |
|  GossipMgr                                  NeedsManager             |
|  ReputationMgr        Infra:                RoutineManager           |
|  MoodContagionMgr     HttpServer (30 API)   InitiativeChecker        |
|  HierarchyMgr         MetricsCollector      PerceptionManager        |
|                        StatePersistence                              |
|  Lifecycle:            PromptExperiment                              |
|  LifecycleManager      Middleware                                    |
+----------------------------------------------------------------------+

Data flow for each decision:

GameEvent
  -> PerceptionManager (spatial filtering)
  -> ContextAssembler (gather memories, state, proprioception)
  -> PromptBuilder (budget-aware prompt construction)
  -> AgentRunner (LLM call via InferenceService)
  -> ToolValidator (validate + coerce tool arguments)
  -> ToolRegistry (execute game-registered tool)
  -> MemoryManager (store result as episodic memory)

Tick lifecycle:

Fast tick (default 2s): Processes active-tier agents. Each gets a full decision cycle with up to 6 tools.
Slow tick (default 30s): Processes background and dormant agents with reduced token budgets. Also runs maintenance: memory decay, consolidation, summary regeneration.

GamePlugin Interface

Games integrate by implementing the GamePlugin interface:

import { Engine, loadConfigFile } from 'ai-character-engine';
import type { GamePlugin } from 'ai-character-engine';

const myPlugin: GamePlugin = {
  id: 'my-game',
  name: 'My Game',

  // Required: Define character archetypes
  getArchetypes() {
    return [{
      id: 'warrior',
      name: 'Warrior',
      description: 'A brave fighter',
      defaultIdentity: {
        personality: 'Bold and loyal',
        backstory: 'Trained since youth',
        goals: ['Protect the village'],
        traits: ['brave', 'strong'],
      },
    }];
  },

  // Required: Define tools characters can use
  getTools() {
    return [{
      definition: {
        name: 'attack',
        description: 'Attack a target',
        parameters: [
          { name: 'target', type: 'string', description: 'Who to attack', required: true },
        ],
      },
      executor: (args) => ({
        success: true,
        result: `Attacked ${args.target}!`,
      }),
    }];
  },

  // Required: Current game state snapshot
  getGameState() {
    return {
      worldTime: Date.now(),
      location: 'Village',
      nearbyEntities: ['Player', 'Merchant'],
      recentEvents: ['Morning has broken'],
    };
  },

  // Required: Character self-knowledge
  getProprioception(characterId) {
    return {
      currentAction: 'idle',
      location: 'village_square',
      inventory: ['sword', 'shield'],
      status: ['healthy'],
      energy: 0.8,
    };
  },

  // Optional: Initial characters to spawn
  getInitialCharacters() {
    return [{
      id: 'guard-1',
      name: 'Theron',
      archetype: 'warrior',
      identity: {
        personality: 'Stern but fair',
        backstory: 'Captain of the guard',
        goals: ['Keep the peace'],
        traits: ['loyal', 'vigilant'],
      },
      initialCloseness: 30,
    }];
  },

  // Optional: World rules for system prompts
  getWorldRules() {
    return 'Medieval fantasy village. No modern technology. Gold is currency.';
  },
};

// Start the engine
const config = loadConfigFile(); // or inline config
const engine = new Engine(config);
await engine.loadPlugin(myPlugin);
engine.start();

See docs/game-plugin-guide.md for the complete interface reference with all 25+ methods.

Configuration Reference

The full configuration is in engine.config.example.json. Key sections:

{
  // Database
  "database": {
    "path": "./data/engine.db"    // SQLite file path, or ":memory:" for in-memory
  },

  // Inference provider
  "inference": {
    "type": "ollama",             // ollama | vllm | lmstudio | openrouter | openai | anthropic
    "baseUrl": "...",             // Provider URL (auto-set for ollama)
    "apiKey": "...",              // For cloud providers
    "models": {
      "heavy": "qwen2.5:7b",     // Complex decisions
      "mid": "qwen2.5:7b",       // Standard decisions
      "light": "qwen2.5:1.5b"    // Simple decisions
    },
    "maxConcurrency": 10,         // Parallel requests (64 for vLLM)
    "timeoutMs": 30000,           // Request timeout
    "maxRetries": 2               // Retry count
  },

  // Optional: Embeddings for semantic memory
  "embedding": {
    "type": "ollama",
    "models": { "heavy": "nomic-embed-text", "mid": "nomic-embed-text", "light": "nomic-embed-text" },
    "maxConcurrency": 4,
    "timeoutMs": 10000
  },

  // Proximity / closeness
  "proximity": {
    "decayRatePerTick": 0.1,      // Closeness decay per slow tick
    "interactionBoost": 4,         // Boost on tool interaction
    "chatBoost": 2,                // Boost on chat message
    "promotionThreshold": 60,      // Active tier threshold
    "backgroundThreshold": 20,     // Background tier threshold
    "dormantThreshold": 5,         // Dormant tier threshold
    "chatMinCloseness": 40,        // Min closeness to chat
    "delegateMinCloseness": 60     // Min closeness to delegate
  },

  // Tick scheduling
  "tick": {
    "fastTickMs": 2000,            // Active agent processing interval
    "slowTickMs": 30000,           // Background + maintenance interval
    "batchSize": 10                // Agents per batch
  },

  // Memory
  "memory": {
    "workingMemorySize": 5,            // Ring buffer size
    "episodicRetrievalCount": 5,       // Memories to retrieve per decision
    "importanceThreshold": 3,          // Min importance to store
    "decayInterval": 10,               // Ticks between decay passes
    "pruneThreshold": 0.5,             // Remove memories below this score
    "summaryRegenerateInterval": 50    // Decisions between summary regen
  },

  // Logging
  "logging": {
    "level": "info",               // trace | debug | info | warn | error
    "pretty": true                 // Pretty-print logs (disable in production)
  }
}

Examples

Example	Command	Description
My First Plugin	`npm run demo:starter`	2 characters in a village — heavily commented learning template
Tavern Tales	`npm run demo:sample`	4 characters in a medieval tavern, demonstrates chat, events, and tool use
Game Simulations	`npm run demo:sim`	6 game genres (pirate, space, farm, detective, survival, academy) with 32 characters each against vLLM
Diagnostics	`npm run demo:diagnose`	Raw LLM output analysis — categorizes tool call failures
Rich Context	`npm run demo:rich`	Compares bare vs rich game state impact on decision quality
API Server	`npm run demo:api`	HTTP API server on port 3000, integrate from any language

All examples support loadConfigFile() — drop an engine.config.json in the project root and every example picks it up. Without one, each falls back to sensible inline defaults.

Performance

Benchmarked with the xLAM-2-1B model (Salesforce) on an RTX 3090:

Metric	Value
Peak throughput	11.91 decisions/sec
Token throughput	16,350 tokens/sec
Characters	32 simultaneous
Concurrency	64 parallel requests
Latency (p50)	4.6s
Errors	0
Tool types used	5/6
Tool balance	No tool above 23%

10 models tested across 7 configurations. Full results in examples/stress-test/results/.

Model recommendations:

xLAM-2-1B (Salesforce) — Best balance of speed and tool-calling accuracy
Qwen2.5-1.5B — Fastest raw throughput, but 82% dialogue (low tool usage)
7B+ models — Better reasoning, but 3-5x higher latency

Troubleshooting

Provider not running

Symptom	Fix
`ECONNREFUSED 127.0.0.1:8100`	Start vLLM: `python -m vllm.entrypoints.openai.api_server --model <path> --port 8100`
`ECONNREFUSED 127.0.0.1:11434`	Start Ollama: `ollama serve` (or it starts automatically on first `ollama pull`)
`ECONNREFUSED 127.0.0.1:1234`	Open LM Studio, load a model, and click "Start Server"
Health check says `inference: false`	Your provider URL or port doesn't match `engine.config.json`

better-sqlite3 build failure (Windows)

The better-sqlite3 package requires native compilation. If npm install fails:

Install Visual Studio Build Tools
Select the "Desktop development with C++" workload
Run npm install again

Model not found

Provider	Fix
vLLM	The model path in `--model` must exist. Use HuggingFace ID (`Salesforce/xLAM-2-1b-fc-r`) or local path.
Ollama	Run `ollama pull <model-name>` first. List available: `ollama list`
LM Studio	Load the model in the GUI before starting the server

Timeout errors

Increase timeoutMs in engine.config.json (default 60000 for vLLM, 30000 for Ollama)
Use a smaller model — 1-2B parameter models are 3-5x faster than 7B+
Reduce batchSize in tick config if your GPU is overloaded

Characters only talk, never use tools

Model choice matters — xLAM-2-1b-fc-r (Salesforce) has the best tool-calling accuracy for small models
Tool descriptions should be clear and specific — vague descriptions confuse small models
2-6 tools is the sweet spot — too many tools overwhelm small models

Common config mistakes

Mistake	Fix
Wrong `type` for provider	Must be exactly: `vllm`, `ollama`, `lmstudio`, `openrouter`, `openai`, or `anthropic`
Missing `baseUrl` for vLLM	Add `"baseUrl": "http://127.0.0.1:8100/v1"` (Ollama auto-detects)
`models` set to `"default"` with Ollama	Ollama needs real model names: `"qwen2.5:7b"`. Only vLLM/LM Studio use `"default"`
`maxConcurrency` too high for Ollama	Ollama processes sequentially — keep at 10 or lower

Windows-specific issues

vLLM on Windows: No official pip wheel — use the pre-built Windows environment or see docs/vllm-windows.md
vLLM requires --enforce-eager — CUDA graphs (Triton) don't work on Windows
GPU memory: max gpu-memory-utilization is ~0.92 (display driver reserves ~80MB)
Don't use CUDA_DEVICE_ORDER=PCI_BUS_ID — it can flip GPU indices on some systems

Documentation

Quick Start Guide — Add AI NPCs to your game in 15 minutes
Architecture — System design, subsystem graph, tick lifecycle, data flow
Game Plugin Guide — Complete plugin interface reference and tutorial
Provider Setup — Detailed setup for all 6 providers + embeddings
Memory System — 3-tier memory, fading, retrieval, consolidation
Proximity System — Closeness, activity tiers, capability unlocks
vLLM on Windows — Building and running vLLM on Windows with CUDA
API Reference — All 30 HTTP API endpoints with schemas and examples
Contributing — Development setup, code style, PR guidelines

Contributing

See CONTRIBUTING.md for development setup, project structure, code style, and how to add new subsystems or providers.

# Development workflow
npm install
npm run build       # Compile TypeScript
npm test            # Run 415 unit tests
npm run test:e2e    # Run E2E tests (requires vLLM)
npm run lint        # Type-check without emitting

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
docs		docs
examples		examples
src		src
tests		tests
.gitattributes		.gitattributes
.gitignore		.gitignore
.npmignore		.npmignore
CHANGELOG.md		CHANGELOG.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
QUICKSTART.md		QUICKSTART.md
README.md		README.md
drizzle.config.ts		drizzle.config.ts
engine.config.example.json		engine.config.example.json
package-lock.json		package-lock.json
package.json		package.json
setup.bat		setup.bat
setup.sh		setup.sh
tsconfig.json		tsconfig.json
vitest.config.ts		vitest.config.ts
vitest.e2e.config.ts		vitest.e2e.config.ts

Folders and files

Latest commit

History

Repository files navigation

AI Character Engine

One-Click Setup

Windows

Linux / Mac

Features

Core

Social

Agent Intelligence

Memory

Infrastructure

Lifecycle

Quick Start

1. Install

2. Configure a Provider

3. Run the Starter Demo

4. Run the Full Sample

5. Try the HTTP API

Provider Setup

Minimal Config Examples

Architecture

GamePlugin Interface

Configuration Reference

Examples

Performance

Troubleshooting

Provider not running

better-sqlite3 build failure (Windows)

Model not found

Timeout errors

Characters only talk, never use tools

Common config mistakes

Windows-specific issues

Documentation

Contributing

License

About

Topics

Resources

License

Contributing

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages