Proposal: HIPPOCAMPUS Pre-Computed Concept Index for Retrieval Optimization #17
Problem
Runtime vector search for memory retrieval is expensive and query-dependent. Every recall requires embedding the query, scanning the vector space, and ranking results — at inference time. As memory stores grow, this cost scales linearly.
Proposed Solution: Pre-Computed Concept Index
Instead of searching at inference time, pre-compute a concept-to-memory mapping during offline consolidation (the equivalent of "sleep"). At retrieval time, anchor words detected in the query map directly to pre-indexed memory clusters — O(1) dictionary lookup instead of runtime kNN.
How it works
- Build phase (offline, e.g. nightly consolidation):
  - Define an anchor vocabulary (concepts the agent frequently reasons about)
  - For each anchor, embed it and find the k-nearest memory chunks
  - Store the mapping: `concept → [chunk_ids]`
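The build phase above can be sketched as follows. This is a minimal, self-contained illustration, not the project's actual code: `embedText` is a toy stand-in for a real embedding call (the proposal uses Gemini embeddings), and the names `Chunk` and `buildConceptIndex` are hypothetical.

```typescript
type Vec = number[];

// Toy deterministic "embedding": a character-code histogram, normalized.
// A real build would call an embedding API here instead.
function embedText(text: string): Vec {
  const v = new Array(8).fill(0);
  for (let i = 0; i < text.length; i++) v[text.charCodeAt(i) % 8] += 1;
  const norm = Math.hypot(...v) || 1;
  return v.map((x) => x / norm);
}

function cosine(a: Vec, b: Vec): number {
  return a.reduce((s, x, i) => s + x * b[i], 0);
}

interface Chunk { id: string; text: string; }

// Build phase: for each anchor, rank every memory chunk by similarity
// and keep the top-k ids, producing the concept → [chunk_ids] mapping.
function buildConceptIndex(
  anchors: string[],
  chunks: Chunk[],
  k: number,
): Map<string, string[]> {
  const chunkVecs = chunks.map((c) => ({ id: c.id, vec: embedText(c.text) }));
  const index = new Map<string, string[]>();
  for (const anchor of anchors) {
    const q = embedText(anchor);
    const topK = chunkVecs
      .map(({ id, vec }) => ({ id, score: cosine(q, vec) }))
      .sort((a, b) => b.score - a.score)
      .slice(0, k)
      .map((r) => r.id);
    index.set(anchor, topK);
  }
  return index;
}
```

The expensive part (one embedding call per anchor plus a scan over all chunks) happens entirely offline, which is the point: the nightly job pays the cost once so inference doesn't.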
- Retrieval phase (inference time):
  - Detect anchor words in the current query/context
  - Look up the pre-computed chunk lists — no embedding, no search
  - Fall through to traditional vector search only for novel/unseen concepts
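The retrieval phase can be sketched like this. Anchor detection here is naive token matching, and `vectorSearchFallback` is a placeholder for the existing runtime semantic search; both are illustrative assumptions, not the project's API.

```typescript
// Inference-time retrieval: match query tokens against the pre-built
// concept index; only unmatched (novel) queries pay for runtime kNN.
function retrieve(
  query: string,
  index: Map<string, string[]>,
  vectorSearchFallback: (q: string) => string[],
): string[] {
  const tokens = query.toLowerCase().split(/\W+/);
  const hits = new Set<string>();
  let matched = false;
  for (const t of tokens) {
    const chunkIds = index.get(t); // O(1) lookup, no embedding call
    if (chunkIds) {
      matched = true;
      for (const id of chunkIds) hits.add(id);
    }
  }
  // No anchor detected: novel concept, fall through to runtime search.
  if (!matched) return vectorSearchFallback(query);
  return Array.from(hits);
}
```

Note the asymmetry: hits on indexed anchors cost a dictionary lookup, while misses cost exactly what retrieval costs today, so the fallback path is never worse than the status quo.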
Two-tier architecture
| Tier | Built | Indexes | Purpose |
|---|---|---|---|
| Episodic | Real-time (on memory store) | Raw events with temporal context | Recent, unprocessed recall |
| Semantic | Nightly (post-consolidation) | Abstracted knowledge | Stable, high-precision recall |
Retrieval checks the semantic tier first (higher precision), then falls through to the episodic tier (higher recency). The staleness gap in the semantic tier is intentional: you can't abstract an event before reflecting on it. This mirrors how hippocampal replay during slow-wave sleep drives cortical consolidation.
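The semantic-first fallthrough can be sketched with the tiers modeled as plain maps; in the proposal the semantic tier would be the nightly concept index and the episodic tier the real-time event store. The function name and return shape are illustrative.

```typescript
// Two-tier recall: prefer the nightly-built semantic tier, fall through
// to the real-time episodic tier, report which tier answered.
function twoTierRecall(
  concept: string,
  semantic: Map<string, string[]>, // abstracted knowledge, rebuilt nightly
  episodic: Map<string, string[]>, // raw recent events, indexed on store
): { tier: "semantic" | "episodic" | "none"; chunkIds: string[] } {
  // Semantic first: higher precision, but stale for events that occurred
  // after the last consolidation pass.
  const sem = semantic.get(concept);
  if (sem && sem.length > 0) return { tier: "semantic", chunkIds: sem };
  // Episodic next: covers recent, not-yet-consolidated material.
  const epi = episodic.get(concept);
  if (epi && epi.length > 0) return { tier: "episodic", chunkIds: epi };
  return { tier: "none", chunkIds: [] };
}
```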
What we've built so far
- A static concept index (500 anchors → 9,524 chunks) built with Gemini embeddings
- A CLI for querying it
- A nightly cron to rebuild it
- Source code: hippocampus-rebuild.ts, hippocampus-hook.ts
Honest status: It's not yet wired into our live retrieval path. We still use runtime semantic search (Gemini embeddings via memory_search). The index exists and rebuilds nightly, but we haven't replaced the hot path with it yet. This is a design proposal, not a battle-tested system.
Research paper
We've written a paper exploring the neuroscience analogy and the math behind the approach:
The core insight from neuroscience: the human hippocampus doesn't store memories — it indexes them. Hippocampal damage prevents forming new memories not because storage fails, but because indexing breaks.
Why this could matter for Hexis
Hexis already has a rich memory taxonomy and Postgres + pgvector for retrieval. This concept index could sit as an optimization layer on top:
- Pre-compute concept mappings during Hexis's consolidation/heartbeat cycles
- Reduce inference-time vector search calls for frequently accessed memory types
- Backend-agnostic — works with Postgres/pgvector just as well as SQLite
What we'd like
Your feedback on whether this approach has merit for Hexis's retrieval path. We're genuinely looking to learn, not to sell — if the idea doesn't fit, that's useful feedback too.