🧠 The production-ready memory system for intelligent agents. A complete solution for memory management, from extraction and vector search to automated optimization, with a REST API, MCP, CLI, and insights dashboard out-of-the-box.

sopaco/cortex-mem

Cortex Memory

🧠 The AI-native memory framework for building intelligent, context-aware applications 🧠

Built with Rust, Cortex Memory gives your AI agents a high-performance, persistent, and intelligent long-term memory.

👋 What is Cortex Memory?

Cortex Memory is a complete, production-ready framework for giving your AI applications a long-term memory. It moves beyond simple chat history, providing an intelligent memory system that automatically extracts, organizes, and optimizes information to make your AI agents smarter and more personalized.

Powered by Rust and LLMs, Cortex Memory analyzes conversations, deduces facts, and stores them in a structured, searchable knowledge base. This allows your agent to remember user preferences, past interactions, and key details, leading to more natural and context-aware conversations.

Transform your stateless AI into an intelligent, context-aware partner.

Before Cortex Memory: Stateless AI

  • Forgets user details after every session
  • Lacks personalization and context
  • Repeats questions and suggestions
  • Limited to short-term conversation history
  • Feels robotic and impersonal

After Cortex Memory: Intelligent AI

  • Remembers user preferences and history
  • Provides deeply personalized interactions
  • Learns and adapts over time
  • Maintains context across multiple conversations
  • Builds rapport and feels like a true assistant

😺 Why Use Cortex Memory?

  • Build Smarter Agents: Give your AI the ability to learn and remember, leading to more intelligent and useful interactions.
  • Enhance User Experience: Create personalized, context-aware experiences that delight users and build long-term engagement.
  • Automated Memory Management: Let the system handle the complexity of extracting, storing, and optimizing memories. No more manual data management.
  • High Performance & Scalability: Built with Rust, Cortex Memory is fast, memory-safe, and ready to scale with your application.
  • Flexible & Extensible: Integrate with your existing systems via a REST API, CLI, or direct library usage.
  • Insightful Analytics: Use the provided web dashboard to visualize and understand your agent's memory.

🌟 For:

  • Developers building LLM-powered chatbots and agents.
  • Teams creating personalized AI assistants.
  • Open source projects that need a memory backbone.
  • Anyone who wants to build truly intelligent AI applications!

❤️ Like Cortex Memory? Star it 🌟 or Sponsor Me! ❤️

🌠 Features & Capabilities

  • Intelligent Fact Extraction: Automatically extracts key facts and insights from unstructured text using LLMs.
  • Memory Classification & Deduplication: Organizes memories and removes redundant information to keep the knowledge base clean and efficient.
  • Automated Memory Optimization: Periodically reviews, consolidates, and refines memories to improve relevance and reduce cost.
  • Vector-Based Semantic Search: Finds the most relevant memories using high-performance vector similarity search.
  • Multi-Modal Access: Interact with the memory system through a REST API, a command-line interface (CLI), or as a library in your Rust application.
  • Agent Framework Integration: Provides tools and adapters to easily plug into popular AI agent frameworks.
  • Web Dashboard: A dedicated web UI (cortex-mem-insights) for monitoring, analyzing, and visualizing the agent's memory.

🌐 The Cortex Memory Ecosystem

Cortex Memory is a modular system composed of several crates, each with a specific purpose. This design provides flexibility and separation of concerns.

```mermaid
graph TD
    subgraph "User Interfaces"
        CLI["cortex-mem-cli"]
        Insights["cortex-mem-insights"]
    end

    subgraph "APIs & Integrations"
        Service["cortex-mem-service"]
        MCP["cortex-mem-mcp"]
        Rig["cortex-mem-rig"]
    end

    subgraph "Core Engine"
        Core["cortex-mem-core"]
    end

    subgraph "External Services"
        VectorDB[("Vector Database")]
        LLM[("LLM Provider")]
    end

    %% Define Dependencies
    Insights --> Service

    CLI --> Core
    Service --> Core
    MCP --> Core
    Rig --> Core

    Core --> VectorDB
    Core --> LLM
```
  • cortex-mem-core: The heart of the system. It contains all the business logic for memory management, including extraction, optimization, and search.
  • cortex-mem-service: Exposes the core logic via a high-performance REST API, making it accessible to any programming language or system.
  • cortex-mem-cli: A command-line tool for developers and administrators to directly interact with the memory store for testing and management.
  • cortex-mem-insights: A web-based management tool that provides analytics and visualization of the agent's memory by consuming the cortex-mem-service API.
  • cortex-mem-mcp / cortex-mem-rig: Specialized adapter crates to integrate Cortex Memory as a "tool" within various AI agent frameworks.
  • cortex-mem-config: Shared configuration and type definitions used across the ecosystem.

πŸ–ΌοΈ Observability Tools​ Integration

Cortex Memory includes a powerful web-based dashboard (cortex-mem-insights) that provides real-time monitoring, analytics and management capabilities. Here's what you can expect to see:

Cortex Memory Dashboard

Interactive Dashboard: Get an overview of memory usage, system health, and activity statistics

  • View and manage individual memory records
  • Analyze and optimize memory quality
  • Monitor memory performance and activity
  • Detailed insights and trends over time

These visual tools help you understand how Cortex Memory is performing and how your AI agent's memory is evolving over time.

🌟 Community Showcase: Cortex TARS

Meet Cortex TARS, a production-ready, AI-native TUI (Terminal User Interface) application that demonstrates the true power of Cortex Memory. Built as a "second brain" companion, Cortex TARS brings auditory presence to your AI experience: it can truly hear and remember your voice in the real world, and it showcases how persistent memory transforms AI interactions from fleeting chats into lasting, intelligent partnerships.

What Makes Cortex TARS Special?

Cortex TARS is more than just a chatbot. It's a comprehensive AI assistant platform that leverages Cortex Memory's advanced capabilities:

🎭 Multi-Agent Management

Create and manage multiple AI personas, each with distinct personalities, system prompts, and specialized knowledge areas. Whether you need a coding assistant, a creative writing partner, or a productivity coach, Cortex TARS lets you run them all simultaneously with complete separation.

💾 Persistent Role Memory

Every agent maintains its own long-term memory, learning from interactions over time. Your coding assistant remembers your coding style and preferences; your writing coach adapts to your voice and goals. No more repeating yourself: each agent grows smarter with every conversation.

🔒 Memory Isolation

Advanced memory architecture ensures complete isolation between agents and users. Each agent's knowledge base is separate, preventing cross-contamination while enabling personalized experiences across different contexts and use cases.

🎤 Real-Time Audio-to-Memory (The Game Changer)

This is where Cortex TARS truly shines. With real-time device audio capture, Cortex TARS can listen to your conversations, meetings, or lectures and automatically convert them into structured, searchable memories. Imagine attending a meeting while Cortex TARS silently captures key insights, decisions, and action items, all stored and ready for instant retrieval later. No more frantic note-taking or forgotten details!

Why Cortex TARS Matters

Cortex TARS isn't just an example; it's a fully functional application that demonstrates:

  • Real-world production readiness: Built with Rust, it's fast, reliable, and memory-safe
  • Seamless Cortex Memory integration: Shows best practices for leveraging the memory framework
  • Practical AI workflows: From multi-agent conversations to audio capture and memory extraction
  • User-centric design: Beautiful TUI interface with intuitive controls and rich features

Explore Cortex TARS

Ready to see Cortex Memory in action? Dive into the Cortex TARS project:

```shell
cd examples/cortex-mem-tars
cargo build --release
cargo run --release
```

Check out the Cortex TARS README for detailed setup instructions, configuration guides, and usage examples.

Cortex TARS proves that Cortex Memory isn't just a framework; it's the foundation for building intelligent, memory-aware applications that truly understand and remember.

πŸ† Benchmark

Cortex Memory has been rigorously evaluated against LangMem using the LOCOMO dataset (50 conversations, 150 questions) through a standardized memory system evaluation framework. The results demonstrate Cortex Memory's superior performance across multiple dimensions.

Performance Comparison

Cortex Memory vs LangMem Benchmark

Overall Performance: Cortex Memory significantly outperforms LangMem across all key metrics

Key Metrics

| Metric | Cortex Memory | LangMem | Improvement |
|--------|---------------|---------|-------------|
| Recall@1 | 93.33% | 26.32% | +67.02pp |
| Recall@3 | 94.00% | 50.00% | +44.00pp |
| Recall@5 | 94.67% | 55.26% | +39.40pp |
| Recall@10 | 94.67% | 63.16% | +31.51pp |
| Precision@1 | 93.33% | 26.32% | +67.02pp |
| MRR | 93.72% | 38.83% | +54.90pp |
| NDCG@5 | 80.73% | 18.72% | +62.01pp |
| NDCG@10 | 79.41% | 16.83% | +62.58pp |

Detailed Results

  • Cortex Memory Evaluation: Excellent retrieval performance with 93.33% Recall@1 and 93.72% MRR
  • LangMem Evaluation: Modest performance with 26.32% Recall@1 and 38.83% MRR

Key Findings

  1. Significantly Improved Retrieval Accuracy: Cortex Memory achieves 93.33% Recall@1, a 67.02 percentage point improvement over LangMem's 26.32%. This indicates Cortex is far superior at retrieving relevant memories on the first attempt.

  2. Clear Ranking Quality Advantage: Cortex Memory's MRR of 93.72% vs LangMem's 38.83% shows it not only retrieves accurately but also ranks relevant memories higher in the result list.

  3. Comprehensive Performance Leadership: Across all metrics, and especially NDCG@5 (80.73% vs 18.72%), Cortex Memory demonstrates consistent, significant advantages in retrieval quality, ranking accuracy, and overall performance.

  4. Technical Advantages: Cortex Memory's performance is attributed to:

    • Efficient Rust-based implementation
    • Powerful retrieval capabilities of Qdrant vector database
    • Optimized memory management strategies

Evaluation Framework

The benchmark uses a professional memory system evaluation framework located in examples/lomoco-evaluation, which includes:

  • Professional Metrics: Recall@K, Precision@K, MRR, NDCG, and answer quality metrics
  • Enhanced Dataset: 50 conversations with 150 questions covering various scenarios
  • Statistical Analysis: 95% confidence intervals, standard deviation, and category-based statistics
  • Multi-System Support: Supports comparison between Cortex Memory, LangMem, and Simple RAG baselines

For more details on running the evaluation, see the lomoco-evaluation README.
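For intuition, the ranking metrics reported above can be computed as follows. This is a generic, self-contained sketch of Recall@K, MRR, and binary-relevance NDCG@K over ranked result lists; it is not the project's actual evaluation code, and the toy data is invented for illustration:

```python
import math

def recall_at_k(ranked, relevant, k):
    """Fraction of queries whose top-k results contain a relevant item."""
    hits = sum(1 for r, rel in zip(ranked, relevant) if set(r[:k]) & rel)
    return hits / len(ranked)

def mrr(ranked, relevant):
    """Mean reciprocal rank of the first relevant item per query."""
    total = 0.0
    for r, rel in zip(ranked, relevant):
        for i, doc in enumerate(r, start=1):
            if doc in rel:
                total += 1.0 / i
                break
    return total / len(ranked)

def ndcg_at_k(ranked, relevant, k):
    """Binary-relevance NDCG@k averaged over queries."""
    total = 0.0
    for r, rel in zip(ranked, relevant):
        dcg = sum(1.0 / math.log2(i + 1)
                  for i, doc in enumerate(r[:k], start=1) if doc in rel)
        ideal = sum(1.0 / math.log2(i + 1)
                    for i in range(1, min(len(rel), k) + 1))
        total += dcg / ideal if ideal else 0.0
    return total / len(ranked)

# Two toy queries: ranked result IDs and the set of relevant IDs per query.
ranked = [["m1", "m2", "m3"], ["m4", "m5", "m6"]]
relevant = [{"m1"}, {"m5"}]

print(recall_at_k(ranked, relevant, 1))  # 0.5: only query 1 hits at rank 1
print(mrr(ranked, relevant))             # (1/1 + 1/2) / 2 = 0.75
```

A Recall@1 of 93.33% therefore means that for roughly 14 out of every 15 questions, the single top-ranked memory was already a relevant one.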

🧠 How It Works

Cortex Memory uses a sophisticated pipeline to process and manage memories, orchestrated by the MemoryManager in cortex-mem-core.

```mermaid
sequenceDiagram
    participant App as Application
    participant Service as cortex-mem-service
    participant Manager as MemoryManager (Core)
    participant Extractor as Fact Extractor (LLM)
    participant VectorStore as Vector Database
    participant Optimizer as Optimizer (LLM)

    App->>Service: Add new text (e.g., chat log)
    Service->>Manager: add_memory(text)
    Manager->>Extractor: Extract facts from text
    Extractor-->>Manager: Return structured facts
    Manager->>VectorStore: Store new facts as vectors

    loop Periodically
        Manager->>Optimizer: Start optimization plan
        Optimizer->>VectorStore: Fetch related memories
        Optimizer->>Optimizer: Consolidate & refine memories
        Optimizer->>VectorStore: Update/archive old memories
    end

    App->>Service: Search for relevant info
    Service->>Manager: search(query)
    Manager->>VectorStore: Find similar vectors
    VectorStore-->>Manager: Return relevant facts
    Manager-->>Service: Return results
    Service-->>App: Return relevant memories
```
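The extract-store-search flow can be made concrete with a toy, in-memory version of the pipeline. Everything here is an illustrative stand-in: `extract_facts` replaces the LLM extractor, the hash-based `embed` replaces a real embedding model, and `ToyMemoryManager` replaces the Qdrant-backed store. None of this is the actual cortex-mem-core API:

```python
import hashlib
import math

def embed(text: str, dim: int = 8) -> list[float]:
    """Stand-in embedding: hash character trigrams into a small vector.
    The real system calls an embedding API instead."""
    vec = [0.0] * dim
    for i in range(len(text) - 2):
        h = int(hashlib.md5(text[i:i + 3].encode()).hexdigest(), 16)
        vec[h % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]

def extract_facts(text: str) -> list[str]:
    """Stand-in fact extractor: one 'fact' per sentence.
    The real system asks an LLM to deduce structured facts."""
    return [s.strip() for s in text.split(".") if s.strip()]

class ToyMemoryManager:
    def __init__(self):
        self.store = []  # (fact, vector) pairs; the real store is a vector DB

    def add_memory(self, text: str):
        for fact in extract_facts(text):
            self.store.append((fact, embed(fact)))

    def search(self, query: str, limit: int = 3) -> list[str]:
        q = embed(query)
        scored = sorted(
            self.store,
            key=lambda fv: -sum(a * b for a, b in zip(q, fv[1])),
        )
        return [fact for fact, _ in scored[:limit]]

mgr = ToyMemoryManager()
mgr.add_memory("The user likes Rust. The user dislikes meetings.")
print(mgr.search("programming languages the user enjoys", limit=1))
```

The real pipeline differs in quality (LLM extraction, proper embeddings, deduplication, periodic optimization), but the shape is the same: text in, facts out, vectors stored, similarity search back.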

🖥 Getting Started

Prerequisites

  • Rust (version 1.70 or later)
  • Qdrant or another compatible vector database
  • An OpenAI-compatible LLM API endpoint

Installation

The simplest way to get started is to use the CLI and Service binaries, which can be installed via cargo.

```shell
# Install the CLI for command-line management
cargo install cortex-mem-cli

# Install the REST API Service for application integration
cargo install cortex-mem-service

# Install the MCP server for specific agent framework integrations
cargo install cortex-mem-mcp
```

Configuration

Cortex Memory applications (cortex-mem-cli, cortex-mem-service, cortex-mem-mcp) are configured via a config.toml file. The CLI will look for this file in the current directory by default, or you can pass a path using the -c or --config flag.

Here is a sample config.toml with explanations:

```toml
# -----------------------------------------------------------------------------
# HTTP Server Configuration (`cortex-mem-service` only)
# -----------------------------------------------------------------------------
[server]
host = "0.0.0.0"       # IP address to bind the server to
port = 8000            # Port for the HTTP server
cors_origins = ["*"]   # Allowed origins for CORS (use ["*"] for permissive)

# -----------------------------------------------------------------------------
# Qdrant Vector Database Configuration
# -----------------------------------------------------------------------------
[qdrant]
url = "http://localhost:6333"     # URL of your Qdrant instance
collection_name = "cortex-memory" # Name of the collection to use for memories
timeout_secs = 5                  # Timeout for Qdrant operations
# embedding_dim is now auto-detected and no longer required here.

# -----------------------------------------------------------------------------
# LLM (Large Language Model) Configuration (for reasoning, summarization)
# -----------------------------------------------------------------------------
[llm]
api_base_url = "https://api.openai.com/v1" # Base URL of your LLM provider
api_key = "sk-your-openai-api-key"         # API key for the LLM provider (sensitive)
model_efficient = "gpt-5-mini"             # Model for simple tasks like classification
temperature = 0.7                          # Sampling temperature for LLM responses
max_tokens = 8192                          # Max tokens for LLM generation

# -----------------------------------------------------------------------------
# Embedding Service Configuration
# -----------------------------------------------------------------------------
[embedding]
api_base_url = "https://api.openai.com/v1" # Base URL of your embedding provider
api_key = "sk-your-openai-api-key"         # API key for the embedding provider (sensitive)
model_name = "text-embedding-3-small"      # Name of the embedding model to use
batch_size = 16                            # Number of texts to embed in a single batch
timeout_secs = 10                          # Timeout for embedding requests

# -----------------------------------------------------------------------------
# Memory Management Configuration
# -----------------------------------------------------------------------------
[memory]
max_memories = 10000               # Max number of memories to keep in the store
similarity_threshold = 0.65        # Threshold for considering memories similar
max_search_results = 50            # Default max results for a search query
auto_summary_threshold = 32768     # Token count threshold to trigger auto-summary
auto_enhance = true                # Automatically enhance memories with metadata
deduplicate = true                 # Enable or disable memory deduplication
merge_threshold = 0.75             # Similarity threshold for merging memories during optimization
search_similarity_threshold = 0.50 # Minimum similarity for a memory to be included in search results

# -----------------------------------------------------------------------------
# Logging Configuration
# -----------------------------------------------------------------------------
[logging]
enabled = true                     # Enable or disable logging to a file
log_directory = "logs"             # Directory to store log files
level = "info"                     # Logging level (e.g., "info", "debug", "warn", "error")
```
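To build intuition for the three similarity thresholds above (similarity_threshold, merge_threshold, search_similarity_threshold), here is an illustrative sketch of how cosine similarity typically gates such decisions in a vector memory store. The constants mirror the sample config, but the gating logic is a generic example, not the actual cortex-mem-core implementation:

```python
import math

def cosine(a, b):
    """Cosine similarity between two vectors, in [-1, 1]."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

SIMILARITY_THRESHOLD = 0.65         # dedup: treat memories as "similar"
MERGE_THRESHOLD = 0.75              # optimization: candidates for merging
SEARCH_SIMILARITY_THRESHOLD = 0.50  # search: minimum score to return a hit

def classify_pair(vec_a, vec_b):
    """Decide what a dedup/merge pass might do with two memory vectors."""
    score = cosine(vec_a, vec_b)
    if score >= MERGE_THRESHOLD:
        return "merge"     # near-duplicates: consolidate into one memory
    if score >= SIMILARITY_THRESHOLD:
        return "related"   # similar enough to flag, not to merge
    return "distinct"      # keep both as independent memories

print(classify_pair([1.0, 0.0], [1.0, 0.1]))  # "merge": nearly parallel vectors
print(classify_pair([1.0, 0.0], [0.0, 1.0]))  # "distinct": orthogonal vectors
```

Raising merge_threshold makes optimization more conservative (fewer consolidations); lowering search_similarity_threshold returns more, but noisier, search hits.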

🚀 Usage

CLI (cortex-mem-cli)

The CLI provides a powerful interface for direct interaction with the memory system. All commands require a config.toml file, which can be specified with --config <path>.

Add a Memory

Adds a new piece of information to the memory store.

```shell
cortex-mem-cli add --content "The user is interested in Rust programming." --user-id "user123"
```
  • --content <text>: (Required) The text content of the memory.
  • --user-id <id>: An optional user ID to associate with the memory.
  • --agent-id <id>: An optional agent ID to associate with the memory.

Search for Memories

Performs a semantic search on the memory store.

```shell
cortex-mem-cli search --query "what are the user's hobbies?" --user-id "user123" --limit 5
```
  • --query <text>: The natural language query for the search.
  • --user-id <id>: Filter memories by user ID.
  • --agent-id <id>: Filter memories by agent ID.
  • --topics <t1,t2>: Filter by a comma-separated list of topics.
  • --keywords <k1,k2>: Filter by a comma-separated list of keywords.
  • --limit <n>: The maximum number of results to return.

List Memories

Retrieves a list of memories based on metadata filters, without performing a semantic search.

```shell
cortex-mem-cli list --user-id "user123" --limit 20
```
  • Supports the same filters as search (--user-id, --agent-id, etc.), but does not use a --query.

Delete a Memory

Removes a memory from the store by its unique ID.

```shell
cortex-mem-cli delete <memory-id>
```

Manage Optimization

The CLI provides a full suite of tools to manage the memory optimization process.

```shell
# Manually trigger a new optimization run
cortex-mem-cli optimize start

# Check the status of a running or completed optimization job
cortex-mem-cli optimize-status --job-id <job-id>

# View or update the optimization schedule and parameters
cortex-mem-cli optimize-config --get
cortex-mem-cli optimize-config --set --schedule "0 0 * * * *" --enabled
```

REST API (cortex-mem-service)

The REST API allows you to integrate Cortex Memory into any application, regardless of the programming language.

Starting the Service

```shell
# Start the API server (will use configuration from config.toml)
cortex-mem-service
```

API Endpoints

Here are some of the primary endpoints available:

  • GET /health: Health check for the service.
  • POST /memories: Create a new memory.
  • GET /memories: List memories with metadata filtering.
  • POST /memories/search: Perform a semantic search for memories.
  • GET /memories/{id}: Retrieve a single memory by its ID.
  • PUT /memories/{id}: Update a memory.
  • DELETE /memories/{id}: Delete a memory.
  • POST /memories/batch/delete: Delete a batch of memories.
  • POST /memories/batch/update: Update a batch of memories.
  • POST /optimization: Manually start an optimization job.
  • GET /optimization/{job_id}: Get the status of an optimization job.

Example: Create a Memory

```shell
curl -X POST http://localhost:8000/memories \
  -H "Content-Type: application/json" \
  -d '{
    "content": "The user just signed up for the premium plan.",
    "metadata": {
      "user_id": "user-xyz-789",
      "agent_id": "billing-bot-01"
    }
  }'
```

Example: Search for Memories

```shell
curl -X POST http://localhost:8000/memories/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the user'\''s current plan?",
    "filters": {
      "user_id": "user-xyz-789"
    },
    "limit": 3
  }'
```
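The same calls work from any language. Below is a minimal Python sketch: the payload shapes mirror the curl examples above and the paths come from the endpoint table, while the stdlib `urllib` usage, the helper names, and the `BASE_URL` of `http://localhost:8000` are illustrative assumptions, not an official client:

```python
import json
from urllib import request  # stdlib only; an HTTP library works just as well

BASE_URL = "http://localhost:8000"  # assumed service address from config.toml

def create_memory_payload(content, user_id=None, agent_id=None):
    """Body for POST /memories, mirroring the curl example above."""
    metadata = {}
    if user_id:
        metadata["user_id"] = user_id
    if agent_id:
        metadata["agent_id"] = agent_id
    return {"content": content, "metadata": metadata}

def search_payload(query, user_id=None, limit=3):
    """Body for POST /memories/search."""
    payload = {"query": query, "limit": limit}
    if user_id:
        payload["filters"] = {"user_id": user_id}
    return payload

def post(path, payload):
    """Send a JSON POST to the service."""
    req = request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

def demo():
    """End-to-end example; requires a running cortex-mem-service."""
    post("/memories", create_memory_payload(
        "The user just signed up for the premium plan.",
        user_id="user-xyz-789", agent_id="billing-bot-01"))
    return post("/memories/search", search_payload(
        "What is the user's current plan?", user_id="user-xyz-789"))
```

Call `demo()` with the service running to round-trip a memory through create and search.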

🤝 Contribute

We welcome all forms of contributions! Report bugs or submit feature requests through GitHub Issues.

Development Process

  1. Fork this project
  2. Create a feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add some amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Create a Pull Request

🪪 License

This project is licensed under the MIT License. See the LICENSE file for details.
