🧠 The AI-native memory framework for building intelligent, context-aware applications 🧠
Built with Rust, Cortex Memory gives your AI agents a high-performance, persistent, and intelligent long-term memory.
Cortex Memory is a complete, production-ready framework for giving your AI applications a long-term memory. It moves beyond simple chat history, providing an intelligent memory system that automatically extracts, organizes, and optimizes information to make your AI agents smarter and more personalized.
Powered by Rust and LLMs, Cortex Memory analyzes conversations, deduces facts, and stores them in a structured, searchable knowledge base. This allows your agent to remember user preferences, past interactions, and key details, leading to more natural and context-aware conversations.
Transform your stateless AI into an intelligent, context-aware partner.
| Before Cortex Memory | After Cortex Memory |
|---|---|
| Stateless AI | Intelligent AI with Cortex Memory |
- Build Smarter Agents: Give your AI the ability to learn and remember, leading to more intelligent and useful interactions.
- Enhance User Experience: Create personalized, context-aware experiences that delight users and build long-term engagement.
- Automated Memory Management: Let the system handle the complexity of extracting, storing, and optimizing memories. No more manual data management.
- High Performance & Scalability: Built with Rust, Cortex Memory is fast, memory-safe, and ready to scale with your application.
- Flexible & Extensible: Integrate with your existing systems via a REST API, CLI, or direct library usage.
- Insightful Analytics: Use the provided web dashboard to visualize and understand your agent's memory.
🎯 For:
- Developers building LLM-powered chatbots and agents.
- Teams creating personalized AI assistants.
- Open source projects that need a memory backbone.
- Anyone who wants to build truly intelligent AI applications!
❤️ Like Cortex Memory? Star it ⭐ or Sponsor Me! ❤️
- Intelligent Fact Extraction: Automatically extracts key facts and insights from unstructured text using LLMs.
- Memory Classification & Deduplication: Organizes memories and removes redundant information to keep the knowledge base clean and efficient.
- Automated Memory Optimization: Periodically reviews, consolidates, and refines memories to improve relevance and reduce cost.
- Vector-Based Semantic Search: Finds the most relevant memories using high-performance vector similarity search.
- Multi-Modal Access: Interact with the memory system through a REST API, a command-line interface (CLI), or as a library in your Rust application.
- Agent Framework Integration: Provides tools and adapters to easily plug into popular AI agent frameworks.
- Web Dashboard: A dedicated web UI (`cortex-mem-insights`) for monitoring, analyzing, and visualizing the agent's memory.
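The deduplication feature above can be approximated with a similarity threshold over embeddings. Below is a minimal Python sketch of the idea, using a toy bag-of-words "embedding" and an illustrative 0.9 threshold; Cortex Memory's actual implementation relies on LLMs and real embedding models:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding"; real systems use an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def deduplicate(memories: list[str], threshold: float = 0.9) -> list[str]:
    # Keep a memory only if it is not near-identical to one already kept.
    kept: list[str] = []
    for m in memories:
        if all(cosine(embed(m), embed(k)) < threshold for k in kept):
            kept.append(m)
    return kept

facts = [
    "the user likes rust programming",
    "the user likes rust programming",   # exact duplicate, will be dropped
    "the user owns a cat named felix",
]
print(deduplicate(facts))
```

The same threshold idea drives the `similarity_threshold` setting in the configuration discussed later.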
Cortex Memory is a modular system composed of several crates, each with a specific purpose. This design provides flexibility and separation of concerns.
```mermaid
graph TD
    subgraph "User Interfaces"
        CLI["cortex-mem-cli"]
        Insights["cortex-mem-insights"]
    end
    subgraph "APIs & Integrations"
        Service["cortex-mem-service"]
        MCP["cortex-mem-mcp"]
        Rig["cortex-mem-rig"]
    end
    subgraph "Core Engine"
        Core["cortex-mem-core"]
    end
    subgraph "External Services"
        VectorDB[("Vector Database")]
        LLM[("LLM Provider")]
    end

    %% Define Dependencies
    Insights --> Service
    CLI --> Core
    Service --> Core
    MCP --> Core
    Rig --> Core
    Core --> VectorDB
    Core --> LLM
```
- `cortex-mem-core`: The heart of the system. It contains all the business logic for memory management, including extraction, optimization, and search.
- `cortex-mem-service`: Exposes the core logic via a high-performance REST API, making it accessible to any programming language or system.
- `cortex-mem-cli`: A command-line tool for developers and administrators to interact directly with the memory store for testing and management.
- `cortex-mem-insights`: A web-based management tool that provides analytics and visualization of the agent's memory by consuming the `cortex-mem-service` API.
- `cortex-mem-mcp` / `cortex-mem-rig`: Specialized adapter crates that integrate Cortex Memory as a "tool" within various AI agent frameworks.
- `cortex-mem-config`: Shared configuration and type definitions used across the ecosystem.
Cortex Memory includes a powerful web-based dashboard (cortex-mem-insights) that provides real-time monitoring, analytics and management capabilities. Here's what you can expect to see:
Interactive Dashboard: Get an overview of memory usage, system health, and activity statistics
Additional dashboard views:

- View and manage individual memory records
- Analyze and optimize memory quality
- Monitor memory performance and activity
- Detailed insights and trends over time
These visual tools help you understand how Cortex Memory is performing and how your AI agent's memory is evolving over time.
Meet Cortex TARS, a production-ready AI-native TUI (Terminal User Interface) application that demonstrates the true power of Cortex Memory. Built as a "second brain" companion, Cortex TARS brings auditory presence to your AI experience: it can hear and remember your voice in the real world, showcasing how persistent memory transforms AI interactions from fleeting chats into lasting, intelligent partnerships.
Cortex TARS is more than just a chatbot; it's a comprehensive AI assistant platform that leverages Cortex Memory's advanced capabilities:
Create and manage multiple AI personas, each with distinct personalities, system prompts, and specialized knowledge areas. Whether you need a coding assistant, a creative writing partner, or a productivity coach, Cortex TARS lets you run them all simultaneously with complete separation.
Every agent maintains its own long-term memory, learning from interactions over time. Your coding assistant remembers your coding style and preferences; your writing coach adapts to your voice and goals. No more repeating yourself β each agent grows smarter with every conversation.
Advanced memory architecture ensures complete isolation between agents and users. Each agent's knowledge base is separate, preventing cross-contamination while enabling personalized experiences across different contexts and use cases.
This is where Cortex TARS truly shines. With real-time device audio capture, Cortex TARS can listen to your conversations, meetings, or lectures and automatically convert them into structured, searchable memories. Imagine attending a meeting while Cortex TARS silently captures key insights, decisions, and action items, all stored and ready for instant retrieval later. No more frantic note-taking or forgotten details!
Cortex TARS isn't just an example; it's a fully functional application that demonstrates:
- Real-world production readiness: Built with Rust, it's fast, reliable, and memory-safe
- Seamless Cortex Memory integration: Shows best practices for leveraging the memory framework
- Practical AI workflows: From multi-agent conversations to audio capture and memory extraction
- User-centric design: Beautiful TUI interface with intuitive controls and rich features
Ready to see Cortex Memory in action? Dive into the Cortex TARS project:
```bash
cd examples/cortex-mem-tars
cargo build --release
cargo run --release
```

Check out the Cortex TARS README for detailed setup instructions, configuration guides, and usage examples.
Cortex TARS proves that Cortex Memory isn't just a framework; it's the foundation for building intelligent, memory-aware applications that truly understand and remember.
Cortex Memory has been rigorously evaluated against LangMem using the LOCOMO dataset (50 conversations, 150 questions) through a standardized memory system evaluation framework. The results demonstrate Cortex Memory's superior performance across multiple dimensions.
Overall Performance: Cortex Memory significantly outperforms LangMem across all key metrics
| Metric | Cortex Memory | LangMem | Improvement |
|---|---|---|---|
| Recall@1 | 93.33% | 26.32% | +67.02pp |
| Recall@3 | 94.00% | 50.00% | +44.00pp |
| Recall@5 | 94.67% | 55.26% | +39.40pp |
| Recall@10 | 94.67% | 63.16% | +31.51pp |
| Precision@1 | 93.33% | 26.32% | +67.02pp |
| MRR | 93.72% | 38.83% | +54.90pp |
| NDCG@5 | 80.73% | 18.72% | +62.01pp |
| NDCG@10 | 79.41% | 16.83% | +62.58pp |
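For context, Recall@K and MRR in the table above are standard retrieval metrics. A minimal sketch of how they are computed over ranked result lists (the data here is toy data, not the LOCOMO results):

```python
def recall_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    # Fraction of the relevant items that appear in the top-k results.
    return len(set(ranked[:k]) & relevant) / len(relevant)

def mrr(queries: list[tuple[list[str], set[str]]]) -> float:
    # Mean reciprocal rank of the first relevant item, averaged over queries.
    total = 0.0
    for ranked, relevant in queries:
        for i, item in enumerate(ranked, start=1):
            if item in relevant:
                total += 1.0 / i
                break
    return total / len(queries)

# Toy data: two queries, each with a ranked result list and a ground-truth set.
queries = [
    (["m1", "m2", "m3"], {"m1"}),  # relevant memory ranked first
    (["m4", "m5", "m6"], {"m6"}),  # relevant memory ranked third
]
print(mrr(queries))  # (1/1 + 1/3) / 2 = 0.666...
```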
- Cortex Memory Evaluation: Excellent retrieval performance with 93.33% Recall@1 and 93.72% MRR
- LangMem Evaluation: Modest performance with 26.32% Recall@1 and 38.83% MRR
- Significantly Improved Retrieval Accuracy: Cortex Memory achieves 93.33% Recall@1, a 67.02 percentage point improvement over LangMem's 26.32%. This indicates Cortex is far superior at retrieving relevant memories on the first attempt.
- Clear Ranking Quality Advantage: Cortex Memory's MRR of 93.72% vs LangMem's 38.83% shows it not only retrieves accurately but also ranks relevant memories higher in the result list.
- Comprehensive Performance Leadership: Across all metrics, especially NDCG@5 (80.73% vs 18.72%), Cortex demonstrates consistent, significant advantages in retrieval quality, ranking accuracy, and overall performance.
- Technical Advantages: Cortex Memory's performance is attributed to:
  - Efficient Rust-based implementation
  - Powerful retrieval capabilities of the Qdrant vector database
  - Optimized memory management strategies
The benchmark uses a professional memory system evaluation framework located in examples/lomoco-evaluation, which includes:
- Professional Metrics: Recall@K, Precision@K, MRR, NDCG, and answer quality metrics
- Enhanced Dataset: 50 conversations with 150 questions covering various scenarios
- Statistical Analysis: 95% confidence intervals, standard deviation, and category-based statistics
- Multi-System Support: Supports comparison between Cortex Memory, LangMem, and Simple RAG baselines
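NDCG@K, also reported above, additionally rewards ranking relevant memories near the top rather than merely retrieving them. A minimal sketch with binary relevance (toy data; the benchmark's exact scoring may differ):

```python
from math import log2

def dcg_at_k(gains: list[int], k: int) -> float:
    # Discounted cumulative gain: lower ranks are discounted logarithmically.
    return sum(g / log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(ranked: list[str], relevant: set[str], k: int) -> float:
    # Normalize DCG by the DCG of the ideal (best possible) ordering.
    gains = [1 if item in relevant else 0 for item in ranked]
    ideal = sorted(gains, reverse=True)
    idcg = dcg_at_k(ideal, k)
    return dcg_at_k(gains, k) / idcg if idcg else 0.0

# A relevant memory at rank 2 scores lower than one at rank 1.
print(ndcg_at_k(["m2", "m1", "m3"], {"m1"}, k=5))  # 1/log2(3) ≈ 0.631
```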
For more details on running the evaluation, see the lomoco-evaluation README.
Cortex Memory uses a sophisticated pipeline to process and manage memories, orchestrated by the MemoryManager in cortex-mem-core.
```mermaid
sequenceDiagram
    participant App as Application
    participant Service as cortex-mem-service
    participant Manager as MemoryManager (Core)
    participant Extractor as Fact Extractor (LLM)
    participant VectorStore as Vector Database
    participant Optimizer as Optimizer (LLM)

    App->>Service: Add new text (e.g., chat log)
    Service->>Manager: add_memory(text)
    Manager->>Extractor: Extract facts from text
    Extractor-->>Manager: Return structured facts
    Manager->>VectorStore: Store new facts as vectors

    loop Periodically
        Manager->>Optimizer: Start optimization plan
        Optimizer->>VectorStore: Fetch related memories
        Optimizer->>Optimizer: Consolidate & refine memories
        Optimizer->>VectorStore: Update/archive old memories
    end

    App->>Service: Search for relevant info
    Service->>Manager: search(query)
    Manager->>VectorStore: Find similar vectors
    VectorStore-->>Manager: Return relevant facts
    Manager-->>Service: Return results
    Service-->>App: Return relevant memories
```
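The add/search flow above can be mocked end-to-end in a few lines. This is an illustrative in-memory sketch in which fact extraction and embedding are crude stubs standing in for the LLM- and vector-database-backed logic of `cortex-mem-core`:

```python
from collections import Counter
from math import sqrt

def embed(text: str) -> Counter:
    # Stub embedding; the real pipeline calls an embedding model.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

class MemoryManager:
    def __init__(self):
        # A plain list stands in for the vector database.
        self.store: list[tuple[str, Counter]] = []

    def add_memory(self, text: str) -> None:
        # Stub "fact extraction": one fact per sentence.
        # The real system uses an LLM to extract structured facts.
        for fact in filter(None, (s.strip() for s in text.split("."))):
            self.store.append((fact, embed(fact)))

    def search(self, query: str, limit: int = 3) -> list[str]:
        # Rank stored facts by similarity to the query vector.
        q = embed(query)
        ranked = sorted(self.store, key=lambda m: cosine(q, m[1]), reverse=True)
        return [fact for fact, _ in ranked[:limit]]

mgr = MemoryManager()
mgr.add_memory("The user prefers Rust. The user dislikes meetings before 10am.")
print(mgr.search("what language does the user like?", limit=1))
```

The periodic optimization loop is omitted here; in the real pipeline it consolidates and refines stored facts over time.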
- Rust (version 1.70 or later)
- Qdrant or another compatible vector database
- An OpenAI-compatible LLM API endpoint
The simplest way to get started is to use the CLI and Service binaries, which can be installed via cargo.
```bash
# Install the CLI for command-line management
cargo install cortex-mem-cli

# Install the REST API service for application integration
cargo install cortex-mem-service

# Install the MCP server for specific agent framework integrations
cargo install cortex-mem-mcp
```

Cortex Memory applications (cortex-mem-cli, cortex-mem-service, cortex-mem-mcp) are configured via a config.toml file. The CLI looks for this file in the current directory by default, or you can pass a path using the -c or --config flag.
Here is a sample config.toml with explanations:
```toml
# -----------------------------------------------------------------------------
# HTTP Server Configuration (`cortex-mem-service` only)
# -----------------------------------------------------------------------------
[server]
host = "0.0.0.0"     # IP address to bind the server to
port = 8000          # Port for the HTTP server
cors_origins = ["*"] # Allowed origins for CORS (use ["*"] for permissive)

# -----------------------------------------------------------------------------
# Qdrant Vector Database Configuration
# -----------------------------------------------------------------------------
[qdrant]
url = "http://localhost:6333"     # URL of your Qdrant instance
collection_name = "cortex-memory" # Name of the collection to use for memories
timeout_secs = 5                  # Timeout for Qdrant operations
# embedding_dim is now auto-detected and no longer required here.

# -----------------------------------------------------------------------------
# LLM (Large Language Model) Configuration (for reasoning, summarization)
# -----------------------------------------------------------------------------
[llm]
api_base_url = "https://api.openai.com/v1" # Base URL of your LLM provider
api_key = "sk-your-openai-api-key"         # API key for the LLM provider (sensitive)
model_efficient = "gpt-5-mini"             # Model for simple tasks like classification
temperature = 0.7                          # Sampling temperature for LLM responses
max_tokens = 8192                          # Max tokens for LLM generation

# -----------------------------------------------------------------------------
# Embedding Service Configuration
# -----------------------------------------------------------------------------
[embedding]
api_base_url = "https://api.openai.com/v1" # Base URL of your embedding provider
api_key = "sk-your-openai-api-key"         # API key for the embedding provider (sensitive)
model_name = "text-embedding-3-small"      # Name of the embedding model to use
batch_size = 16                            # Number of texts to embed in a single batch
timeout_secs = 10                          # Timeout for embedding requests

# -----------------------------------------------------------------------------
# Memory Management Configuration
# -----------------------------------------------------------------------------
[memory]
max_memories = 10000               # Max number of memories to keep in the store
similarity_threshold = 0.65        # Threshold for considering memories similar
max_search_results = 50            # Default max results for a search query
auto_summary_threshold = 32768     # Token count threshold to trigger auto-summary
auto_enhance = true                # Automatically enhance memories with metadata
deduplicate = true                 # Enable or disable memory deduplication
merge_threshold = 0.75             # Similarity threshold for merging memories during optimization
search_similarity_threshold = 0.50 # Minimum similarity for a memory to be included in search results

# -----------------------------------------------------------------------------
# Logging Configuration
# -----------------------------------------------------------------------------
[logging]
enabled = true         # Enable or disable logging to a file
log_directory = "logs" # Directory to store log files
level = "info"         # Logging level (e.g., "info", "debug", "warn", "error")
```

The CLI provides a powerful interface for direct interaction with the memory system. All commands require a config.toml file, which can be specified with --config <path>.
Adds a new piece of information to the memory store.
```bash
cortex-mem-cli add --content "The user is interested in Rust programming." --user-id "user123"
```

- `--content <text>`: (Required) The text content of the memory.
- `--user-id <id>`: An optional user ID to associate with the memory.
- `--agent-id <id>`: An optional agent ID to associate with the memory.
Performs a semantic search on the memory store.
```bash
cortex-mem-cli search --query "what are the user's hobbies?" --user-id "user123" --limit 5
```

- `--query <text>`: The natural language query for the search.
- `--user-id <id>`: Filter memories by user ID.
- `--agent-id <id>`: Filter memories by agent ID.
- `--topics <t1,t2>`: Filter by a comma-separated list of topics.
- `--keywords <k1,k2>`: Filter by a comma-separated list of keywords.
- `--limit <n>`: The maximum number of results to return.
Retrieves a list of memories based on metadata filters, without performing a semantic search.
```bash
cortex-mem-cli list --user-id "user123" --limit 20
```

- Supports the same filters as `search` (`--user-id`, `--agent-id`, etc.), but does not use a `--query`.
Removes a memory from the store by its unique ID.
```bash
cortex-mem-cli delete <memory-id>
```

The CLI provides a full suite of tools to manage the memory optimization process.
```bash
# Manually trigger a new optimization run
cortex-mem-cli optimize start

# Check the status of a running or completed optimization job
cortex-mem-cli optimize-status --job-id <job-id>

# View or update the optimization schedule and parameters
cortex-mem-cli optimize-config --get
cortex-mem-cli optimize-config --set --schedule "0 0 * * * *" --enabled
```

The REST API allows you to integrate Cortex Memory into any application, regardless of the programming language.
```bash
# Start the API server (uses configuration from config.toml)
cortex-mem-service
```

Here are some of the primary endpoints available:
- `GET /health`: Health check for the service.
- `POST /memories`: Create a new memory.
- `GET /memories`: List memories with metadata filtering.
- `POST /memories/search`: Perform a semantic search for memories.
- `GET /memories/{id}`: Retrieve a single memory by its ID.
- `PUT /memories/{id}`: Update a memory.
- `DELETE /memories/{id}`: Delete a memory.
- `POST /memories/batch/delete`: Delete a batch of memories.
- `POST /memories/batch/update`: Update a batch of memories.
- `POST /optimization`: Manually start an optimization job.
- `GET /optimization/{job_id}`: Get the status of an optimization job.
```bash
curl -X POST http://localhost:8000/memories \
  -H "Content-Type: application/json" \
  -d '{
    "content": "The user just signed up for the premium plan.",
    "metadata": {
      "user_id": "user-xyz-789",
      "agent_id": "billing-bot-01"
    }
  }'
```

```bash
curl -X POST http://localhost:8000/memories/search \
  -H "Content-Type: application/json" \
  -d '{
    "query": "What is the user'\''s current plan?",
    "filters": {
      "user_id": "user-xyz-789"
    },
    "limit": 3
  }'
```

We welcome all forms of contributions! Report bugs or submit feature requests through GitHub Issues.
- Fork this project
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add some amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Create a Pull Request
This project is licensed under the MIT License. See the LICENSE file for details.