Cortex Predictive Retrieval

geoffrey fernald edited this page Feb 1, 2026 · 1 revision

Predictive retrieval anticipates which memories you'll need before you ask, reducing retrieval latency and improving relevance.

Overview

Instead of waiting for queries, Cortex V2 predicts what you'll need based on:

  • Current file context
  • Recent activity patterns
  • Temporal signals (time of day, day of week)
  • Git activity

Prediction Signals

File Signals

interface FileSignals {
  activeFile: string;           // Currently open file
  recentFiles: string[];        // Recently edited files
  fileType: string;             // .ts, .tsx, .py, etc.
  directory: string;            // Current directory
  imports: string[];            // Imported modules
}

Temporal Signals

interface TemporalSignals {
  hourOfDay: number;            // 0-23
  dayOfWeek: number;            // 0-6
  sessionDuration: number;      // Minutes in session
  timeSinceLastQuery: number;   // Seconds
}

Behavioral Signals

interface BehavioralSignals {
  recentIntents: Intent[];      // Recent query intents
  recentTopics: string[];       // Recent focus areas
  queryFrequency: number;       // Queries per hour
  correctionRate: number;       // Corrections per query
}

Git Signals

interface GitSignals {
  currentBranch: string;        // Feature branch name
  recentCommits: string[];      // Recent commit messages
  stagedFiles: string[];        // Files staged for commit
  modifiedFiles: string[];      // Uncommitted changes
}
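The four signal groups above are gathered into one context snapshot before prediction runs. A minimal sketch, assuming a `ContextSignals` wrapper and a `gatherSignals` helper (both illustrative names, not part of the documented API, with a subset of fields for brevity):

```typescript
// Illustrative bundle of the four signal groups
interface ContextSignals {
  file: { activeFile: string; recentFiles: string[] };
  temporal: { hourOfDay: number; dayOfWeek: number };
  behavioral: { recentTopics: string[] };
  git: { currentBranch: string; modifiedFiles: string[] };
}

// Snapshot the current context; temporal fields come from the wall clock
function gatherSignals(activeFile: string, recentFiles: string[]): ContextSignals {
  const now = new Date();
  return {
    file: { activeFile, recentFiles },
    temporal: { hourOfDay: now.getHours(), dayOfWeek: now.getDay() },
    behavioral: { recentTopics: [] },                  // filled from query history
    git: { currentBranch: "main", modifiedFiles: [] }, // filled from git status
  };
}
```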

Prediction Engine

The engine combines signals to predict relevant memories:

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Signal Gatherer  β”‚
β”‚  - File signals   β”‚
β”‚  - Temporal       β”‚
β”‚  - Behavioral     β”‚
β”‚  - Git            β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚
          β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Predictors       β”‚
β”‚  - File-based     β”‚
β”‚  - Pattern-based  β”‚
β”‚  - Temporal       β”‚
β”‚  - Behavioral     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”¬β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
          β”‚
          β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚  Prediction Cache β”‚
β”‚  (preloaded)      β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
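The step between the predictors and the cache can be sketched as a score merge over candidate memory IDs. A `mergePredictions` helper like the one below is hypothetical; the real engine may weight, deduplicate, and rank differently:

```typescript
// Each predictor returns memory IDs with a confidence score in 0..1
type PredictorOutput = Map<string, number>;

// Sum scores per memory ID across predictors, then rank descending
function mergePredictions(outputs: PredictorOutput[]): [string, number][] {
  const combined = new Map<string, number>();
  for (const out of outputs) {
    for (const [id, score] of out) {
      combined.set(id, (combined.get(id) ?? 0) + score);
    }
  }
  return [...combined.entries()].sort((a, b) => b[1] - a[1]);
}
```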

Prediction Types

File-Based Prediction

Predicts memories based on current file:

// If editing src/auth/login.ts
// Predict: auth patterns, security constraints, login-related tribal knowledge

Pattern-Based Prediction

Predicts based on detected patterns in code:

// If file contains Express route handlers
// Predict: API patterns, error handling, validation patterns

Temporal Prediction

Predicts based on time patterns:

// If it's Monday morning
// Predict: memories frequently accessed on Monday mornings

Behavioral Prediction

Predicts based on recent activity:

// If recent queries were about "authentication"
// Predict: more auth-related memories
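All four prediction types reduce to scoring candidate memories against the current signals. A toy scorer combining a file-path match with topic overlap, assuming each memory carries `paths` and `topics` lists (illustrative fields, not the documented schema):

```typescript
interface CandidateMemory {
  id: string;
  paths: string[];  // path fragments this memory is tied to
  topics: string[]; // topics this memory covers
}

// File match contributes up to 0.6; topic overlap contributes up to 0.4
function scoreMemory(m: CandidateMemory, activeFile: string, recentTopics: string[]): number {
  const fileHit = m.paths.some((p) => activeFile.includes(p)) ? 0.6 : 0;
  const topicHits = m.topics.filter((t) => recentTopics.includes(t)).length;
  return fileHit + Math.min(0.4, topicHits * 0.2);
}
```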

Prediction Cache

Predicted memories are preloaded into a fast cache:

interface PredictionCache {
  memories: Map<string, Memory>;  // Preloaded memories
  predictions: PredictedMemory[]; // Ranked predictions
  lastUpdated: Date;
  hitRate: number;                // Cache effectiveness
}
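A minimal cache with the `hitRate` bookkeeping from the interface above might look like this. It is a sketch, not the actual implementation; eviction and staleness handling are omitted:

```typescript
// Minimal prediction cache that tracks its own effectiveness
class SimplePredictionCache<M> {
  private memories = new Map<string, M>();
  private hits = 0;
  private lookups = 0;

  // Preload a predicted memory before it is requested
  preload(id: string, memory: M): void {
    this.memories.set(id, memory);
  }

  // Look up a memory, counting hits and misses as we go
  get(id: string): M | undefined {
    this.lookups++;
    const m = this.memories.get(id);
    if (m !== undefined) this.hits++;
    return m;
  }

  // Fraction of lookups served from the preloaded set
  get hitRate(): number {
    return this.lookups === 0 ? 0 : this.hits / this.lookups;
  }
}
```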

Cache Warming

// On file open
await predictionCache.warmForFile('src/auth/login.ts');

// On session start
await predictionCache.warmForSession(sessionContext);

API Usage

Get Predictions

const predictions = await cortex.getPredictions({
  activeFile: 'src/auth/login.ts',
  limit: 10
});

// Returns:
// [
//   { memory: {...}, confidence: 0.92, reason: 'file_match' },
//   { memory: {...}, confidence: 0.85, reason: 'pattern_match' },
//   ...
// ]

Preload Predictions

// Preload into cache for instant retrieval
await cortex.preloadPredictions({
  activeFile: 'src/auth/login.ts',
  maxMemories: 20
});

MCP Tool: drift_memory_predict

{
  "activeFile": "src/auth/login.ts",
  "recentFiles": ["src/auth/logout.ts", "src/middleware/auth.ts"],
  "intent": "add_feature",
  "limit": 10
}

Response:

{
  "predictions": [
    {
      "memoryId": "mem_abc123",
      "summary": "JWT tokens must be validated on every request",
      "confidence": 0.92,
      "reason": "file_match",
      "signals": ["activeFile contains 'auth'", "recent intent was 'add_feature'"]
    }
  ],
  "cacheStatus": {
    "preloaded": 15,
    "hitRate": 0.78
  }
}

Performance

Metric                 Without Prediction   With Prediction
First query latency    150ms                20ms (cache hit)
Relevance score        0.75                 0.88
Token efficiency       1x                   1.3x (better targeting)

Configuration

const predictionConfig = {
  enabled: true,
  maxCacheSize: 100,           // Max memories in cache
  cacheWarmingThreshold: 0.6,  // Min confidence to cache
  signalWeights: {
    file: 0.4,
    pattern: 0.3,
    temporal: 0.15,
    behavioral: 0.15
  }
};
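The `signalWeights` above combine per-signal scores (each in 0..1) into a single confidence, which is then gated on `cacheWarmingThreshold`. A sketch of that arithmetic, with illustrative helper names:

```typescript
const weights = { file: 0.4, pattern: 0.3, temporal: 0.15, behavioral: 0.15 };

interface SignalScores { file: number; pattern: number; temporal: number; behavioral: number }

// Weighted sum of per-signal scores; weights sum to 1, so the result stays in 0..1
function confidence(s: SignalScores): number {
  return (
    weights.file * s.file +
    weights.pattern * s.pattern +
    weights.temporal * s.temporal +
    weights.behavioral * s.behavioral
  );
}

// Only predictions at or above the warming threshold get preloaded
function shouldCache(conf: number, threshold = 0.6): boolean {
  return conf >= threshold;
}
```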

Best Practices

  1. Enable prediction β€” Significant latency improvement
  2. Monitor hit rate β€” Should be > 60%
  3. Tune weights β€” Adjust based on your workflow
  4. Warm on file open β€” Preload when opening files
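Practice 2 is easy to automate; a trivial check against the recommended 60% floor (the helper name is illustrative):

```typescript
// Flag a prediction cache whose hit rate falls below the recommended minimum
function cacheNeedsTuning(hitRate: number, minHitRate = 0.6): boolean {
  return hitRate < minHitRate;
}
```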
