feat: add per-user memory with LLM extraction, dedup, and decay #67
Conversation
Koopa0
commented
Feb 17, 2026
- PostgreSQL + pgvector native memory store (two-threshold dedup, time-decay)
- LLM-powered fact extraction with nonce-delimited prompt injection defense
- LLM conflict arbitration for near-duplicate memories
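As context for the dedup design, here is a minimal, illustrative sketch of a two-threshold decision. The threshold names, values, and package layout are assumptions for illustration, not the code in this PR.

package memory

// Illustrative thresholds on the cosine similarity between a newly extracted
// fact and its nearest existing memory. Names and values are hypothetical.
const (
	duplicateThreshold = 0.95 // at or above: exact duplicate, skip the insert
	conflictThreshold  = 0.80 // in between: near-duplicate, send to LLM arbitration
)

// DedupAction is what the store does with an incoming fact.
type DedupAction int

const (
	ActionInsert    DedupAction = iota // unrelated to existing memories: store it
	ActionArbitrate                    // near-duplicate: let the LLM decide keep/replace/merge
	ActionSkip                         // duplicate: drop the new fact
)

// classify applies the two-threshold rule to the best similarity score found
// among the user's active memories.
func classify(bestSimilarity float64) DedupAction {
	switch {
	case bestSimilarity >= duplicateThreshold:
		return ActionSkip
	case bestSimilarity >= conflictThreshold:
		return ActionArbitrate
	default:
		return ActionInsert
	}
}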
DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS messages;

-- ============================================================================
-- Drop Sessions Table (including indexes)
-- ============================================================================

DROP INDEX IF EXISTS idx_sessions_owner_id;
DROP INDEX IF EXISTS idx_sessions_updated_at;
DROP TABLE IF EXISTS sessions;

-- ============================================================================
-- Drop Documents Table (including indexes)
-- ============================================================================

DROP INDEX IF EXISTS idx_documents_owner;
DROP INDEX IF EXISTS idx_documents_metadata_gin;
DROP INDEX IF EXISTS idx_documents_source_type;
DROP INDEX IF EXISTS idx_documents_embedding;
DROP TABLE IF EXISTS documents;

-- ============================================================================
-- Drop Extensions
-- Note: Only drop if no other schemas depend on it
-- ============================================================================

DROP EXTENSION IF EXISTS vector;
Indexes are dropped after their associated tables (e.g., indexes on 'sessions' and 'documents' are dropped after the tables themselves). This can cause errors if the table no longer exists when attempting to drop the index. Recommendation: Drop all indexes before dropping their associated tables to avoid dependency errors.
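If the ordering is tightened as suggested, one possible layout for the down migration, reusing the object names from the hunk above (sketch only):

-- Drop all indexes first, then the tables they belong to.
DROP INDEX IF EXISTS idx_sessions_owner_id;
DROP INDEX IF EXISTS idx_sessions_updated_at;
DROP INDEX IF EXISTS idx_documents_owner;
DROP INDEX IF EXISTS idx_documents_metadata_gin;
DROP INDEX IF EXISTS idx_documents_source_type;
DROP INDEX IF EXISTS idx_documents_embedding;

DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS messages;
DROP TABLE IF EXISTS sessions;
DROP TABLE IF EXISTS documents;

DROP EXTENSION IF EXISTS vector;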
The migration script does not wrap the DROP statements in a transaction. If an error occurs partway through, the database could be left in an inconsistent state. Recommendation: Wrap all statements in a transaction block (e.g., BEGIN; ... COMMIT;) to ensure atomicity.
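A minimal sketch of the transactional form, assuming the migration runner does not already wrap each script in its own transaction (all of these DROP statements are transactional in PostgreSQL):

BEGIN;

DROP INDEX IF EXISTS idx_sessions_owner_id;
DROP INDEX IF EXISTS idx_sessions_updated_at;
-- ... remaining DROP INDEX statements ...

DROP TABLE IF EXISTS memories;
DROP TABLE IF EXISTS messages;
DROP TABLE IF EXISTS sessions;
DROP TABLE IF EXISTS documents;

DROP EXTENSION IF EXISTS vector;

COMMIT;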
CREATE TABLE IF NOT EXISTS sessions (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    title TEXT,
    owner_id TEXT NOT NULL DEFAULT '',
Potential Data Integrity and Security Issue:
The owner_id column in the sessions table is defined as TEXT NOT NULL DEFAULT ''. This allows sessions to be created with an empty string as the owner, which may lead to orphaned sessions and complicate access control. It is recommended to:
- Remove the default value and require explicit assignment of a valid owner.
- Alternatively, use NULL to indicate no owner and enforce ownership at the application level.
Example:
owner_id TEXT NOT NULL
Or, if nullable:
owner_id TEXT

CREATE UNIQUE INDEX idx_memories_owner_content_unique
    ON memories(owner_id, md5(content)) WHERE active = true;
Deduplication Vulnerability via MD5 Hash Collisions:
The unique index on md5(content) for active memories relies on MD5, for which collisions can be deliberately crafted; two distinct memories whose contents collide would be treated as the same entry and the newer insert rejected. Additionally, deduplication does not apply to inactive memories, which could result in inconsistent data. Consider using a stronger hash function (e.g., SHA-256) or a direct unique constraint on the content column if performance and storage allow.
Example (if using SHA-256; digest() requires the pgcrypto extension):
CREATE UNIQUE INDEX idx_memories_owner_content_unique
    ON memories(owner_id, encode(digest(content, 'sha256'), 'hex')) WHERE active = true;
 //
 // Shutdown order:
 // 1. Cancel context (signals background tasks to stop)
-// 2. Close DB pool
-// 3. Flush OTel spans
+// 2. Wait for background goroutines (scheduler) to exit
+// 3. Close DB pool
+// 4. Flush OTel spans
 func (a *App) Close() error {
 	a.closeOnce.Do(func() {
 		slog.Info("shutting down application")
Resource cleanup error handling is missing in Close().
The Close method does not capture or propagate errors from dbCleanup or otelCleanup. If these cleanup functions fail, the error will be silently ignored, potentially masking critical shutdown failures. Consider aggregating errors from each cleanup step and returning a combined error:
var err error
if a.dbCleanup != nil {
	if cerr := a.dbCleanup(); cerr != nil {
		err = errors.Join(err, cerr)
	}
}
// ... same for otelCleanup
return err

func FuzzContainsSecrets(f *testing.F) {
	f.Add("hello world")
	f.Add("sk-1234567890abcdefghijklmnop")
	f.Add("")
	f.Add("password=secret123456")
	f.Fuzz(func(_ *testing.T, input string) {
		ContainsSecrets(input) // must not panic
	})
}
Fuzz test does not validate correctness of output
The fuzz test for ContainsSecrets (lines 103-111) only checks that the function does not panic; it does not assert the correctness of its output. This could allow silent failures if the function starts returning incorrect results without panicking. To improve robustness, consider adding assertions for expected output on known inputs, or at least logging unexpected results for further analysis.
Example improvement:
f.Fuzz(func(t *testing.T, input string) {
	got := ContainsSecrets(input)
	// Assert or log output for known cases, e.g.:
	t.Logf("ContainsSecrets(%q) = %v", input, got)
})

if n, err := s.store.UpdateDecayScores(ctx); err != nil {
	s.logger.Warn("decay update failed", "error", err)
} else if n > 0 {
	s.logger.Debug("decay scores updated", "count", n)
}
Error Handling Limitation:
If UpdateDecayScores fails, the error is only logged and not propagated or retried. This could result in persistent data inconsistencies if failures are not transient. Consider implementing a retry mechanism or escalating the error to ensure critical failures are not silently ignored.
Example:
for retries := 0; retries < maxRetries; retries++ {
	n, err := s.store.UpdateDecayScores(ctx)
	if err == nil {
		if n > 0 {
			s.logger.Debug("decay scores updated", "count", n)
		}
		break
	}
	s.logger.Warn("decay update failed", "error", err)
	time.Sleep(retryDelay)
}

if n, err := s.store.DeleteStale(ctx); err != nil {
	s.logger.Warn("stale expiry failed", "error", err)
} else if n > 0 {
	s.logger.Info("expired stale memories", "count", n)
}
Error Handling Limitation:
Similar to the previous block, if DeleteStale fails, the error is only logged. This could lead to stale data persisting indefinitely. Consider adding a retry mechanism or alerting to ensure failures are addressed promptly.
Example:
for retries := 0; retries < maxRetries; retries++ {
	n, err := s.store.DeleteStale(ctx)
	if err == nil {
		if n > 0 {
			s.logger.Info("expired stale memories", "count", n)
		}
		break
	}
	s.logger.Warn("stale expiry failed", "error", err)
	time.Sleep(retryDelay)
}

defer func() {
	if rbErr := tx.Rollback(ctx); rbErr != nil && !errors.Is(rbErr, pgx.ErrTxClosed) {
		s.logger.Debug("transaction rollback", "error", rbErr)
	}
}()
Deferred Rollback Always Executes After Commit
The deferred rollback in the Add method will always execute, even after a successful commit. This can result in unnecessary debug logs and may mask real rollback errors. To improve reliability and clarity, track whether the transaction has been committed and only attempt rollback if it has not:
committed := false
defer func() {
	if !committed {
		if rbErr := tx.Rollback(ctx); rbErr != nil && !errors.Is(rbErr, pgx.ErrTxClosed) {
			s.logger.Debug("transaction rollback", "error", rbErr)
		}
	}
}()
// ...
if err := tx.Commit(ctx); err != nil {
	return fmt.Errorf("committing memory transaction: %w", err)
}
committed = true

func (s *Store) evictIfNeeded(ctx context.Context, ownerID string) error {
	var count int
	if err := s.pool.QueryRow(ctx,
		`SELECT COUNT(*) FROM memories WHERE owner_id = $1 AND active = true`,
		ownerID,
	).Scan(&count); err != nil {
		return fmt.Errorf("counting memories: %w", err)
	}

	if count <= MaxPerUser {
		return nil
	}

	excess := count - MaxPerUser

	// First try to evict inactive memories.
	tag, err := s.pool.Exec(ctx,
		`DELETE FROM memories
		 WHERE id IN (
			SELECT id FROM memories
			WHERE owner_id = $1 AND active = false
			ORDER BY updated_at ASC, id ASC
			LIMIT $2
		 )`,
		ownerID, excess,
	)
	if err != nil {
		return fmt.Errorf("evicting inactive: %w", err)
	}

	remaining := excess - int(tag.RowsAffected())
	if remaining <= 0 {
		return nil
	}

	// Evict oldest active by created_at.
	_, err = s.pool.Exec(ctx,
		`DELETE FROM memories
		 WHERE id IN (
			SELECT id FROM memories
			WHERE owner_id = $1 AND active = true
			ORDER BY created_at ASC, id ASC
			LIMIT $2
		 )`,
		ownerID, remaining,
	)
	if err != nil {
		return fmt.Errorf("evicting oldest active: %w", err)
	}

	return nil
}
Eviction Failure Handling May Allow Resource Exhaustion
In evictIfNeeded, eviction failures are only logged as warnings and not propagated. If eviction is critical for enforcing memory limits, this could allow users to exceed MaxPerUser, potentially leading to resource exhaustion. Consider whether eviction failures should be handled more strictly, such as retrying or propagating the error to the caller, depending on the application's requirements.