Skip to content

Comments

fix: address code audit findings (24 items) and review fixes#69

Merged
Koopa0 merged 1 commit intomainfrom
feat/code-cleaner
Feb 21, 2026
Merged

fix: address code audit findings (24 items) and review fixes#69
Koopa0 merged 1 commit intomainfrom
feat/code-cleaner

Conversation

@Koopa0
Copy link
Owner

@Koopa0 Koopa0 commented Feb 21, 2026

  • B1: replace panic with log.Fatalf in config mustBind
  • B2: remove CONCURRENTLY from trigram indexes migration
  • H2: restrict git subcommands to read-only (remove branch/tag)
  • H3: stop logging full search queries
  • W1-W19: SQL optimization, config validation, security hardening, import grouping, doc fixes
  • Review fixes: migration backfill, git branch/tag removal, memory import grouping

  - B1: replace panic with log.Fatalf in config mustBind
  - B2: remove CONCURRENTLY from trigram indexes migration
  - H2: restrict git subcommands to read-only (remove branch/tag)
  - H3: stop logging full search queries
  - W1-W19: SQL optimization, config validation, security hardening,
    import grouping, doc fixes
  - Review fixes: migration backfill, git branch/tag removal, memory
    import grouping
@Koopa0 Koopa0 merged commit d8a54d3 into main Feb 21, 2026
4 of 8 checks passed
@Koopa0 Koopa0 deleted the feat/code-cleaner branch February 21, 2026 17:04
Comment on lines +12 to 17
CREATE INDEX IF NOT EXISTS idx_messages_text_content_trgm
ON messages USING gin (text_content gin_trgm_ops);

-- GIN trigram index on memories.content for similarity() scoring.
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_memories_content_trgm
CREATE INDEX IF NOT EXISTS idx_memories_content_trgm
ON memories USING gin (content gin_trgm_ops);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Risk of Table Locks and Downtime on Large Tables

Creating GIN indexes without the CONCURRENTLY keyword (lines 12-17) can lock writes to the messages and memories tables, potentially causing downtime in production environments with large datasets. Although the comments warn about this, the migration itself does not enforce safe index creation. To mitigate this risk, consider splitting index creation into a separate migration using CONCURRENTLY, or ensure that this migration is not run on large tables in production. Example:

CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_messages_text_content_trgm
    ON messages USING gin (text_content gin_trgm_ops);

Recommendation:

  • For production environments, manually create these indexes with CONCURRENTLY before running this migration, or modify the migration to use CONCURRENTLY if your migration framework supports it.

@@ -0,0 +1,2 @@
ALTER TABLE sessions ALTER COLUMN owner_id SET DEFAULT '';

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Data Integrity Risk: Setting the default value of owner_id to an empty string ('') may lead to ambiguous or invalid data, as empty strings are not valid identifiers. This can cause issues in application logic and downstream processes.

Recommendation: Consider using NULL as the default if the absence of an owner is valid, or ensure that a valid identifier is always provided.

@@ -0,0 +1,7 @@
-- Backfill empty owner_id before adding constraint.
-- Sessions created before ownership was enforced may have owner_id = ''.
UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id = '';

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential Data Integrity Issue:
The UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id = ''; statement only backfills empty strings, not NULL values. If any rows have owner_id IS NULL, they will not be updated and may violate the new constraint.

Recommended Solution:
Update both empty and NULL values:

UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id IS NULL OR owner_id = '';

UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id = '';

-- Prevent empty owner_id in sessions (enforces NOT NULL + non-empty).
ALTER TABLE sessions ADD CONSTRAINT sessions_owner_id_not_empty CHECK (owner_id != '');

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Constraint Does Not Prevent NULL Values:
The CHECK constraint CHECK (owner_id != '') only prevents empty strings, not NULLs. If the column is not already defined as NOT NULL, NULL values can still be inserted, bypassing the constraint.

Recommended Solution:
Ensure the column is NOT NULL or update the constraint to:

ALTER TABLE sessions ADD CONSTRAINT sessions_owner_id_not_empty CHECK (owner_id IS NOT NULL AND owner_id != '');

Comment on lines 47 to 51
if err != nil {
h.logger.Error("searching messages", "error", err, "user_id", userID, "query", query)
h.logger.Error("searching messages", "error", err, "user_id", userID, "query_len", len(query))
WriteError(w, http.StatusInternalServerError, "search_failed", "failed to search messages", h.logger)
return
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Error Handling Granularity:
The error handling block does not distinguish between user input errors (e.g., malformed queries) and internal server errors. All errors are treated as internal failures, which may result in less informative responses for the client and complicate debugging.

Recommendation:
Consider inspecting the error returned by h.store.SearchMessages and returning more specific error codes/messages for user errors versus internal errors. For example:

if errors.Is(err, session.ErrInvalidQuery) {
    WriteError(w, http.StatusBadRequest, "invalid_query", "query is malformed", h.logger)
    return
}

This improves maintainability and user experience.

Comment on lines 307 to 313
t.Fatalf("NewServer(): %v", err)
}

result, _, err := server.FileInfo(context.Background(), &mcp.CallToolRequest{}, tools.GetFileInfoInput{
result, _, err := server.FileInfo(context.Background(), &mcp.CallToolRequest{}, tools.FileInfoInput{
Path: filepath.Join(h.tempDir, "nonexistent.txt"),
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing Validation of Error Message Content

The test only checks that IsError is true for a non-existent file, but does not verify the error message content. This can lead to missed issues where the error message is unclear or incorrect.

Recommendation:
Add an assertion to check that the error message in result.Content contains relevant information (e.g., mentions file not found):

if !strings.Contains(result.Content, "not found") {
    t.Errorf("unexpected error message: %v", result.Content)
}

This ensures the error feedback is meaningful.

Comment on lines 82 to 88
}

// Env handles the getEnv MCP tool call.
func (s *Server) Env(ctx context.Context, _ *mcp.CallToolRequest, input tools.GetEnvInput) (*mcp.CallToolResult, any, error) {
func (s *Server) Env(ctx context.Context, _ *mcp.CallToolRequest, input tools.EnvInput) (*mcp.CallToolResult, any, error) {
toolCtx := &ai.ToolContext{Context: ctx}
result, err := s.system.Env(toolCtx, input)
if err != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential Security Risk: Sensitive Environment Variables Exposure

The Env handler relies on s.system.Env to filter sensitive environment variables (those containing KEY, SECRET, TOKEN, PASSWORD). If this filtering is not strictly enforced in s.system.Env, there is a risk of exposing sensitive data.

Recommendation:
Ensure that s.system.Env implements robust filtering for sensitive variable names. If not, add explicit checks in this handler to prevent returning values for sensitive variables:

if isSensitive(input.Name) {
    return nil, nil, fmt.Errorf("access to sensitive environment variable denied")
}

Where isSensitive checks for substrings KEY, SECRET, TOKEN, PASSWORD (case-insensitive).

Comment on lines 190 to 196
t.Fatalf("NewServer(): %v", err)
}

result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.GetEnvInput{
result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.EnvInput{
Key: "NONEXISTENT_VAR_12345",
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Insufficient validation of error content for unset environment variable

The test only checks that no Go error is returned and that the MCP result is not an error, but does not verify that the returned content correctly indicates the variable is not set. This could allow silent failures if the implementation changes and the content does not properly reflect the unset state.

Recommendation:
Add a check to ensure the returned content explicitly indicates the variable is not set, e.g.:

if !strings.Contains(textContent.Text, "isSet":false) {
    t.Errorf("GetEnv output should contain isSet:false for unset var: %s", textContent.Text)
}

This ensures the test validates the expected output structure.

Comment on lines 232 to 238

for _, key := range sensitiveKeys {
t.Run(key, func(t *testing.T) {
result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.GetEnvInput{
result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.EnvInput{
Key: key,
})

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lack of validation for error message content when blocking sensitive environment variables

The test only checks that IsError is true for sensitive variables, but does not verify that the error message or content is informative or that the blocking is complete. This could miss cases where the error is not descriptive or the sensitive variable is partially exposed.

Recommendation:
Add assertions to verify that the error content contains a clear message indicating access is blocked, e.g.:

if !strings.Contains(result.Content[0].(*mcp.TextContent).Text, "blocked" ) {
    t.Errorf("GetEnv error message should indicate blocking for sensitive variable %s: %s", key, result.Content[0].(*mcp.TextContent).Text)
}

This ensures the test validates both the error flag and the error message.

Comment on lines 604 to 649
return nil
}

// UpdateDecayScores recalculates decay_score for all active memories.
// Processes per-category with batched UPDATEs to avoid large locks.
// UpdateDecayScores recalculates decay_score for all active memories in a single query.
// Uses a VALUES join to apply per-category lambda rates atomically.
// Does NOT update updated_at to preserve the decay index.
// Returns total number of rows updated.
//
// The Go-side formula must stay in sync with the SQL expression:
//
// Go: math.Exp(-lambda * hours)
// SQL: exp(-$1 * extract(epoch from (now() - updated_at)) / 3600.0)
// SQL: exp(-v.lambda * extract(epoch from (now() - m.updated_at)) / 3600.0)
//
// NOTE: The explicit $1::float8 cast is required because pgx v5 sends
// NOTE: The explicit ::float8 cast is required because pgx v5 sends
// Go float64 as an untyped parameter. When PostgreSQL sees `$1 = 0`,
// it infers the parameter as integer, silently truncating 0.001925 → 0.
// The cast forces float8 inference. See: github.com/jackc/pgx/issues/2125
func (s *Store) UpdateDecayScores(ctx context.Context) (int, error) {
categories := AllCategories()

var total int
for _, cat := range categories {
lambda := cat.DecayLambda()

tag, err := s.pool.Exec(ctx,
`UPDATE memories
SET decay_score = CASE
WHEN $1::float8 = 0.0 THEN 1.0
ELSE LEAST(1.0, exp(-$1::float8 * extract(epoch from (now() - updated_at)) / 3600.0))
END
WHERE active = true
AND superseded_by IS NULL
AND category = $2`,
lambda, string(cat),
)
if err != nil {
return total, fmt.Errorf("updating decay scores for %s: %w", cat, err)
}
total += int(tag.RowsAffected())
// Build VALUES clause and params for a single batched UPDATE.
// Each category contributes ($N::text, $N+1::float8) to the VALUES list.
params := make([]any, 0, len(categories)*2)
valueParts := make([]string, 0, len(categories))
for i, cat := range categories {
paramIdx := i*2 + 1 // $1, $3, $5, $7
valueParts = append(valueParts, fmt.Sprintf("($%d::text, $%d::float8)", paramIdx, paramIdx+1))
params = append(params, string(cat), cat.DecayLambda())
}

query := `UPDATE memories m
SET decay_score = CASE
WHEN v.lambda = 0.0 THEN 1.0
ELSE LEAST(1.0, exp(-v.lambda * extract(epoch from (now() - m.updated_at)) / 3600.0))
END
FROM (VALUES ` + strings.Join(valueParts, ", ") + `) AS v(cat, lambda)
WHERE m.active = true
AND m.superseded_by IS NULL
AND m.category = v.cat`

tag, err := s.pool.Exec(ctx, query, params...)
if err != nil {
return 0, fmt.Errorf("updating decay scores: %w", err)
}

return total, nil
return int(tag.RowsAffected()), nil
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potential SQL Injection Risk in Dynamic VALUES Clause

The dynamic construction of the VALUES clause in UpdateDecayScores (lines 625-643) is currently safe because category names and decay lambdas are controlled by the application and not user input. However, if the set of categories or their names ever becomes user-configurable or influenced by external input, this could introduce a SQL injection vulnerability.

Recommendation:

  • Ensure that category names remain strictly controlled and not user-configurable.
  • If there is any possibility of user input influencing category names, sanitize or validate them rigorously before including them in the query construction.
  • Consider documenting this assumption in the code to prevent future misuse.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant