fix: address code audit findings (24 items) and review fixes #69
Conversation
Koopa0
commented
Feb 21, 2026
- B1: replace panic with log.Fatalf in config mustBind
- B2: remove CONCURRENTLY from trigram indexes migration
- H2: restrict git subcommands to read-only (remove branch/tag)
- H3: stop logging full search queries
- W1-W19: SQL optimization, config validation, security hardening, import grouping, doc fixes
- Review fixes: migration backfill, git branch/tag removal, memory import grouping
```diff
 CREATE INDEX IF NOT EXISTS idx_messages_text_content_trgm
 ON messages USING gin (text_content gin_trgm_ops);

 -- GIN trigram index on memories.content for similarity() scoring.
-CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_memories_content_trgm
+CREATE INDEX IF NOT EXISTS idx_memories_content_trgm
 ON memories USING gin (content gin_trgm_ops);
```
Risk of Table Locks and Downtime on Large Tables
Creating GIN indexes without the `CONCURRENTLY` keyword (lines 12-17) can lock writes to the `messages` and `memories` tables, potentially causing downtime in production environments with large datasets. Although the comments warn about this, the migration itself does not enforce safe index creation. To mitigate this risk, consider splitting index creation into a separate migration using `CONCURRENTLY`, or ensure that this migration is not run on large tables in production. Example:

```sql
CREATE INDEX CONCURRENTLY IF NOT EXISTS idx_messages_text_content_trgm
ON messages USING gin (text_content gin_trgm_ops);
```

Recommendation:
- For production environments, manually create these indexes with `CONCURRENTLY` before running this migration, or modify the migration to use `CONCURRENTLY` if your migration framework supports it.
```diff
@@ -0,0 +1,2 @@
+ALTER TABLE sessions ALTER COLUMN owner_id SET DEFAULT '';
```
Data Integrity Risk: Setting the default value of `owner_id` to an empty string (`''`) may lead to ambiguous or invalid data, as empty strings are not valid identifiers. This can cause issues in application logic and downstream processes.
Recommendation: Consider using `NULL` as the default if the absence of an owner is valid, or ensure that a valid identifier is always provided.
```diff
@@ -0,0 +1,7 @@
+-- Backfill empty owner_id before adding constraint.
+-- Sessions created before ownership was enforced may have owner_id = ''.
+UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id = '';
```
Potential Data Integrity Issue:
The `UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id = '';` statement only backfills empty strings, not NULL values. If any rows have `owner_id IS NULL`, they will not be updated and may violate the new constraint.
Recommended Solution:
Update both empty and NULL values:

```sql
UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id IS NULL OR owner_id = '';
```

```sql
UPDATE sessions SET owner_id = 'anonymous' WHERE owner_id = '';

-- Prevent empty owner_id in sessions (enforces NOT NULL + non-empty).
ALTER TABLE sessions ADD CONSTRAINT sessions_owner_id_not_empty CHECK (owner_id != '');
```
Constraint Does Not Prevent NULL Values:
The CHECK constraint `CHECK (owner_id != '')` only prevents empty strings, not NULLs. If the column is not already defined as NOT NULL, NULL values can still be inserted, bypassing the constraint.
Recommended Solution:
Ensure the column is NOT NULL or update the constraint to:

```sql
ALTER TABLE sessions ADD CONSTRAINT sessions_owner_id_not_empty CHECK (owner_id IS NOT NULL AND owner_id != '');
```

```diff
 if err != nil {
-	h.logger.Error("searching messages", "error", err, "user_id", userID, "query", query)
+	h.logger.Error("searching messages", "error", err, "user_id", userID, "query_len", len(query))
 	WriteError(w, http.StatusInternalServerError, "search_failed", "failed to search messages", h.logger)
 	return
 }
```
Error Handling Granularity:
The error handling block does not distinguish between user input errors (e.g., malformed queries) and internal server errors. All errors are treated as internal failures, which may result in less informative responses for the client and complicate debugging.
Recommendation:
Consider inspecting the error returned by `h.store.SearchMessages` and returning more specific error codes/messages for user errors versus internal errors. For example:

```go
if errors.Is(err, session.ErrInvalidQuery) {
	WriteError(w, http.StatusBadRequest, "invalid_query", "query is malformed", h.logger)
	return
}
```

This improves maintainability and user experience.
```diff
 	t.Fatalf("NewServer(): %v", err)
 }

-result, _, err := server.FileInfo(context.Background(), &mcp.CallToolRequest{}, tools.GetFileInfoInput{
+result, _, err := server.FileInfo(context.Background(), &mcp.CallToolRequest{}, tools.FileInfoInput{
 	Path: filepath.Join(h.tempDir, "nonexistent.txt"),
 })
```
Missing Validation of Error Message Content
The test only checks that `IsError` is true for a non-existent file, but does not verify the error message content. This can lead to missed issues where the error message is unclear or incorrect.
Recommendation:
Add an assertion to check that the error message in `result.Content` contains relevant information (e.g., mentions file not found):

```go
if !strings.Contains(result.Content[0].(*mcp.TextContent).Text, "not found") {
	t.Errorf("unexpected error message: %v", result.Content)
}
```

This ensures the error feedback is meaningful.
```diff
 }

 // Env handles the getEnv MCP tool call.
-func (s *Server) Env(ctx context.Context, _ *mcp.CallToolRequest, input tools.GetEnvInput) (*mcp.CallToolResult, any, error) {
+func (s *Server) Env(ctx context.Context, _ *mcp.CallToolRequest, input tools.EnvInput) (*mcp.CallToolResult, any, error) {
 	toolCtx := &ai.ToolContext{Context: ctx}
 	result, err := s.system.Env(toolCtx, input)
 	if err != nil {
```
Potential Security Risk: Sensitive Environment Variables Exposure
The `Env` handler relies on `s.system.Env` to filter sensitive environment variables (those containing KEY, SECRET, TOKEN, PASSWORD). If this filtering is not strictly enforced in `s.system.Env`, there is a risk of exposing sensitive data.
Recommendation:
Ensure that `s.system.Env` implements robust filtering for sensitive variable names. If not, add explicit checks in this handler to prevent returning values for sensitive variables:

```go
if isSensitive(input.Key) {
	return nil, nil, fmt.Errorf("access to sensitive environment variable denied")
}
```

Where `isSensitive` checks for substrings KEY, SECRET, TOKEN, PASSWORD (case-insensitive).
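A minimal sketch of such an `isSensitive` check, matching exactly the four substrings the review names (the function itself and the example variable names are illustrative, not from the PR):

```go
package main

import (
	"fmt"
	"strings"
)

// isSensitive reports whether an environment variable name looks like a
// secret, by case-insensitive substring match against the markers listed
// in the review: KEY, SECRET, TOKEN, PASSWORD.
func isSensitive(name string) bool {
	upper := strings.ToUpper(name)
	for _, marker := range []string{"KEY", "SECRET", "TOKEN", "PASSWORD"} {
		if strings.Contains(upper, marker) {
			return true
		}
	}
	return false
}

func main() {
	fmt.Println(isSensitive("OPENAI_API_KEY")) // true
	fmt.Println(isSensitive("db_password"))    // true (case-insensitive)
	fmt.Println(isSensitive("HOME"))           // false
}
```

Substring matching is deliberately broad; a deny-by-default allowlist would be stricter still, at the cost of more configuration.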
```diff
 	t.Fatalf("NewServer(): %v", err)
 }

-result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.GetEnvInput{
+result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.EnvInput{
 	Key: "NONEXISTENT_VAR_12345",
 })
```
Insufficient validation of error content for unset environment variable
The test only checks that no Go error is returned and that the MCP result is not an error, but does not verify that the returned content correctly indicates the variable is not set. This could allow silent failures if the implementation changes and the content does not properly reflect the unset state.
Recommendation:
Add a check to ensure the returned content explicitly indicates the variable is not set, e.g.:

```go
if !strings.Contains(textContent.Text, `"isSet":false`) {
	t.Errorf("GetEnv output should contain isSet:false for unset var: %s", textContent.Text)
}
```

This ensures the test validates the expected output structure.
```diff
 for _, key := range sensitiveKeys {
 	t.Run(key, func(t *testing.T) {
-		result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.GetEnvInput{
+		result, _, err := server.Env(context.Background(), &mcp.CallToolRequest{}, tools.EnvInput{
 			Key: key,
 		})
```
Lack of validation for error message content when blocking sensitive environment variables
The test only checks that `IsError` is true for sensitive variables, but does not verify that the error message or content is informative or that the blocking is complete. This could miss cases where the error is not descriptive or the sensitive variable is partially exposed.
Recommendation:
Add assertions to verify that the error content contains a clear message indicating access is blocked, e.g.:

```go
if !strings.Contains(result.Content[0].(*mcp.TextContent).Text, "blocked") {
	t.Errorf("GetEnv error message should indicate blocking for sensitive variable %s: %s", key, result.Content[0].(*mcp.TextContent).Text)
}
```

This ensures the test validates both the error flag and the error message.
```diff
 	return nil
 }

-// UpdateDecayScores recalculates decay_score for all active memories.
-// Processes per-category with batched UPDATEs to avoid large locks.
+// UpdateDecayScores recalculates decay_score for all active memories in a single query.
+// Uses a VALUES join to apply per-category lambda rates atomically.
 // Does NOT update updated_at to preserve the decay index.
 // Returns total number of rows updated.
 //
 // The Go-side formula must stay in sync with the SQL expression:
 //
 //	Go:  math.Exp(-lambda * hours)
-//	SQL: exp(-$1 * extract(epoch from (now() - updated_at)) / 3600.0)
+//	SQL: exp(-v.lambda * extract(epoch from (now() - m.updated_at)) / 3600.0)
 //
-// NOTE: The explicit $1::float8 cast is required because pgx v5 sends
+// NOTE: The explicit ::float8 cast is required because pgx v5 sends
 // Go float64 as an untyped parameter. When PostgreSQL sees `$1 = 0`,
 // it infers the parameter as integer, silently truncating 0.001925 → 0.
 // The cast forces float8 inference. See: github.com/jackc/pgx/issues/2125
 func (s *Store) UpdateDecayScores(ctx context.Context) (int, error) {
 	categories := AllCategories()

-	var total int
-	for _, cat := range categories {
-		lambda := cat.DecayLambda()
-
-		tag, err := s.pool.Exec(ctx,
-			`UPDATE memories
-			 SET decay_score = CASE
-				WHEN $1::float8 = 0.0 THEN 1.0
-				ELSE LEAST(1.0, exp(-$1::float8 * extract(epoch from (now() - updated_at)) / 3600.0))
-			 END
-			 WHERE active = true
-			 AND superseded_by IS NULL
-			 AND category = $2`,
-			lambda, string(cat),
-		)
-		if err != nil {
-			return total, fmt.Errorf("updating decay scores for %s: %w", cat, err)
-		}
-		total += int(tag.RowsAffected())
+	// Build VALUES clause and params for a single batched UPDATE.
+	// Each category contributes ($N::text, $N+1::float8) to the VALUES list.
+	params := make([]any, 0, len(categories)*2)
+	valueParts := make([]string, 0, len(categories))
+	for i, cat := range categories {
+		paramIdx := i*2 + 1 // $1, $3, $5, $7
+		valueParts = append(valueParts, fmt.Sprintf("($%d::text, $%d::float8)", paramIdx, paramIdx+1))
+		params = append(params, string(cat), cat.DecayLambda())
 	}

+	query := `UPDATE memories m
+		SET decay_score = CASE
+			WHEN v.lambda = 0.0 THEN 1.0
+			ELSE LEAST(1.0, exp(-v.lambda * extract(epoch from (now() - m.updated_at)) / 3600.0))
+		END
+		FROM (VALUES ` + strings.Join(valueParts, ", ") + `) AS v(cat, lambda)
+		WHERE m.active = true
+		AND m.superseded_by IS NULL
+		AND m.category = v.cat`
+
+	tag, err := s.pool.Exec(ctx, query, params...)
+	if err != nil {
+		return 0, fmt.Errorf("updating decay scores: %w", err)
+	}
+
-	return total, nil
+	return int(tag.RowsAffected()), nil
 }
```
Potential SQL Injection Risk in Dynamic VALUES Clause
The dynamic construction of the `VALUES` clause in `UpdateDecayScores` (lines 625-643) is currently safe because category names and decay lambdas are controlled by the application and not user input. However, if the set of categories or their names ever becomes user-configurable or influenced by external input, this could introduce a SQL injection vulnerability.
Recommendation:
- Ensure that category names remain strictly controlled and not user-configurable.
- If there is any possibility of user input influencing category names, sanitize or validate them rigorously before including them in the query construction.
- Consider documenting this assumption in the code to prevent future misuse.