add: implement auto-partitioning function for matches table, integrate partition creation at app startup, and improve team membership checks and panic recovery logic #1

4RH1T3CT0R7 merged 2 commits into main from …
Pull request overview
This PR improves runtime resilience and security by adding automatic Postgres partition creation for the matches table at startup, tightening authorization checks for program uploads, and enhancing error/edge-case handling in worker, WebSocket, and domain services.
Changes:
- Add a Postgres function + startup hook to auto-create monthly `matches` partitions.
- Enforce team membership for multipart program uploads and add related handler tests.
- Improve operational robustness: recover worker panics with respawn logic, add WebSocket broadcast timeout behavior, and introduce a dead-letter queue for malformed queue items.
Reviewed changes
Copilot reviewed 35 out of 36 changed files in this pull request and generated 8 comments.
| File | Description |
|---|---|
| web/src/pages/TournamentDetail.tsx | Forces WebSocket hook to be enabled for tournament updates. |
| web/src/pages/GameView.tsx | Sanitizes markdown link URLs before rendering. |
| web/src/pages/GameDetail.tsx | Sanitizes markdown link URLs before rendering. |
| migrations/000024_add_auto_partition_function.up.sql | Adds create_matches_partition_if_needed() and runs it once. |
| migrations/000024_add_auto_partition_function.down.sql | Drops the auto-partition function. |
| internal/infrastructure/db/db.go | Adds EnsureMatchPartitions() startup helper. |
| cmd/api/main.go | Calls EnsureMatchPartitions() during API startup; wires teamRepo into ProgramHandler. |
| cmd/worker/main.go | Calls EnsureMatchPartitions() during worker startup. |
| internal/infrastructure/db/match_repository.go | Removes N+1 match fetching by loading all tournament matches in one query. |
| internal/infrastructure/db/tournament_repository.go | Expands tournament update query to include additional fields. |
| internal/api/handlers/program.go | Adds TeamMembershipChecker and enforces membership on file upload. |
| internal/api/handlers/program_test.go | Updates handler construction + adds tests for membership enforcement. |
| internal/domain/team/service.go | Prevents deleting the last-member team during an active tournament. |
| internal/domain/team/service_test.go | Adds coverage for active-tournament delete restriction. |
| internal/worker/pool.go | Adds panic recovery with logging + respawn; ensures scaling respects MinWorkers after panic. |
| internal/websocket/hub.go | Adds a 1s timeout retry before dropping broadcast messages. |
| internal/infrastructure/queue/queue.go | On unmarshal failure, pushes the raw item to a dead-letter queue and logs details. |
| internal/domain/rating/elo.go | Clamps Elo rating floor at 0. |
| internal/domain/rating/elo_test.go | Updates tests to reflect rating floor behavior. |
| internal/domain/auth/service.go | Adds dummy bcrypt compare on user-not-found and enforces email uniqueness in profile update. |
| internal/domain/auth/service_test.go | Adds/updates tests for email uniqueness checks. |
```go
p.wg.Add(1)
go func() {
	defer p.wg.Done()
	defer p.totalWorkers.Add(-1)
	defer func() {
		if r := recover(); r != nil {
			p.log.Error("Worker panic recovered",
				zap.Int32("worker_id", current),
				zap.Any("panic", r),
				zap.String("stack", string(debug.Stack())),
			)
			// Respawn worker if pool is still running and below minimum capacity
			if p.ctx.Err() == nil {
				time.AfterFunc(time.Second, func() {
					if p.ctx.Err() != nil {
						return // Pool was stopped, do not respawn
					}
					if int(p.totalWorkers.Load()) < p.config.MinWorkers {
						p.log.Info("Respawning worker after panic",
							zap.Int32("current_workers", p.totalWorkers.Load()),
							zap.Int("min_workers", p.config.MinWorkers),
						)
						p.spawnWorker()
					}
				})
			}
```
When a worker panics, the goroutine exits without removing its workerCancel from p.workerCancels. Because the respawn path appends a new cancel, repeated panics can grow workerCancels unbounded and make scale-down accounting inaccurate. Consider tracking cancels by worker ID (map) or removing the cancel in a deferred cleanup, and ensure workerCancel() is called on worker exit to release context resources.
```go
// Channel full, try with timeout before dropping
timer := time.NewTimer(time.Second)
defer timer.Stop()
select {
case h.broadcast <- message:
case <-timer.C:
	h.log.Error("Broadcast channel full, message dropped after 1s timeout",
		zap.String("tournament_id", tournamentID.String()),
		zap.String("type", messageType),
	)
}
```
Broadcast now blocks up to 1s when the broadcast channel is full. This can tie up caller goroutines under load and amplify backpressure (potentially impacting HTTP request latency if broadcasts happen inline). If dropping is acceptable, consider keeping it non-blocking (or using a much shorter timeout / metrics-based shedding) and emitting a counter/metric when drops occur.
Suggested change:

```diff
-// Channel full, try with timeout before dropping
-timer := time.NewTimer(time.Second)
-defer timer.Stop()
-select {
-case h.broadcast <- message:
-case <-timer.C:
-	h.log.Error("Broadcast channel full, message dropped after 1s timeout",
-		zap.String("tournament_id", tournamentID.String()),
-		zap.String("type", messageType),
-	)
-}
+// Channel full, drop message immediately to avoid blocking caller
+h.log.Error("Broadcast channel full, message dropped",
+	zap.String("tournament_id", tournamentID.String()),
+	zap.String("type", messageType),
+)
```
```go
qm.log.Error("Failed to unmarshal match, moved to dead-letter queue",
	zap.Error(err),
	zap.String("raw_data", result[1]),
```
Logging raw_data from Redis on JSON unmarshal failure can significantly increase log volume and may unintentionally record large/untrusted payloads. Consider logging a truncated prefix (or a hash/length), and relying on the dead-letter queue entry for full inspection.
Suggested change:

```diff
-qm.log.Error("Failed to unmarshal match, moved to dead-letter queue",
-	zap.Error(err),
-	zap.String("raw_data", result[1]),
+// Log only a truncated prefix of the raw data to avoid large/untrusted payloads in logs.
+rawData := result[1]
+const maxLoggedRawDataLen = 512
+truncatedRawData := rawData
+if len(truncatedRawData) > maxLoggedRawDataLen {
+	truncatedRawData = truncatedRawData[:maxLoggedRawDataLen]
+}
+qm.log.Error("Failed to unmarshal match, moved to dead-letter queue",
+	zap.Error(err),
+	zap.String("raw_data_truncated", truncatedRawData),
+	zap.Int("raw_data_length", len(rawData)),
```
```sql
current_partition := 'matches_' || to_char(current_month, 'YYYY_MM');
IF NOT EXISTS (
    SELECT 1 FROM pg_class WHERE relname = current_partition
) THEN
```
The partition existence check uses pg_class WHERE relname = ..., which is schema-agnostic and doesn’t verify that the found relation is actually a partition of matches. In environments with multiple schemas or an unrelated table with the same name, this can skip creating the needed partition. Consider using to_regclass() with an explicit schema (or joining pg_class to pg_namespace and verifying pg_inherits) to make the check unambiguous.
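A schema-qualified variant of the check could look like the sketch below, which joins `pg_inherits` so the match is only satisfied by an actual partition of `matches`. It assumes the table lives in the `public` schema; adjust the schema name to the deployment's search path.

```sql
-- Sketch: resolve the name within an explicit schema and verify the relation
-- is really a partition (child) of public.matches via pg_inherits.
IF NOT EXISTS (
    SELECT 1
    FROM pg_inherits i
    JOIN pg_class child ON child.oid = i.inhrelid
    JOIN pg_namespace n ON n.oid = child.relnamespace
    WHERE i.inhparent = 'public.matches'::regclass
      AND n.nspname = 'public'
      AND child.relname = current_partition
) THEN
```

Because `pg_inherits` only lists parent/child relationships, an unrelated table that happens to be named `matches_2025_01` in another schema can no longer suppress partition creation.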
```go
// NewProgramHandler creates a new program handler
func NewProgramHandler(
	programRepo ProgramRepository,
	tournamentRepo TournamentParticipantAdder,
	matchScheduler MatchScheduler,
	gameLookup GameLookup,
	matchChecker MatchExistenceChecker,
	roundChecker RoundCompletionChecker,
	teamChecker TeamMembershipChecker,
	uploadDir string,
	log *logger.Logger,
) *ProgramHandler {
```
NewProgramHandler now requires an extra teamChecker argument, but there is still at least one call site using the old signature (tests/benchmark/api_benchmark_test.go:127), which will break compilation of the test package. Update that call site (and any others) to pass a TeamMembershipChecker (or nil if appropriate).
```go
// EnsureMatchPartitions creates matches table partitions for the current and next month.
// Called at application startup to guarantee the partitions exist.
func (db *DB) EnsureMatchPartitions(ctx context.Context) error {
	_, err := db.ExecContext(ctx, "SELECT create_matches_partition_if_needed()")
	if err != nil {
		db.log.Warn("Failed to ensure match partitions (function may not exist yet)", zap.Error(err))
		return nil // Non-fatal: partitions may already exist or function not yet migrated
	}
	db.log.Info("Match partitions verified")
	return nil
```
EnsureMatchPartitions logs and returns nil for all errors, which can silently mask real failures (e.g., permissions/DDL errors) and still lead to runtime INSERT failures once partitions are missing. Consider only suppressing the specific “function does not exist” case, and return other errors so callers can decide whether startup should fail.
```go
_ = database.EnsureMatchPartitions(context.Background())
```
Startup calls EnsureMatchPartitions with context.Background() and ignores the returned error. If the DB stalls or DDL fails, this can hang startup or hide a critical operational issue (missing partitions). Consider using a bounded startup context (timeout) and handling/logging the returned error explicitly.
Suggested change:

```diff
-_ = database.EnsureMatchPartitions(context.Background())
+partitionsCtx, partitionsCancel := context.WithTimeout(context.Background(), 30*time.Second)
+defer partitionsCancel()
+if err := database.EnsureMatchPartitions(partitionsCtx); err != nil {
+	log.Fatal("Failed to ensure match partitions", zap.Error(err))
+}
```
```go
// Ensure matches table partitions exist
_ = database.EnsureMatchPartitions(context.Background())
```
Startup calls EnsureMatchPartitions with context.Background() and discards the returned error. If partition creation fails (permissions, lock contention, etc.) you may only discover it later via INSERT failures. Consider using a timeout context and handling the error (even if you choose to continue startup).
Suggested change:

```diff
-_ = database.EnsureMatchPartitions(context.Background())
+matchPartitionsCtx, matchPartitionsCancel := context.WithTimeout(context.Background(), 30*time.Second)
+defer matchPartitionsCancel()
+if err := database.EnsureMatchPartitions(matchPartitionsCtx); err != nil {
+	log.Error("Failed to ensure match partitions", zap.Error(err))
+}
```