Skip to content

Conversation

@tonyalaribe
Copy link
Contributor

This commit reorganizes the migration files to:

  1. Make all queries safe for re-runs using IF NOT EXISTS/IF EXISTS patterns
  2. Consolidate ALTER TABLE ADD COLUMN statements into original CREATE TABLE
  3. Remove redundant migrations where tables/views were created then dropped
  4. Remove migrations that were superseded by later ones
  5. Renumber files for a clean sequential order (0001-0007)

Key changes:

  • 0001: Merged all ALTER TABLE statements, wrapped CREATE TYPE in DO blocks with EXCEPTION handling, removed materialized views that get dropped, included final procedure version, added ON CONFLICT DO NOTHING to inserts
  • 0002: Changed retention from 14 to 30 days, wrapped CREATE TYPE in DO blocks, consolidated dashboard columns (teams, file_path, file_sha), absorbed log_pattern and summary_pattern columns
  • 0003: Unchanged (already safe)
  • 0004: Rebuilt issues table without DEFERRABLE constraint (it was dropped anyway)
  • 0005: Simplified teams table (removed redundant ALTER TABLE statements)
  • 0006: Simplified github_sync (removed dashboard column ALTERs now in 0002)
  • 0007: Unchanged (already safe)

Removed 14 redundant migration files that were either:

  • Superseded by later migrations (0006, 0008, 0009)
  • Created objects that were later dropped (0010, 0011, 0012)
  • Already consolidated into earlier migrations (0013, 0014, 0015, 0016x2)
  • Updating procedures already in 0001 (0018, 0021)
  • Self-canceling operations (0019 added then removed columns)

Closes #

How to test

Checklist

  • Make sure you have described your changes and added all relevant screenshots or data.
  • Make sure your changes are tested (stories and/or unit, integration, or end-to-end tests).
  • Make sure to add/update documentation regarding your changes (or request one from the team).
  • You are NOT deprecating/removing a feature.

This commit reorganizes the migration files to:

1. Make all queries safe for re-runs using IF NOT EXISTS/IF EXISTS patterns
2. Consolidate ALTER TABLE ADD COLUMN statements into original CREATE TABLE
3. Remove redundant migrations where tables/views were created then dropped
4. Remove migrations that were superseded by later ones
5. Renumber files for a clean sequential order (0001-0007)

Key changes:
- 0001: Merged all ALTER TABLE statements, wrapped CREATE TYPE in DO blocks
        with EXCEPTION handling, removed materialized views that get dropped,
        included final procedure version, added ON CONFLICT DO NOTHING to inserts
- 0002: Changed retention from 14 to 30 days, wrapped CREATE TYPE in DO blocks,
        consolidated dashboard columns (teams, file_path, file_sha),
        absorbed log_pattern and summary_pattern columns
- 0003: Unchanged (already safe)
- 0004: Rebuilt issues table without DEFERRABLE constraint (it was dropped anyway)
- 0005: Simplified teams table (removed redundant ALTER TABLE statements)
- 0006: Simplified github_sync (removed dashboard column ALTERs now in 0002)
- 0007: Unchanged (already safe)

Removed 14 redundant migration files that were either:
- Superseded by later migrations (0006, 0008, 0009)
- Created objects that were later dropped (0010, 0011, 0012)
- Already consolidated into earlier migrations (0013, 0014, 0015, 0016x2)
- Updating procedures already in 0001 (0018, 0021)
- Self-canceling operations (0019 added then removed columns)
@claude
Copy link

claude bot commented Jan 3, 2026

PR Review: Database Migration Consolidation

Overall Assessment

This migration consolidation is well-intentioned but has critical issues that need addressing before merging. The approach of making migrations idempotent and consolidating redundant files is sound, but there are bugs and missed optimizations.


🚨 Critical Issues

1. Duplicate Index Creation (0001_create_monoscope_tables.sql:485)

CREATE INDEX IF NOT EXISTS idx_apis_request_dumps_project_id_shape_hash ON apis.request_dumps(project_id, shape_hash, created_at DESC);
CREATE INDEX IF NOT EXISTS idx_apis_request_dumps_project_id_shape_hash ON apis.request_dumps(project_id, shape_hash, created_at DESC);

Issue: Identical index created twice. This appears to be a merge artifact.
Fix: Remove one of the duplicate lines.

2. Missing teams Column in 0006 (migration deleted but column referenced)

The PR removes static/migrations/0006_github_sync.sql but monitors.query_monitors table references a teams column that appears to have been added there. Check if this column addition was properly consolidated into 0001.

Looking at line 684 in the diff:

ALTER TABLE monitors.query_monitors ADD COLUMN IF NOT EXISTS visualization_type TEXT NOT NULL DEFAULT 'timeseries';

The teams column is added inline in 0001:684-685, but the PR description claims migration 0006 was simplified by "removing dashboard column ALTERs now in 0002". Verify the teams column consolidation is correct.

3. Dangerous Procedure Rewrite (0001:729-741)

CREATE OR REPLACE PROCEDURE monitors.check_triggered_query_monitors(job_id int, config jsonb)
LANGUAGE PLPGSQL AS $$
BEGIN
    INSERT INTO background_jobs (run_at, status, payload)
    VALUES (NOW(), 'queued', jsonb_build_object('tag', 'QueryMonitorsCheck'));

Issue: The new implementation completely changes behavior - it no longer evaluates query monitors or passes their IDs. The old version had actual logic to check thresholds and queue specific monitors. The new version just blindly queues a generic job.

This is likely a breaking change unless the Haskell backend was updated to handle the new payload format. Verify that the backend code in the PR's parent commit handles 'QueryMonitorsCheck' without monitor IDs.


⚠️ Code Quality Issues

4. Redundant teams Declaration

In 0001:685, teams UUID[] DEFAULT '{}' is added to query_monitors, but there's no corresponding consolidation comment. Add a comment like -- Consolidated from migration 0006.

5. Inconsistent Column Ordering

When consolidating columns into CREATE TABLE, they're appended at the end with a comment. This is fine, but consider grouping related columns (e.g., all notification columns together) for readability.

6. Missing Transaction Safety

Migrations don't use explicit transactions. While PostgreSQL DDL is transactional, consider wrapping each migration in:

BEGIN;
-- migration content
COMMIT;

This makes rollback intent clearer and prevents partial application if psql is run with --single-transaction=off.


✅ Improvements & Best Practices

7. Excellent: Idempotent Type Creation

DO $$ BEGIN
  CREATE TYPE notification_channel_enum AS ENUM (...);
  EXCEPTION WHEN duplicate_object THEN null;
END $$;

This pattern is perfect for re-runnable migrations.

8. Good: Consolidated ALTER TABLEs

Merging multiple ALTER TABLE statements into the original CREATE TABLE reduces migration count and improves clarity.

9. Good: ON CONFLICT DO NOTHING

INSERT into projects.projects (...) VALUES (...) ON CONFLICT DO NOTHING;

Makes seed data idempotent.


🔍 Performance Considerations

10. Retention Policy Change (0002:206)

-SELECT add_retention_policy('otel_logs_and_spans',INTERVAL '14 days',true);
+SELECT add_retention_policy('otel_logs_and_spans',INTERVAL '30 days',true);

Impact: Doubles storage requirements for telemetry data. Ensure this was intentional and infrastructure can handle it.

11. Missing Index Optimization Opportunity

The apis.errors table adds columns first_trace_id and recent_trace_id with a composite index. Consider if individual indexes would be more useful depending on query patterns:

CREATE INDEX IF NOT EXISTS idx_errors_recent_trace ON apis.errors (recent_trace_id) WHERE recent_trace_id IS NOT NULL;
CREATE INDEX IF NOT EXISTS idx_errors_first_trace ON apis.errors (first_trace_id) WHERE first_trace_id IS NOT NULL;

This uses partial indexes to save space since these columns are nullable.


🧪 Testing Concerns

12. Migration Re-run Testing

Have you tested that these migrations can be re-run safely on an existing database? The idempotent patterns look good, but actual testing would confirm.

13. Data Migration

The PR removes migrations that created and dropped materialized views. Verify no production databases are stuck in an intermediate state that would break when running the consolidated migrations.


📝 Suggested Changes

  1. Remove duplicate index (line 485 in 0001)
  2. Verify check_triggered_query_monitors behavior change - confirm Haskell backend handles new payload
  3. Add consolidation comments for all merged columns
  4. Consider wrapping migrations in explicit transactions
  5. Review partial index opportunity for trace_id columns
  6. Test migration re-runs on a copy of production data

Summary

  • 3 Critical bugs (duplicate index, missing column verification, dangerous procedure change)
  • Good consolidation approach overall
  • Safety patterns (idempotent creates) are well-implemented
  • Missing validation that backend code matches new SQL procedure behavior

Recommendation: Request changes - fix the duplicate index and verify the procedure rewrite is intentional and coordinated with backend changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants