Skip to content

Comments

fix: normalize schema-qualified identifiers in SQL strings#157

Merged
tianzhou merged 1 commit intopgplex:mainfrom
p-c-h-b:fix/schema-normalization
Nov 8, 2025
Merged

fix: normalize schema-qualified identifiers in SQL strings#157
tianzhou merged 1 commit intopgplex:mainfrom
p-c-h-b:fix/schema-normalization

Conversation

@p-c-h-b
Copy link
Contributor

@p-c-h-b p-c-h-b commented Nov 8, 2025

This fixes a bug where schema-qualified identifiers appearing inside SQL strings were not normalized when the desired schema was applied to a temporary schema during plan validation.

Problem

When pgschema validates the desired state by applying SQL to a temporary schema (e.g., pgschema_tmp_20251107_180338), it inspects the result and builds an intermediate representation (IR). While it normalizes schema names in most places, it didn't normalize schema qualifications inside SQL text fields like:

  • Function bodies: SELECT * FROM pgschema_tmp_20251107_180338.users
  • Policy expressions: (pgschema_tmp_20251107_180338.orders o JOIN pgschema_tmp_20251107_180338.users u ...)
  • Trigger function references
  • Column defaults and generated expressions
  • Index WHERE clauses

The generated diff would then emit SQL referencing the non-existent temporary schema, causing "schema does not exist" errors when applied to the target database.

Solution

Added comprehensive string replacement to normalizeSchemaNames():

  1. New helper function newSchemaStringReplacer() that creates a string replacer handling all forms of schema qualification:

    • Quoted identifiers: "schema".table
    • Unquoted identifiers: schema.table
    • Bare schema names: "schema" or schema
  2. Extended normalization to cover all textual SQL fields:

    • Column data types, defaults, generated expressions
    • Index WHERE clauses
    • Trigger functions and conditions
    • RLS policy USING and WITH CHECK expressions
    • View definitions
    • Function/procedure definitions and parameter types
    • Type definitions, base types, and constraints
    • Sequence data types and owned-by references
    • Aggregate function references and state types

Changes

  • Added newSchemaStringReplacer() helper function in cmd/plan/plan.go
  • Modified normalizeSchemaNames() to apply string replacement to all relevant text fields

Fixes #155 (partially - this is the second of three fixes mentioned in the issue)

This fixes a bug where schema-qualified identifiers appearing inside
SQL strings (defaults, policy expressions, trigger bodies, function
definitions, etc.) were not normalized when the desired schema was
applied to a temporary schema during plan validation.

The diff would then emit SQL referencing the temporary schema name
(e.g., 'pgschema_tmp_20251107_180338.users') instead of the target
schema, causing 'schema does not exist' errors.

Changes:
- Added newSchemaStringReplacer() helper that creates a replacer for
  schema-qualified identifiers in various forms (quoted, unquoted, etc.)
- Extended normalizeSchemaNames() to normalize all textual fields:
  - Column data types, defaults, and generated expressions
  - Index WHERE clauses
  - Trigger functions and conditions
  - RLS policy USING and WITH CHECK expressions
  - View definitions
  - Function/procedure definitions and parameter types
  - Type definitions, base types, and constraints
  - Sequence data types and owned-by references
  - Aggregate function references and state types

Fixes pgplex#155 (partially - this is the second of three fixes)
Copilot AI review requested due to automatic review settings November 8, 2025 03:03
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR enhances the normalizeSchemaNames function to comprehensively replace schema references in SQL expressions and string fields across all database object types. The main improvement is the introduction of a helper function newSchemaStringReplacer that handles schema name replacements in various SQL string formats (quoted, qualified, etc.).

Key changes:

  • Added newSchemaStringReplacer helper function for consistent schema name replacement in SQL strings
  • Extended normalization to cover SQL expressions in constraints, columns, indexes, triggers, policies, views, functions, procedures, types, sequences, and aggregates
  • Improved handling of schema-qualified references with proper quoting support

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +542 to +547
replacements := []string{
fmt.Sprintf(`"%s".`, fromSchema), fmt.Sprintf(`"%s".`, toSchema),
fmt.Sprintf(`%s.`, fromSchema), fmt.Sprintf(`%s.`, toSchema),
fmt.Sprintf(`"%s"`, fromSchema), fmt.Sprintf(`"%s"`, toSchema),
fromSchema, toSchema,
}
Copy link

Copilot AI Nov 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The replacement order in strings.NewReplacer can cause incorrect results. More specific patterns (like \"schema\".) should be replaced before less specific ones (like bare schema). However, strings.NewReplacer processes replacements in order, and a bare schema name could be replaced first, breaking more specific patterns. For example, if fromSchema is "temp" and toSchema is "public", the string \"temp\".table might first match the bare temp replacement, becoming \"public\".table instead of \"public\".table. While this specific example works, consider cases where partial matches could cause issues. The safer approach is to process replacements from most specific to least specific, which requires either reordering the slice to ensure longer patterns are first, or using a different replacement strategy like regexp with word boundaries.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@tianzhou tianzhou left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@tianzhou tianzhou merged commit ae952f4 into pgplex:main Nov 8, 2025
7 checks passed
@tianzhou
Copy link
Contributor

tianzhou commented Nov 8, 2025

follow up in #159

@p-c-h-b p-c-h-b deleted the fix/schema-normalization branch November 8, 2025 14:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Fix dependency handling: FK ordering, schema normalization, and function replacements

2 participants