Prevent TiCDC from Using the Same TiDB Cluster as Both Upstream and Downstream #3949

@wlwilliamx

Background

TiCDC allows replicating data from an upstream TiDB cluster to a downstream system, including another TiDB cluster. However, using the same TiDB cluster as both the upstream and downstream can cause unexpected replication issues, including feedback loops and data inconsistencies.

To prevent this, we need a mechanism to reliably determine whether the upstream and downstream TiDB clusters are the same and reject such configurations.

Proposed Solution

We will compare the upstream and downstream clusters by their Cluster IDs, reading the downstream Cluster ID from TiDB’s system table (mysql.tidb).

  1. Retrieve Upstream Cluster ID

    • TiCDC already obtains the upstream Cluster ID via PD’s gRPC API.
  2. Retrieve Downstream Cluster ID

    • TiDB will store the Cluster ID in the system table mysql.tidb.
    • TiCDC will query this system table via SQL to fetch the downstream Cluster ID.
  3. Comparison and Validation

    • If the upstream and downstream Cluster IDs match, TiCDC will reject changefeed creation, update, and resume requests with an appropriate error message.

Tasks

  • Store the Cluster ID in TiDB’s system table mysql.tidb. tidb#59511
  • Modify TiCDC to query mysql.tidb and retrieve the downstream Cluster ID.
  • Compare the upstream and downstream Cluster IDs when a changefeed is created, updated, or resumed.
  • Add error handling and logging when the same Cluster ID is detected.
