MDEV-34924: gtid_slave_pos table never deleted on non-replica nodes (wsrep_gtid_mode = 1) #3718
Conversation
sql/wsrep_schema.cc (outdated)
```cpp
void *hton= NULL;
const bool in_transaction= (gtid->flags2 & Gtid_log_event::FL_TRANSACTIONAL);
const bool in_ddl= (gtid->flags2 & Gtid_log_event::FL_DDL);
Wsrep_schema_impl::wsrep_off wsrep_off(thd);
```
Turning wsrep off here causes the first statement to run with wsrep OFF. If the mysql.gtid_slave_pos table is an InnoDB table, this causes the transaction to be started with wsrep OFF, and since the wsrep status is synced only at the beginning of a transaction in InnoDB, the rest of the transaction runs with wsrep OFF too.
Maybe better to use `Wsrep_schema_impl::wsrep_ignore_table`.
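For readers unfamiliar with these guards, here is a standalone sketch of the distinction being made. The classes below are simplified stand-ins, not the actual MariaDB `Wsrep_schema_impl` types:

```cpp
#include <iostream>

struct THD_stub {              // stand-in for the server's THD
  bool wsrep_on= true;         // session-wide wsrep switch
  bool ignore_table= false;    // per-table replication filter
};

// Analogue of Wsrep_schema_impl::wsrep_off: disables wsrep for the whole
// session, so a transaction started while the guard is active keeps
// running with wsrep OFF until it ends.
class wsrep_off_guard {
  THD_stub *thd; bool saved;
public:
  wsrep_off_guard(THD_stub *t) : thd(t), saved(t->wsrep_on)
  { t->wsrep_on= false; }
  ~wsrep_off_guard() { thd->wsrep_on= saved; }
};

// Analogue of Wsrep_schema_impl::wsrep_ignore_table: leaves the session's
// wsrep state untouched and only marks the table writes to be skipped by
// replication.
class wsrep_ignore_table_guard {
  THD_stub *thd; bool saved;
public:
  wsrep_ignore_table_guard(THD_stub *t) : thd(t), saved(t->ignore_table)
  { t->ignore_table= true; }
  ~wsrep_ignore_table_guard() { thd->ignore_table= saved; }
};

int main() {
  THD_stub thd;
  {
    wsrep_ignore_table_guard guard(&thd);
    // wsrep stays ON, so a transaction begun here is still wsrep-aware;
    // only the marked table writes are excluded from replication.
    std::cout << "wsrep_on=" << thd.wsrep_on
              << " ignore_table=" << thd.ignore_table << "\n";
  }
  std::cout << "restored: wsrep_on=" << thd.wsrep_on
            << " ignore_table=" << thd.ignore_table << "\n";
}
```

The point of the comment above in miniature: a session-wide switch leaks into the whole InnoDB transaction, while a per-table filter does not.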
Fixed.
Force-pushed from d91b6e8 to c184de4.
Hi @janlindstrom!
In general I think the approach works. See my comments for some specific suggestions/questions.
Also, while your git commit message does a good job explaining what the patch does, it doesn't cover why/how this approach works (which I think it should).
```cpp
if ((error= trans_commit_stmt(thd)))
  goto out;

(void)trans_commit(thd);
```
I'm not really familiar with the Galera code - when in the transaction life-cycle is `store_gtid_event()`/`wsrep_apply_events()` called? It looks to me like this is before any of the transaction events have actually been applied. If so, and we then update & commit `mysql.gtid_slave_pos` here before actually applying the DDL, then if something happens while the DDL is running and we restart, `gtid_slave_pos` would be ahead of the actual database state.
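To make the concern concrete, here is a toy model (plain C++, not server code) of committing the position bookkeeping before applying the event:

```cpp
#include <iostream>

// Toy state: what mysql.gtid_slave_pos claims vs. what was really applied.
struct NodeState {
  long long recorded_seqno= 0;   // committed GTID bookkeeping
  long long applied_seqno= 0;    // what the schema/data actually reflects
};

// Commit the bookkeeping first, then apply; a "crash" in between leaves
// the recorded position ahead of the actual database state.
void apply_event(NodeState &n, long long seqno, bool crash_before_apply) {
  n.recorded_seqno= seqno;       // update & commit gtid_slave_pos
  if (crash_before_apply)
    return;                      // e.g. server dies while the DDL runs
  n.applied_seqno= seqno;        // DDL/transaction applied
}

int main() {
  NodeState n;
  apply_event(n, 42, /*crash_before_apply=*/true);
  std::cout << "recorded=" << n.recorded_seqno   // 42
            << " applied=" << n.applied_seqno    // 0: position is ahead
            << "\n";
}
```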
In the case of DDL it is problematic either way. If the GTID event is executed before the DDL, it will be committed anyway, because DDL does an implicit commit before it starts. If the GTID event is executed after the DDL, it is likewise not atomic with the DDL.
```cpp
@@ -133,6 +133,16 @@ class Wsrep_schema
  */
  int recover_sr_transactions(THD* orig_thd);

  /**
    Store GTID-event to mysql.gtid_slave_pos table.
```
I think it would be good to extend this comment to explain why it is needed.
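As an illustration, the extended comment might read along these lines; the wording below is only a suggestion distilled from the PR description, not the final text:

```cpp
/**
  Store GTID-event to mysql.gtid_slave_pos table.

  Needed because since MDEV-34924 the mysql.gtid_slave_pos table is
  intentionally not replicated by Galera: not all cluster nodes are
  replica nodes, so rows replicated to non-replica nodes were never
  deleted there. Instead, when a node receives a GTID event and
  wsrep_gtid_mode=1, each node stores the event into
  mysql.gtid_slave_pos locally via this function.
*/
```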
```
# - Verify that mysql.gtid_slave_pos has some rows in all Galera nodes
# - Verify that gtid_slave_pos, gtid_binlog_pos and gtid_current_pos are
#   same in all Galera nodes
# - Verify that we can shutdown and restart Galera replica (node #2)
```
In our previous email exchange, you noted an extended use case for this, where (adapting it to this test) `node_2` would go down, there would be an SST from `node1` -> `node_2`, and then `node_2` would need to restart at the correct slave position. I think it would be good to extend this (or one of your other) test cases to handle that.
See line 59: we intentionally force an SST (here it does not matter which Galera node is the donor), and at line 70 node_2 is restarted; because it has skip-slave-restart=0, the slave thread is started.
```
gtid_domain_id=11
gtid_strict_mode=1
wsrep-slave-threads=4
slave-parallel-threads=2
```
I think it would be good to set `gtid-cleanup-batch-size` differently on the MariaDB nodes, to add a layer of verification on the reason for the smaller `gtid_slave_pos` table size. I'd suggest making it really big on one node, so the cleanup never happens, and keeping it small on another node. Then your validations would be stricter (rather than just `< 1000`).
On the node with a really big `gtid-cleanup-batch-size`, you could validate that `SELECT COUNT(*) FROM mysql.gtid_slave_pos` equals `<domain_3_seqno> + <domain_4_seqno>` (note the `gtid-cleanup-batch-size` value should be much more than `<domain_3_seqno> + <domain_4_seqno>` (2006), as previously run tests can impact the cleanup time).
And on the node with a small `gtid-cleanup-batch-size`, you should be able to assert `SELECT COUNT(*) FROM mysql.gtid_slave_pos < @@gtid_cleanup_batch_size`.
Possible, but as this is already covered by the rpl tests I do not see a reason to do it here. Note that the current code uses the same function as rpl, so it should work as designed. The previous patch replicated writes directly to the InnoDB table mysql.gtid_pos_table and did not use any of the real gtid position updating functions.
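For context, a rough sketch of what "using the same function as rpl" can look like. The names `rpl_global_gtid_slave_state`, `next_sub_id()` and `record_gtid()` exist in MariaDB's replication code, but their exact signatures vary between versions, so treat this as an assumption-laden illustration rather than the patch's actual code:

```cpp
// Hedged sketch, not the actual patch: reuse the regular slave-position
// bookkeeping path instead of writing rows into the InnoDB table directly.
// Signatures are assumed from 10.x sources and may differ in this tree.
static int store_gtid_via_rpl(THD *thd, uint32 domain_id, uint32 server_id,
                              uint64 seq_no, bool in_transaction)
{
  rpl_gtid gtid= {domain_id, server_id, seq_no};
  uint64 sub_id= rpl_global_gtid_slave_state->next_sub_id(gtid.domain_id);
  void *hton= NULL;
  return rpl_global_gtid_slave_state->record_gtid(thd, &gtid, sub_id,
                                                  in_transaction,
                                                  false /* in_statement */,
                                                  &hton);
}
```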
@janlindstrom Thanks, the fix has been merged with the head revision: 133e26f
MDEV-34924: gtid_slave_pos table never deleted on non-replica nodes (wsrep_gtid_mode = 1)
Description
The problem was caused by MDEV-31413 commit 277968a, where the mysql.gtid_slave_pos table was replicated by Galera. However, as not all nodes in a Galera cluster are replica nodes, rows were never deleted from the table on non-replica nodes.
This fix corrects that so the mysql.gtid_slave_pos table is no longer replicated by Galera. Instead, when a Galera node receives a GTID event and wsrep_gtid_mode=1, the event is stored into the mysql.gtid_slave_pos table locally.
Added test case galera_2primary_replica for two async primaries replicating to a Galera cluster.
Added test case galera_circular_replication where an async primary replicates to a Galera cluster and one of the Galera cluster nodes is master to an async replica.
Modified test case galera_restart_replica to monitor gtid positions and rows in mysql.gtid_pos_table.
Release Notes
TODO: What should the release notes say about this change?
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.
How can this PR be tested?
TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
Consult the documentation on "Writing good test cases".
If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.
Basing the PR against the correct MariaDB version: this PR is based against the main branch.