MDEV-34924: gtid_slave_pos table never deleted on non-replica nodes (wsrep_gtid_mode = 1) #3718
Conversation
sql/wsrep_schema.cc (outdated)
```cpp
void *hton= NULL;
const bool in_transaction= (gtid->flags2 & Gtid_log_event::FL_TRANSACTIONAL);
const bool in_ddl= (gtid->flags2 & Gtid_log_event::FL_DDL);
Wsrep_schema_impl::wsrep_off wsrep_off(thd);
```
Turning wsrep off here causes the first statement to run with wsrep OFF. If the mysql.gtid_slave_pos table is an InnoDB table, this causes the transaction to be started with wsrep OFF, and since the wsrep status is synced only at the beginning of a transaction in InnoDB, the rest of the transaction runs with wsrep OFF too.
Maybe better to use `Wsrep_schema_impl::wsrep_ignore_table`.
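For readers unfamiliar with these guards, here is a standalone sketch of the distinction being made. The classes below are simplified stand-ins, not the actual MariaDB `Wsrep_schema_impl` types:

```cpp
#include <iostream>

struct THD_stub {              // stand-in for the server's THD
  bool wsrep_on= true;         // session-wide wsrep switch
  bool ignore_table= false;    // per-table replication filter
};

// Analogue of Wsrep_schema_impl::wsrep_off: disables wsrep for the whole
// session, so a transaction started while the guard is active keeps
// running with wsrep OFF until it ends.
class wsrep_off_guard {
  THD_stub *thd; bool saved;
public:
  wsrep_off_guard(THD_stub *t) : thd(t), saved(t->wsrep_on)
  { t->wsrep_on= false; }
  ~wsrep_off_guard() { thd->wsrep_on= saved; }
};

// Analogue of Wsrep_schema_impl::wsrep_ignore_table: leaves the session's
// wsrep state untouched and only marks the table writes to be skipped by
// replication.
class wsrep_ignore_table_guard {
  THD_stub *thd; bool saved;
public:
  wsrep_ignore_table_guard(THD_stub *t) : thd(t), saved(t->ignore_table)
  { t->ignore_table= true; }
  ~wsrep_ignore_table_guard() { thd->ignore_table= saved; }
};

int main() {
  THD_stub thd;
  {
    wsrep_ignore_table_guard guard(&thd);
    // wsrep stays ON, so a transaction begun here is still wsrep-aware;
    // only the marked table writes are excluded from replication.
    std::cout << "wsrep_on=" << thd.wsrep_on
              << " ignore_table=" << thd.ignore_table << "\n";
  }
  std::cout << "restored: wsrep_on=" << thd.wsrep_on
            << " ignore_table=" << thd.ignore_table << "\n";
}
```

The point of the comment above in miniature: a session-wide switch leaks into the whole InnoDB transaction, while a per-table filter does not.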
Fixed.
Force-pushed from d91b6e8 to c184de4.
Hi @janlindstrom!
In general I think the approach works. See my comments for some specific suggestions/questions.
Also, while your git commit message does a good job explaining what the patch does, it doesn't cover why/how this approach works (which I think it should).
```cpp
if ((error= trans_commit_stmt(thd)))
  goto out;

(void)trans_commit(thd);
```
I'm not really familiar with the Galera code - when in the transaction life-cycle is `store_gtid_event()`/`wsrep_apply_events()` called? It looks to me like this is before any of the transaction events have actually been applied. If so, and we then update & commit `mysql.gtid_slave_pos` here before actually applying the DDL, then if something happens while the DDL is running and we restart, `gtid_slave_pos` would be ahead of the actual database state.
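To make the concern concrete, here is a toy model (plain C++, not server code) of committing the position bookkeeping before applying the event:

```cpp
#include <iostream>

// Toy state: what mysql.gtid_slave_pos claims vs. what was really applied.
struct NodeState {
  long long recorded_seqno= 0;   // committed GTID bookkeeping
  long long applied_seqno= 0;    // what the schema/data actually reflects
};

// Commit the bookkeeping first, then apply; a "crash" in between leaves
// the recorded position ahead of the actual database state.
void apply_event(NodeState &n, long long seqno, bool crash_before_apply) {
  n.recorded_seqno= seqno;       // update & commit gtid_slave_pos
  if (crash_before_apply)
    return;                      // e.g. server dies while the DDL runs
  n.applied_seqno= seqno;        // DDL/transaction applied
}

int main() {
  NodeState n;
  apply_event(n, 42, /*crash_before_apply=*/true);
  std::cout << "recorded=" << n.recorded_seqno   // 42
            << " applied=" << n.applied_seqno    // 0: position is ahead
            << "\n";
}
```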
In the case of DDL it is problematic either way. If the GTID event is executed before the DDL, it will be committed anyway, because DDL does an implicit commit before it starts. If the GTID event is executed after the DDL, it is likewise not atomic with the DDL.
```cpp
@@ -133,6 +133,16 @@ class Wsrep_schema
  */
  int recover_sr_transactions(THD* orig_thd);

  /**
    Store GTID-event to mysql.gtid_slave_pos table.
```
I think it would be good to extend this comment to explain why it is needed.
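As an illustration, the extended comment might read along these lines; the wording below is only a suggestion distilled from the PR description, not the final text:

```cpp
/**
  Store GTID-event to mysql.gtid_slave_pos table.

  Needed because since MDEV-34924 the mysql.gtid_slave_pos table is
  intentionally not replicated by Galera: not all cluster nodes are
  replica nodes, so rows replicated to non-replica nodes were never
  deleted there. Instead, when a node receives a GTID event and
  wsrep_gtid_mode=1, each node stores the event into
  mysql.gtid_slave_pos locally via this function.
*/
```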
```
# - Verify that mysql.gtid_slave_pos has some rows in all Galera nodes
# - Verify that gtid_slave_pos, gtid_binlog_pos and gtid_current_pos are
#   same in all Galera nodes
# - Verify that we can shutdown and restart Galera replica (node #2)
```
In our previous email exchange, you noted an extended use case for this, where (adapting it to this test) `node_2` would go down, there would be an SST from `node1` -> `node_2`, and then `node_2` would need to restart at the correct slave position. I think it would be good to extend this (or one of your other) test cases to handle that.
See line 59: we intentionally force an SST (here it does not matter which Galera node is the donor), and at line 70 node_2 is restarted; because it has skip-slave-restart=0, the slave thread is started.
```
gtid_domain_id=11
gtid_strict_mode=1
wsrep-slave-threads=4
slave-parallel-threads=2
```
I think it would be good to set `gtid-cleanup-batch-size` differently on the MariaDB nodes, to add a layer of verification on the reason for the smaller `gtid_slave_pos` table size. I'd suggest making it really big on one node, so the cleanup never happens, and keeping it small on another node. Then your validations would be stricter (rather than just `< 1000`).
On the node with a really big `gtid-cleanup-batch-size`, you could validate that `SELECT COUNT(*) FROM mysql.gtid_slave_pos` equals `<domain_3_seqno> + <domain_4_seqno>` (note the `gtid-cleanup-batch-size` value should be much more than `<domain_3_seqno> + <domain_4_seqno>` (2006), as previously run tests can impact the cleanup time).
And on the node with a small `gtid-cleanup-batch-size`, you should be able to assert `SELECT COUNT(*) FROM mysql.gtid_slave_pos < @@gtid_cleanup_batch_size`.
Possible, but as this is already covered by the rpl tests I do not see a reason to do it here. Note that the current code uses the same function as rpl, so it should work as designed. The previous patch replicated writes directly to the InnoDB table mysql.gtid_pos_table and did not use any of the real gtid position updating functions.
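For context, a rough sketch of what "using the same function as rpl" can look like. The names `rpl_global_gtid_slave_state`, `next_sub_id()` and `record_gtid()` exist in MariaDB's replication code, but their exact signatures vary between versions, so treat this as an assumption-laden illustration rather than the patch's actual code:

```cpp
// Hedged sketch, not the actual patch: reuse the regular slave-position
// bookkeeping path instead of writing rows into the InnoDB table directly.
// Signatures are assumed from 10.x sources and may differ in this tree.
static int store_gtid_via_rpl(THD *thd, uint32 domain_id, uint32 server_id,
                              uint64 seq_no, bool in_transaction)
{
  rpl_gtid gtid= {domain_id, server_id, seq_no};
  uint64 sub_id= rpl_global_gtid_slave_state->next_sub_id(gtid.domain_id);
  void *hton= NULL;
  return rpl_global_gtid_slave_state->record_gtid(thd, &gtid, sub_id,
                                                  in_transaction,
                                                  false /* in_statement */,
                                                  &hton);
}
```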
@janlindstrom Thanks, the fix has been merged with the head revision: 133e26f
MDEV-34924: gtid_slave_pos table never deleted on non-replica nodes (wsrep_gtid_mode = 1)
Description
The problem was caused by MDEV-31413 commit 277968a, where the mysql.gtid_slave_pos table was replicated by Galera. However, as not all nodes in a Galera cluster are replica nodes, rows were never deleted from the table on non-replica nodes.
This fix corrects that so the mysql.gtid_slave_pos table is no longer replicated by Galera. Instead, when a Galera node receives a GTID event and wsrep_gtid_mode=1, the event is stored into the mysql.gtid_slave_pos table locally.
Added test case galera_2primary_replica for two async primaries replicating to a Galera cluster.
Added test case galera_circular_replication where an async primary replicates to a Galera cluster and one of the Galera cluster nodes is master to an async replica.
Modified test case galera_restart_replica to monitor gtid positions and rows in mysql.gtid_pos_table.
Release Notes
TODO: What should the release notes say about this change?
Include any changed system variables, status variables or behaviour. Optionally list any https://mariadb.com/kb/ pages that need changing.
How can this PR be tested?
TODO: modify the automated test suite to verify that the PR causes MariaDB to behave as intended.
Consult the documentation on "Writing good test cases".
If the changes are not amenable to automated testing, please explain why not and carefully describe how to test manually.
Basing the PR against the correct MariaDB version: this PR is based against the main branch.