VReplication: Improve replication plan builder and event application errors #16596

Merged
merged 6 commits into from
Aug 21, 2024

@mattlord mattlord commented Aug 14, 2024

Description

This is a small internal cleanup that adds the table name to the replication plan builder's error messages. That matters when a workflow operates on thousands of tables, where the failing table is otherwise hard to identify. This is a sibling PR to #16588.

It should have been changed in that PR as well, but here we are... 🙂

I also took the opportunity to make two improvements to the event application errors in vplayer, of which replication plan builder errors are one cause:

  1. Include the table name in the error message (it was only included in the log message)
  2. Add the stream position we were trying to process when encountering the error

To demonstrate:

./101_initial_cluster.sh; ./201_customer_tablets.sh

mysql commerce -e "create table t1 (id int not null, name varchar(100), street varchar(50), primary key(id))"

vtctldclient MoveTables --workflow commerce2customer --target-keyspace customer create --source-keyspace commerce --tables "t1"

mysql commerce -e "alter table t1 drop id, add primary key(name)"

mysql commerce -e "insert into t1 values ('Matt', 'Birch Hill Drive')"

❯ vtctldclient MoveTables --target-keyspace customer --workflow commerce2customer show --compact --include-logs=false | grep message
              "message": "error applying event for table t1 while processing position d77af336-5fc9-11ef-aed0-053911a91e99:1-38: failed to build replication plan for t1 table: primary key column &{id   int int true false} not found in table's select filter or the TableMap event within the GTID",

❯ mysqlbinlog -vvv --base64-output=DECODE-ROWS /opt/vtdataroot/vt_0000000101/bin-logs/vt-0000000101-bin.000001 --include-gtids=d77af336-5fc9-11ef-aed0-053911a91e99:38
...
SET @@SESSION.GTID_NEXT= 'd77af336-5fc9-11ef-aed0-053911a91e99:38'/*!*/;
# at 21837
#240821 10:30:21 server id 1138093001  end_log_pos 21919 CRC32 0x3a40f493 	Query	thread_id=44	exec_time=0	error_code=0
SET TIMESTAMP=1724250621/*!*/;
SET @@session.pseudo_thread_id=44/*!*/;
SET @@session.foreign_key_checks=1, @@session.sql_auto_is_null=0, @@session.unique_checks=1, @@session.autocommit=1/*!*/;
SET @@session.sql_mode=1168113696/*!*/;
SET @@session.auto_increment_increment=1, @@session.auto_increment_offset=1/*!*/;
/*!\C utf8mb4 *//*!*/;
SET @@session.character_set_client=255,@@session.collation_connection=255,@@session.collation_server=255/*!*/;
SET @@session.lc_time_names=0/*!*/;
SET @@session.collation_database=DEFAULT/*!*/;
/*!80011 SET @@session.default_collation_for_utf8mb4=255*//*!*/;
BEGIN
/*!*/;
# at 21919
#240821 10:30:21 server id 1138093001  end_log_pos 21981 CRC32 0xf2caa447 	Table_map: `vt_commerce`.`t1` mapped to number 178
# has_generated_invisible_primary_key=0
# at 21981
#240821 10:30:21 server id 1138093001  end_log_pos 22040 CRC32 0x0c94e906 	Write_rows: table id 178 flags: STMT_END_F
### INSERT INTO `vt_commerce`.`t1`
### SET
###   @1='Matt' /* VARSTRING(400) meta=400 nullable=0 is_null=0 */
###   @2='Birch Hill Drive' /* VARSTRING(200) meta=200 nullable=1 is_null=0 */
...

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Signed-off-by: Matt Lord <mattalord@gmail.com>

codecov bot commented Aug 14, 2024

Codecov Report

Attention: Patch coverage is 85.00000% with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.83%. Comparing base (cc68dd5) to head (07b7e3a).
Report is 14 commits behind head on main.

Files                                                     Patch %   Lines
.../vt/vttablet/tabletmanager/vreplication/vplayer.go    89.47%    2 Missing ⚠️
...blet/tabletmanager/vreplication/replicator_plan.go    0.00%     1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main   #16596      +/-   ##
==========================================
- Coverage   68.85%   68.83%   -0.03%     
==========================================
  Files        1557     1557              
  Lines      199891   199949      +58     
==========================================
- Hits       137644   137627      -17     
- Misses      62247    62322      +75     


@mattlord mattlord changed the title from "VReplication: Improve replication plan builder errors" to "VReplication: Improve replication plan builder and event application errors" Aug 14, 2024

vitess-bot bot commented Aug 14, 2024

Review Checklist

Hello reviewers! 👋 Please follow this checklist when reviewing this Pull Request.

General

  • Ensure that the Pull Request has a descriptive title.
  • Ensure there is a link to an issue (except for internal cleanup and flaky test fixes). New features should have an RFC that documents use cases and test cases.

Tests

  • Bug fixes should have at least one unit or end-to-end test. Enhancements and new features should have a sufficient number of tests.

Documentation

  • Apply the release notes (needs details) label if users need to know about this change.
  • New features should be documented.
  • There should be some code comments as to why things are implemented the way they are.
  • There should be a comment at the top of each new or modified test to explain what the test does.

New flags

  • Is this flag really necessary?
  • Flag names must be clear and intuitive, use dashes (-), and have a clear help text.

If a workflow is added or modified:

  • Each item in Jobs should be named in order to mark it as required.
  • If the workflow needs to be marked as required, the maintainer team must be notified.

Backward compatibility

  • Protobuf changes should be wire-compatible.
  • Changes to _vt tables and RPCs need to be backward compatible.
  • RPC changes should be compatible with vitess-operator
  • If a flag is removed, then it should also be removed from vitess-operator and arewefastyet, if used there.
  • vtctl command output order should be stable and awk-able.

@vitess-bot vitess-bot bot added the labels NeedsBackportReason (If backport labels have been applied to a PR, a justification is required), NeedsDescriptionUpdate (The description is not clear or comprehensive enough, and needs work), NeedsIssue (A linked issue is missing for this Pull Request), and NeedsWebsiteDocsUpdate (What it says) Aug 14, 2024
@mattlord mattlord removed the labels NeedsDescriptionUpdate (The description is not clear or comprehensive enough, and needs work), NeedsWebsiteDocsUpdate (What it says), NeedsIssue (A linked issue is missing for this Pull Request), and NeedsBackportReason (If backport labels have been applied to a PR, a justification is required) Aug 14, 2024
@mattlord mattlord marked this pull request as ready for review August 14, 2024 20:53
@mattlord mattlord force-pushed the replicator_plan_errors branch from 7b9a1ab to 07b7e3a Compare August 14, 2024 20:54
@mattlord mattlord requested a review from a team August 19, 2024 13:28
Comment on lines +566 to +567
log.Errorf("Error applying event%s%s: %s", tableLogMsg, gtidLogMsg, err.Error())
err = vterrors.Wrapf(err, "error applying event%s%s", tableLogMsg, gtidLogMsg)
Member

I am unsure, but shouldn't there be spaces between "event" and the two "%s"?

Contributor Author

No, those are optional portions, and each one starts with a space when it is present. If we hardcoded a space after "event", then in some cases we'd have trailing whitespace.

Member

Okay makes sense!
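
The spacing convention discussed in this thread can be sketched as follows. This is a hedged illustration, not the vplayer code: `eventErrorPrefix` is a hypothetical helper, the wording of the two optional portions is inferred from the error message in the PR description, and standard-library `fmt.Errorf` with `%w` stands in for `vterrors.Wrapf`.

```go
package main

import (
	"errors"
	"fmt"
)

// eventErrorPrefix is a hypothetical helper. Each optional portion carries
// its own leading space, so the format string needs no space after "event"
// and an empty portion leaves no stray whitespace behind.
func eventErrorPrefix(tableName, position string) string {
	var tableLogMsg, gtidLogMsg string
	if tableName != "" {
		// Wording assumed from the message shown in the PR description.
		tableLogMsg = " for table " + tableName
	}
	if position != "" {
		gtidLogMsg = " while processing position " + position
	}
	return fmt.Sprintf("error applying event%s%s", tableLogMsg, gtidLogMsg)
}

func main() {
	cause := errors.New("failed to build replication plan")

	// Both portions present.
	fmt.Println(fmt.Errorf("%s: %w", eventErrorPrefix("t1", "abc:1-38"), cause))
	// prints: error applying event for table t1 while processing position abc:1-38: failed to build replication plan

	// Neither portion known: no trailing space before the colon.
	fmt.Println(fmt.Errorf("%s: %w", eventErrorPrefix("", ""), cause))
	// prints: error applying event: failed to build replication plan
}
```

With a hardcoded space after "event", the second case would instead read "error applying event : ...".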

@deepthi deepthi merged commit 4206c2a into main Aug 21, 2024
221 checks passed
@deepthi deepthi deleted the replicator_plan_errors branch August 21, 2024 14:55
4 participants