@tmacro tmacro commented Jun 6, 2024

This PR adds support for limiting a reindex to particular account(s).
To keep the CLI consistent, rather than having --bucket and --account behave differently, I've pulled in some of the requested features from S3C-8077 (support for multiple buckets).
I also added a --dry-run flag that skips Redis updates, to help both dev and field use.

Modifies:
--bucket: Can now be passed multiple times to reindex multiple buckets

Adds:
--account: Limit reindex to an account canonical ID. Can be passed multiple times.
--account-file: Read canonical IDs from a file, one per line.
--bucket-file: Read bucket names from a file, one per line.
--dry-run: Skip updating Redis.
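
As an illustration of the "one per line" file format used by --account-file and --bucket-file, here is a minimal sketch of how such a file could be read. The helper names `read_names` and `write_example_file` are hypothetical and do not come from the PR; skipping blank lines and trimming whitespace are assumptions about the desired behavior.

```python
import os
import tempfile


def read_names(path):
    # Hypothetical helper: returns one name per line. Ignoring blank lines
    # and surrounding whitespace is an assumption, not confirmed by the PR.
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def write_example_file(lines):
    # Creates a throwaway file so the sketch can be run as-is.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return path
```

The resulting list could then be treated exactly like the values collected from repeated --bucket or --account flags.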

bert-e commented Jun 6, 2024

Hello tmacro,

My role is to assist you with the merge of this
pull request. Please type @bert-e help to get information
on this process, or consult the user documentation.

Available options
name description privileged authored
/after_pull_request Wait for the given pull request id to be merged before continuing with the current one.
/bypass_author_approval Bypass the pull request author's approval
/bypass_build_status Bypass the build and test status
/bypass_commit_size Bypass the check on the size of the changeset TBA
/bypass_incompatible_branch Bypass the check on the source branch prefix
/bypass_jira_check Bypass the Jira issue check
/bypass_peer_approval Bypass the pull request peers' approval
/bypass_leader_approval Bypass the pull request leaders' approval
/approve Instruct Bert-E that the author has approved the pull request. ✍️
/create_pull_requests Allow the creation of integration pull requests.
/create_integration_branches Allow the creation of integration branches.
/no_octopus Prevent Wall-E from doing any octopus merge and use multiple consecutive merge instead
/unanimity Change review acceptance criteria from one reviewer at least to all reviewers
/wait Instruct Bert-E not to run until further notice.
Available commands
name description privileged
/help Print Bert-E's manual in the pull request.
/status Print Bert-E's current status in the pull request TBA
/clear Remove all comments from Bert-E from the history TBA
/retry Re-start a fresh build TBA
/build Re-start a fresh build TBA
/force_reset Delete integration branches & pull requests, and restart merge process from the beginning.
/reset Try to remove integration branches unless there are commits on them which do not appear on the source branch.

Status report is not available.

@scality scality deleted a comment from bert-e Jun 6, 2024
bert-e commented Jun 6, 2024

Request integration branches

Waiting for integration branch creation to be requested by the user.

To request integration branches, please comment on this pull request with the following command:

/create_integration_branches

Alternatively, the /approve and /create_pull_requests commands will automatically
create the integration branches.

tmacro commented Jun 6, 2024

/create_integration_branches

bert-e commented Jun 6, 2024

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/6.4
  • development/7.10
  • development/7.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: create_integration_branches

bert-e commented Jun 6, 2024

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following options are set: create_integration_branches

@jonathan-gramain jonathan-gramain left a comment

LGTM with some minor suggestions

Comment on lines 449 to 500
_log.warning(
    "DryRun: resource buckets [%s] will be not updated with obj_count %i and total_size %i" % (
        bucket, report['obj_count'], report['total_size']

I would suggest _log.info: since this is a dry run, the message is expected by the user.

Also, the wording is somewhat odd or sounds negative and could be improved, though I don't have a good substitute in mind at the moment. The message should probably just convey the values that would be updated (maybe simply rewording it to "would be updated" would work).

Just another proposition:

DryRun: obj_count %i and total_size %i was calculated for resource bucket [%s]. The bucket has not been updated.
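
To make the two suggestions in this thread concrete (log at info level, and word the message around what would be updated), here is a minimal sketch. The function names `format_dry_run` and `report_dry_run` and the logger name are hypothetical, and the wording is only one of the options floated above, not the text that was merged.

```python
import logging

_log = logging.getLogger("reindex-sketch")  # hypothetical logger name


def format_dry_run(kind, name, report):
    # Builds the message separately so the wording is easy to check.
    return "DryRun: %s [%s] would be updated with obj_count %i and total_size %i" % (
        kind, name, report["obj_count"], report["total_size"]
    )


def report_dry_run(kind, name, report):
    # A dry run is requested explicitly, so info level is used rather
    # than warning, as suggested in the review.
    _log.info(format_dry_run(kind, name, report))
```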

if options.dry_run:
    for userid, report in account_reports.items():
        _log.warning(
            "DryRun: resource account [%s] will be not updated with obj_count %i and total_size %i" % (

Same comment here about the wording and _log.info.

def existing_file(path):
    path = Path(path).resolve()
    if not path.exists():
        raise argparse.ArgumentTypeError("File does not exist")

It would be nice to show the path to the file in the error (it may not be obvious to the user which file doesn't exist).
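
A minimal sketch of what this suggestion could look like. Only the first lines of `existing_file` are visible in the diff, so returning the resolved path and the exact error wording are assumptions.

```python
import argparse
from pathlib import Path


def existing_file(path):
    # argparse "type" callable: validates that the file exists.
    resolved = Path(path).resolve()
    if not resolved.exists():
        # Including the path tells the user which file is missing.
        raise argparse.ArgumentTypeError("File does not exist: %s" % resolved)
    # Assumption: the validated path is returned for later use.
    return resolved
```

When used as `type=existing_file`, argparse reports the message of the `ArgumentTypeError` alongside the offending option, so the path appears directly in the usage error.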

# Break on the first matching bucket if a name is given
break
if names:
    seen_buckets.update(b.name for b in buckets)

I would rather update seen_buckets with the name of the bucket just processed, right after buckets.append(bucket).

Comment on lines 234 to 236
# Break if we have seen all the buckets we are looking for
if all(b in seen_buckets for b in names):
    break

Minor suggestion: if we know that names has no duplicate bucket names (or we ensure it doesn't), it should be enough to check that the size of the seen_buckets set equals len(names).

@tmacro (author) replied:

While fiddling with this I realized that with the addition of get_bucket_md the names param isn't used anywhere. I've simplified the function to just remove it.
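
For reference, the two checks discussed in this thread can be compared with a small sketch. The function names are hypothetical. The size-based variant is only equivalent under two assumptions: `names` contains no duplicates, and `seen_buckets` only ever receives names that are in `names`.

```python
def all_seen_by_membership(names, seen_buckets):
    # The check from the diff: every requested name has been seen.
    return all(name in seen_buckets for name in names)


def all_seen_by_size(names, seen_buckets):
    # The suggested shortcut. Correct only if names has no duplicates and
    # seen_buckets never contains a name outside of names (assumptions).
    return len(seen_buckets) == len(names)
```

With duplicates in `names`, the size comparison can never succeed, which is why the suggestion mentions ensuring the list is de-duplicated first.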

Comment on lines 521 to 515
if not options.bucket and not options.account:
    stale_buckets = recorded_buckets.difference(observed_buckets)
elif options.bucket:
    stale_buckets = { b for b in options.bucket if b not in observed_buckets }
elif options.account:
    _log.warning('Stale buckets will not be cleared when using the --account or --account-file flags')

This could be reorganized slightly for simplicity:

if options.bucket:
    ...
elif options.account:
    ...
else:
    # neither bucket nor account
    ...
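
Filled in with the branch bodies from the diff, the reorganized structure could be sketched as follows. The function `find_stale_buckets` and its parameters are hypothetical, and returning an empty set in the account case is an assumption, since the diff only shows the warning there.

```python
import logging

_log = logging.getLogger("reindex-sketch")  # hypothetical logger name


def find_stale_buckets(bucket_args, account_args, recorded_buckets, observed_buckets):
    if bucket_args:
        # Only the requested buckets that were not observed are stale.
        return {b for b in bucket_args if b not in observed_buckets}
    elif account_args:
        _log.warning(
            "Stale buckets will not be cleared when using the --account or --account-file flags"
        )
        # Assumption: nothing is treated as stale in this case.
        return set()
    else:
        # Neither bucket nor account: compare everything recorded.
        return recorded_buckets.difference(observed_buckets)
```

The branch order keeps --bucket ahead of --account, matching the precedence of the original if/elif chain.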

if options.account:
    for account in options.account:
        if account in failed_accounts:
            _log.error("No metrics updated for %s, one or more buckets failed" % account)

Suggested change
- _log.error("No metrics updated for %s, one or more buckets failed" % account)
+ _log.error("No metrics updated for account %s, one or more buckets failed" % account)

return parser.parse_args()
group = parser.add_mutually_exclusive_group()
group.add_argument("-b", "--bucket", default=[], help="bucket name", action="append", type=nonempty_string('bucket'))
group.add_argument("--bucket-file", default=None, help="file containing bucket names", type=existing_file)

Suggested change
- group.add_argument("--bucket-file", default=None, help="file containing bucket names", type=existing_file)
+ group.add_argument("--bucket-file", default=None, help="file containing bucket names, one bucket name per line", type=existing_file)

parser.add_argument("--dry-run", action="store_true", help="Do not update redis")
group = parser.add_mutually_exclusive_group()
group.add_argument("-a", "--account", default=[], help="account canonical ID (all account buckets will be processed)", action="append", type=nonempty_string('account'))
group.add_argument("--account-file", default=None, help="file containing account canonical IDs", type=existing_file)

Same comment here: give a hint about the file format.
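
Putting the flags from this diff together with the requested help hints, a self-contained sketch of the parser could look like this. The `build_parser` function and the `prog` name are hypothetical, the `type=` validators (`nonempty_string`, `existing_file`) are left out to keep the example standalone, and the account-file help wording is only one possible phrasing.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="reindex-sketch")
    parser.add_argument("--dry-run", action="store_true", help="Do not update redis")

    # --bucket and --bucket-file cannot be combined.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-b", "--bucket", default=[], action="append",
                       help="bucket name")
    group.add_argument("--bucket-file", default=None,
                       help="file containing bucket names, one bucket name per line")

    # --account and --account-file cannot be combined.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-a", "--account", default=[], action="append",
                       help="account canonical ID (all account buckets will be processed)")
    group.add_argument("--account-file", default=None,
                       help="file containing account canonical IDs, one canonical ID per line")
    return parser
```

Because the two groups are independent, a bucket option and an account option can still be passed in the same invocation; only the flag and its file counterpart exclude each other.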

@tmacro tmacro force-pushed the improvement/UTAPI-103/support_reindex_by_account branch from c14f39e to 69b94c5 Compare June 12, 2024 18:27
bert-e commented Jun 12, 2024

History mismatch

Merge commit #f87a65065ad4e7ad325a556ff2042c39e9794ad8 on the integration branch
w/8.1/improvement/UTAPI-103/support_reindex_by_account is merging a branch which is neither the current
branch improvement/UTAPI-103/support_reindex_by_account nor the development branch
development/8.1.

It is likely due to a rebase of the branch improvement/UTAPI-103/support_reindex_by_account and the
merge is not possible until all related w/* branches are deleted or updated.

Please use the reset command to have me reinitialize these branches.

The following options are set: create_integration_branches

tmacro commented Jun 12, 2024

/reset

bert-e commented Jun 12, 2024

Reset complete

I have successfully deleted this pull request's integration branches.

The following options are set: create_integration_branches

bert-e commented Jun 12, 2024

Integration data created

I have created the integration data for the additional destination branches.

The following branches will NOT be impacted:

  • development/6.4
  • development/7.10
  • development/7.4

You can set option create_pull_requests if you need me to create
integration pull requests in addition to integration branches, with:

@bert-e create_pull_requests

The following options are set: create_integration_branches

bert-e commented Jun 12, 2024

Waiting for approval

The following approvals are needed before I can proceed with the merge:

  • the author

  • 2 peers

The following options are set: create_integration_branches

tmacro commented Jun 12, 2024

/approve

bert-e commented Jun 12, 2024

I have successfully merged the changeset of this pull request
into targeted development branches:

  • ✔️ development/7.70

  • ✔️ development/8.1

The following branches have NOT changed:

  • development/6.4
  • development/7.10
  • development/7.4

Please check the status of the associated issue UTAPI-103.

Goodbye tmacro.

The following options are set: approve, create_integration_branches

@bert-e bert-e merged commit 69b94c5 into development/7.70 Jun 12, 2024
@bert-e bert-e deleted the improvement/UTAPI-103/support_reindex_by_account branch June 12, 2024 18:32