Remove and de-duplicate tags with leading/trailing whitespace #4199
Comments
@WordPress/openverse-catalog I'd like to take a shot at this issue. Am I correct to assume this should use the batched update DAG? And if so, I think I'd like to try it in two steps, as suggested, basically doing something like this:
Is such a thing possible with the batched update DAG? Are there any potentially helpful examples of how we've used that recently I could work off of?
@sarayourfriend You're correct. It's possible to do it with the
If possible, it'd be best to combine both of those steps into a single batched update, that way we don't have to do two passes on the data! Might make for a tricky query, but then we only have to run it once 😄
I guess Thanks for the input, y'all.
Reopening for the pending execution of the
Solved in #4557.
Description
We have some records in our data with duplicate tags, where the duplicate differs from the original only by leading or trailing whitespace. Here's an example: https://api.openverse.engineering/v1/images/2d454032-0cc1-48a5-8f40-e9235f1a4f12/
This might need to be tackled in two steps, or at least an operation which covers both cases:
We will also want to check, similar to #1566, that any new tags added always have extra whitespace stripped.
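To make the two cases concrete, here is a minimal Python sketch of the cleanup applied to a single record's tags: strip the whitespace, then drop any tag that duplicates one already seen. It assumes each tag is a dict with a string `name` and an optional `provider` key, and that two tags count as duplicates when both values match; the `clean_tags` helper is hypothetical and is not the query that was actually run through the batched update DAG.

```python
def clean_tags(tags):
    """Strip whitespace from tag names and drop duplicates, keeping first-seen order.

    Assumption: each tag is a dict with a string "name" and an optional "provider".
    """
    seen = set()
    cleaned = []
    for tag in tags:
        name = tag["name"].strip()
        # Assumption: a tag is a duplicate if name and provider both match.
        key = (name, tag.get("provider"))
        if key in seen:
            continue
        seen.add(key)
        # Copy the tag so the input list is left untouched.
        cleaned.append({**tag, "name": name})
    return cleaned


tags = [
    {"name": "cat", "provider": "flickr"},
    {"name": " cat ", "provider": "flickr"},
    {"name": "dog\n", "provider": "flickr"},
]
print(clean_tags(tags))
# [{'name': 'cat', 'provider': 'flickr'}, {'name': 'dog', 'provider': 'flickr'}]
```

Stripping before the duplicate check is what lets both cases be handled in one pass: a tag like `" cat "` is normalized first and then recognized as a repeat of `"cat"`.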
Additional context
Related to #430