Update raw_tags to avoid duplicates in the catalog #3926

Closed
krysal opened this issue Mar 14, 2024 · 0 comments · Fixed by #4014
Labels
💻 aspect: code Concerns the software code in the repository 🧰 goal: internal improvement Improvement that benefits maintainers, not users 🟧 priority: high Stalls work on the project or its dependents 🧱 stack: catalog Related to the catalog and Airflow DAGs

Comments


krysal commented Mar 14, 2024

Problem

We do not currently check for duplicate tags in the provider scripts at ingestion time. We assume providers do not send duplicates, but it is safer to enforce this on our side.

Description

Change the raw_tags field type to be a set instead of a list of strings.
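
A rough sketch of the idea, not the actual catalog code: the helper names `collect_raw_tags` and `serialize_raw_tags` are hypothetical, and the step that converts back to a sorted list is an assumption for cases where the tags later need to be written out as JSON (Python sets are not JSON serializable).

```python
import json


def collect_raw_tags(tags):
    """Hypothetical helper: gather provider tags into a set so repeats collapse."""
    return set(tags)


def serialize_raw_tags(raw_tags):
    """Hypothetical helper: turn the set into a sorted list for stable JSON output."""
    return sorted(raw_tags)


# Example: a provider response that repeats a tag.
provider_tags = ["cat", "dog", "cat"]
raw_tags = collect_raw_tags(provider_tags)
payload = json.dumps(serialize_raw_tags(raw_tags))
```

Sorting is only one way to get deterministic output; if the provider's original tag order matters, an order-preserving deduplication would be needed instead.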

Additional context

Arises from #1566.

@krysal krysal added 🟧 priority: high Stalls work on the project or its dependents 💻 aspect: code Concerns the software code in the repository 🧰 goal: internal improvement Improvement that benefits maintainers, not users 🧱 stack: catalog Related to the catalog and Airflow DAGs labels Mar 14, 2024
@openverse-bot openverse-bot moved this to 📋 Backlog in Openverse Backlog Mar 14, 2024
@krysal krysal self-assigned this Mar 14, 2024
@openverse-bot openverse-bot moved this from 📋 Backlog to 📅 To Do in Openverse Backlog Mar 14, 2024
@openverse-bot openverse-bot moved this from 📅 To Do to 🏗 In Progress in Openverse Backlog Mar 14, 2024
@krysal krysal mentioned this issue Mar 14, 2024
1 task
@openverse-bot openverse-bot moved this from 🏗 In Progress to 📋 Backlog in Openverse Backlog Apr 3, 2024
@openverse-bot openverse-bot moved this from 📋 Backlog to 🏗 In Progress in Openverse Backlog Apr 3, 2024
@openverse-bot openverse-bot moved this from 🏗 In Progress to ✅ Done in Openverse Backlog Apr 5, 2024