Update raw_tags to avoid duplicates in the catalog
#3926
Labels
💻 aspect: code
Concerns the software code in the repository
🧰 goal: internal improvement
Improvement that benefits maintainers, not users
🟧 priority: high
Stalls work on the project or its dependents
🧱 stack: catalog
Related to the catalog and Airflow DAGs
Problem
The provider scripts do not currently check for duplicate tags at ingestion time. We assume providers do not send duplicates, but it is safer to enforce this on our side.
Description
Change the raw_tags field type to be a set instead of a list of strings.
Additional context
Arises from #1566.
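As a rough illustration of the change described above, the following is a minimal sketch rather than the catalog's actual code. The function name `normalize_raw_tags` is hypothetical, and the whitespace stripping and empty-string filtering are assumptions that go beyond what this issue asks for.

```python
from __future__ import annotations

from collections.abc import Iterable


def normalize_raw_tags(raw_tags: Iterable[str] | None) -> set[str]:
    """Collapse provider-supplied tags into a set so duplicates are dropped.

    Hypothetical helper for illustration only; not part of the catalog code.
    """
    if raw_tags is None:
        return set()
    # Assumption of this sketch: surrounding whitespace is stripped and
    # empty strings are discarded before the tags are deduplicated.
    return {tag.strip() for tag in raw_tags if tag and tag.strip()}


tags = normalize_raw_tags(["cat", "dog", "cat", " dog ", ""])
print(sorted(tags))  # ['cat', 'dog']
```

Note that a Python set is unordered and is not directly JSON-serializable, so any code that writes the tags out would likely need to convert back to a list (for example with `sorted(tags)`) at that point.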