Use batched update to clean up empty JSON objects in tags fields #4091
Labels
🗄️ aspect: data
Concerns the data in our catalog and/or databases
🛠 goal: fix
Bug fix
🟨 priority: medium
Not blocking but should be addressed soon
🧱 stack: catalog
Related to the catalog and Airflow DAGs
Description
While investigating the existing Clarifai tags for #3948 (comment), I had some trouble with the Postgres query used because I assumed the tags field would either be NULL or an array. It turns out that some of our older records which haven't been touched since 2020 (30,376,519 to be exact) have empty objects in them instead (e.g. {}).
This can complicate some of the logic necessary for updating tags down the line, and it may even be causing issues with updating those tags now.
I'd like to use the batched update DAG to set all of these values to NULL.
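For illustration, here is a minimal sketch of the batched cleanup idea. It is not the batched update DAG itself: it uses an in-memory SQLite database as a stand-in for Postgres, and the table name image, the TEXT column type, the helper null_empty_tags, and the batch size are all assumptions made up for the example. Against a Postgres jsonb column the matching condition would presumably look like tags = '{}'::jsonb instead.

```python
import sqlite3

# Assumption: the empty object is stored as the literal text "{}" in this stand-in table.
EMPTY_OBJECT = "{}"


def null_empty_tags(conn, batch_size=2):
    """Set tags to NULL wherever it holds an empty object, one batch at a time.

    Returns the total number of rows updated.
    """
    total = 0
    while True:
        # Select the next batch of affected rows by primary key.
        ids = [
            row[0]
            for row in conn.execute(
                "SELECT id FROM image WHERE tags = ? ORDER BY id LIMIT ?",
                (EMPTY_OBJECT, batch_size),
            )
        ]
        if not ids:
            return total
        placeholders = ",".join("?" * len(ids))
        conn.execute(
            f"UPDATE image SET tags = NULL WHERE id IN ({placeholders})", ids
        )
        # Commit per batch so each transaction stays small.
        conn.commit()
        total += len(ids)


def demo():
    # Hypothetical table and rows, only to exercise the helper above.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE image (id INTEGER PRIMARY KEY, tags TEXT)")
    conn.executemany(
        "INSERT INTO image (id, tags) VALUES (?, ?)",
        [(1, "{}"), (2, None), (3, '[{"name": "cat"}]'), (4, "{}"), (5, "{}")],
    )
    updated = null_empty_tags(conn)
    remaining = conn.execute("SELECT id, tags FROM image ORDER BY id").fetchall()
    return updated, remaining
```

Rows that already hold NULL or a real array of tags are left untouched; only the rows matching the empty object are rewritten, and the work is split into small commits rather than one long-running update.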
Additional context
Related to #3948