Science Museum halts early despite skipping ingestion errors #4207
Labels
💻 aspect: code
Concerns the software code in the repository
🛠 goal: fix
Bug fix
🟧 priority: high
Stalls work on the project or its dependents
🧱 stack: catalog
Related to the catalog and Airflow DAGs
Description
Due to an upstream failure tracked in #4013, Science Museum occasionally fails. We are running the DAG in production with SKIPPED_INGESTION_ERRORS configured to skip 503s so that the DAG can complete. However, in the latest production run, this did not work as expected. When the batch with the 503 error is reached, the logs indicate that the batch was successfully skipped, but ingestion then halts immediately afterward instead of moving on to the next batch.
This is a concern because the provider stops ingesting after records dated to 1750, so it never reaches the vast majority of the records. It is high priority because we need a full ingestion run of this provider to fix data, including the URLs, that has been broken by recent upstream changes.
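For reference, here is a minimal sketch of the behavior we expect when a skipped error is hit. It is not the catalog's actual implementation: `BatchError`, `ingest`, `fetch`, and `skip_statuses` are hypothetical names used only for illustration. The point is that a skipped batch should be recorded and the loop should continue with the next batch rather than return.

```python
class BatchError(Exception):
    """Hypothetical error carrying the HTTP status of a failed batch request."""

    def __init__(self, status):
        super().__init__(f"batch request failed with status {status}")
        self.status = status


def ingest(batches, fetch, skip_statuses):
    """Ingest every batch, skipping (not halting on) configured error statuses."""
    ingested, skipped = [], []
    for batch in batches:
        try:
            records = fetch(batch)
        except BatchError as err:
            if err.status in skip_statuses:
                # Record the skipped batch and move on to the next one.
                skipped.append(batch)
                continue
            # Any error that is not configured to be skipped still fails the run.
            raise
        ingested.extend(records)
    return ingested, skipped


def fake_fetch(batch):
    """Stand-in for the upstream API: batch 2 always returns a 503."""
    if batch == 2:
        raise BatchError(503)
    return [f"record-{batch}"]


ingested, skipped = ingest([1, 2, 3], fake_fetch, skip_statuses={503})
```

In this sketch, batch 3 is still ingested after batch 2 is skipped, which is the behavior that was missing in the production run.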