Improve aggregate reingestion reporting #4121

Open
stacimc opened this issue Apr 15, 2024 · 0 comments
Labels
💻 aspect: code Concerns the software code in the repository 🧰 goal: internal improvement Improvement that benefits maintainers, not users 🟩 priority: low Low priority and doesn't need to be rushed 🧱 stack: catalog Related to the catalog and Airflow DAGs

Current Situation

#4074 modified reingestion workflows to skip Slack error reporting during the pull_data tasks and instead report errors in aggregate in a task at the end of the DagRun. Because a reingestion workflow can have more than 100 ingestion days, this aggregate message could contain links to hundreds of log files in the worst case. Consequently, the message is configured to link only the first 5 error logs.

The Slack message itself also gives no context about what the errors were; it only links to the log file for each failed task.

Suggested Improvement

It would be great to include information about the errors that were detected in the Slack message. In particular, when more than one reingestion day fails, it would be very useful to know whether all the failed days encountered the same error, in which case it may not be necessary to investigate them individually.

We should weigh the benefits of presenting this information against the desire to keep the Slack notifications as brief as possible.
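
One possible shape for this, as a rough sketch only: group the failed days by their error and report one line per distinct error, with a count and a single example log link. Everything named here (`FailedDay`, `summarize_failures`, `max_groups`, the example URLs) is hypothetical and not taken from the catalog code; how the error text would actually be collected from the failed tasks is left open.

```python
from collections import defaultdict
from typing import NamedTuple


class FailedDay(NamedTuple):
    """Hypothetical record of one failed reingestion day (not an existing type)."""

    date: str     # ingestion date, e.g. "2024-04-01"
    error: str    # short error summary, e.g. the exception type and message
    log_url: str  # link to the failed task's log


def summarize_failures(failures: list[FailedDay], max_groups: int = 5) -> str:
    """Group failed days by error and build a brief plain-text summary."""
    groups: dict[str, list[FailedDay]] = defaultdict(list)
    for failure in failures:
        groups[failure.error].append(failure)

    # Most common errors first; ties keep first-seen order because sorted() is stable.
    ranked = sorted(groups.items(), key=lambda item: len(item[1]), reverse=True)

    lines = [f"{len(failures)} failed day(s), {len(ranked)} distinct error(s)"]
    for error, days in ranked[:max_groups]:
        # One example log per distinct error keeps the message short.
        example = days[0]
        lines.append(
            f"- {error}: {len(days)} day(s), e.g. {example.date} {example.log_url}"
        )
    if len(ranked) > max_groups:
        lines.append(f"...and {len(ranked) - max_groups} more distinct error(s)")
    return "\n".join(lines)
```

With this approach, a run where every failed day hit the same error would collapse to a two-line message, which speaks to the brevity concern, while a run with many distinct errors is still capped by `max_groups`.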

Additional context

Comment thread on the aggregate reporting PR which discussed this.

@stacimc stacimc added 🟩 priority: low Low priority and doesn't need to be rushed 💻 aspect: code Concerns the software code in the repository 🧰 goal: internal improvement Improvement that benefits maintainers, not users 🧱 stack: catalog Related to the catalog and Airflow DAGs labels Apr 15, 2024
@openverse-bot openverse-bot moved this to 📋 Backlog in Openverse Backlog Apr 15, 2024