Airflow DAG to monitor Elasticsearch cluster health #3747
Labels
💻 aspect: code
Concerns the software code in the repository
🌟 goal: addition
Addition of new feature
🟧 priority: high
Stalls work on the project or its dependents
🧱 stack: catalog
Related to the catalog and Airflow DAGs
Problem
We do not have a sufficiently flexible monitoring tool that can read Elasticsearch cluster health information and send an alert when appropriate.
Description
As a stop-gap until we have a better long-term solution, @AetherUnbound proposed an Airflow DAG that reads the cluster health endpoint and reports to the alerts channel when the cluster health is not green or when the number of nodes does not match the expected count (6).
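A minimal sketch of the check such a DAG task might run, assuming the task has already fetched the JSON body of Elasticsearch's `GET /_cluster/health` response, which includes `status` and `number_of_nodes` fields. The function name `find_health_problems` and the constant `EXPECTED_NODE_COUNT` are hypothetical, chosen for illustration only; fetching the response and posting to the alerts channel are left out.

```python
from typing import Any, Dict, List

# Expected node count taken from the issue description; the constant name is hypothetical.
EXPECTED_NODE_COUNT = 6


def find_health_problems(
    health: Dict[str, Any], expected_nodes: int = EXPECTED_NODE_COUNT
) -> List[str]:
    """Return one human-readable message per problem found in a
    `_cluster/health` response body. An empty list means no alert is needed."""
    problems: List[str] = []

    # Anything other than "green" (including a missing field) is reported.
    status = health.get("status")
    if status != "green":
        problems.append(f"Cluster status is {status!r}, expected 'green'")

    # A mismatch in either direction (too few or too many nodes) is reported.
    node_count = health.get("number_of_nodes")
    if node_count != expected_nodes:
        problems.append(
            f"Cluster has {node_count} nodes, expected {expected_nodes}"
        )

    return problems
```

Keeping the comparison in a pure function like this would make the alert conditions easy to unit test separately from the DAG itself; the task would send an alert only when the returned list is non-empty.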
Alternatives
The only immediately available alternative is to invest a significant amount of time in enabling Kibana's monitoring integrations. However, that would require changing the way we deploy Elasticsearch and Kibana so significantly that we simply do not have the time. It is also difficult to justify that investment solely for monitoring Elasticsearch when Grafana Cloud would require less time overall, is more flexible, and can provide monitoring for all our infrastructure as well as the data visualisation that Kibana is good at.