Releases: m-lab/prometheus-support
Fixes missing metric alert
This is a provisional release to address a production alert that has already been identified and fixed in staging. It addresses this PR: #611
It also removes the downloader cluster deployments from the Travis config.
Weekly release
- Deploys a new gcs-exporter to the prometheus-federation cluster.
- Renames the data-processing cluster, along with the config changes needed to facilitate the rename.
- Adds a new alert for missing "generic" (non-experiment specific) metrics upon which mlab-ns relies.
- Properly prefixes mlab-ns-related alerts with `MlabNS_`.
Weekly release
Further reduce number of alerts with severity=page (#602)
- Further reduce number of alerts with severity=page.
- PlatformCluster_FederationScrapeJobFailing has severity=ticket now.
Weekly release
Rollback bqx version (#601)
- Rollback bqx version
- Fix project flag
Weekly release
Points of interest in this release:
- Removes virtually all legacy/PLC configurations.
- Updated versions for nearly all Docker images in the prometheus-federation cluster.
- New alert for when the Prometheus VM's persistent disk gets too full.
- A new tcpinfo dashboard.
- A couple of small dashboard fixes/improvements.
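For readers unfamiliar with disk-usage alerts, a rule of the kind mentioned in this release might look roughly like the sketch below. It is not the rule shipped in this repository: the alert name, mount point, threshold, `for` duration, and severity are all assumptions chosen for illustration. Only the `node_filesystem_avail_bytes` and `node_filesystem_size_bytes` metrics (exposed by node_exporter) are standard.

```yaml
groups:
  - name: example-disk-alerts            # hypothetical group name
    rules:
      - alert: Example_PrometheusDiskTooFull   # hypothetical alert name
        # Fires when less than 10% of the filesystem is still available.
        # The mountpoint label value is an assumption, not taken from the release.
        expr: |
          node_filesystem_avail_bytes{mountpoint="/mnt/prometheus"}
            / node_filesystem_size_bytes{mountpoint="/mnt/prometheus"} < 0.10
        for: 10m                          # assumed duration, to avoid flapping
        labels:
          severity: ticket                # assumed severity
        annotations:
          summary: Prometheus persistent disk is more than 90% full.
```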
Update neubot targets and Grafana dashboards
Add pusher dashboard (#572)
- Add new alert dashboard
- Add reference to pusher dashboard
Use cert-manager, update GMX and alerts
v2.28.0 Remove unused bqx queries (#568)
Weekly release
- An updated version of GMX: v1.1.0
- Two small fixes/improvements to a couple of Grafana dashboards.
Weekly release
- Updates the `RolloutTooSlowOrStuck` alerts.
Weekly release
- Updates GMX to v1.0.1.
- Adds new alert for stuck DaemonSet rollouts.
- RateLimiter alert now tolerates 1 error before alerting.
- snmp-exporter now mounts its config from a ConfigMap, not a file copied into an `emptyDir`.
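As a rough illustration of the snmp-exporter change, the following sketch shows the general Kubernetes pattern of mounting a config file directly from a ConfigMap. It is not copied from this repository: the resource names, labels, mount path, and the choice of a Deployment are assumptions, and the image tag is deliberately left out.

```yaml
# Hypothetical ConfigMap holding the exporter configuration.
apiVersion: v1
kind: ConfigMap
metadata:
  name: snmp-exporter-config        # assumed name
data:
  snmp.yml: |
    # The generated snmp_exporter configuration would go here.
---
apiVersion: apps/v1
kind: Deployment                    # assumed workload type
metadata:
  name: snmp-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      run: snmp-exporter
  template:
    metadata:
      labels:
        run: snmp-exporter
    spec:
      containers:
        - name: snmp-exporter
          image: prom/snmp-exporter   # tag omitted on purpose
          args:
            # Point the exporter at the file projected from the ConfigMap.
            - --config.file=/etc/snmp_exporter/snmp.yml
          volumeMounts:
            - name: config
              mountPath: /etc/snmp_exporter   # assumed path
              readOnly: true
      volumes:
        # The ConfigMap is mounted directly, so no emptyDir or copy step is needed.
        - name: config
          configMap:
            name: snmp-exporter-config
```

With this pattern the config is managed as a cluster object, so updating it does not require rebuilding an image or running an init step that copies a file into the pod.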