Releases: m-lab/prometheus-support

Fixes missing metric alert

16 Jan 20:16
3e0fab9

This is a provisional release to address a production alert that has already been identified and fixed in staging. It addresses this PR: #611

It also removes the downloader cluster deployments from the Travis config.

Weekly release

14 Jan 17:58
932eba2
  • Deploys a new gcs-exporter to the prometheus-federation cluster.
  • Renames the data-processing cluster and makes the config changes needed for the rename.
  • Adds a new alert for missing "generic" (non-experiment specific) metrics upon which mlab-ns relies.
  • Properly prefixes mlab-ns-related alerts with MlabNS_.
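
For readers unfamiliar with how a "missing metric" alert is usually expressed, the following is a minimal sketch of a Prometheus alerting rule built on the absent() function. It is not the rule shipped in this release: the group name, alert name, metric name, job label, and duration are all hypothetical placeholders, and only the MlabNS_ prefix comes from the notes above.

```yaml
# Illustrative only: the names below are assumptions, not taken from the repository.
groups:
  - name: example-mlabns-alerts            # hypothetical group name
    rules:
      - alert: MlabNS_ExampleGenericMetricMissing   # hypothetical alert name
        # absent() returns a 1-valued series when no series matches the selector,
        # so the alert fires only while the metric is missing entirely.
        expr: absent(example_generic_metric{job="example-job"})
        for: 30m                            # assumed grace period
        labels:
          severity: ticket                  # assumed severity
        annotations:
          summary: "A generic metric that mlab-ns relies on is missing."
```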

Weekly release

16 Dec 16:58
547f459
Further reduce number of alerts with severity=page (#602)

* Further reduce number of alerts with severity=page.

* PlatformCluster_FederationScrapeJobFailing has severity=ticket now.

Weekly release

12 Dec 15:46
0864137
Rollback bqx version (#601)

* Rollback bqx version

* Fix project flag

Weekly release

02 Dec 18:20
7654e34

Points of interest in this release:

  • Removes virtually all legacy/PLC configurations.
  • Updated versions for almost all Docker images in the prometheus-federation cluster.
  • New alert for when the Prometheus VM's persistent disk gets too full.
  • A new tcpinfo dashboard.
  • A couple of small dashboard fixes/improvements.
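
As a rough illustration of what a disk-fullness alert can look like, here is a sketch that assumes the Prometheus VM is scraped by node_exporter, which exposes node_filesystem_avail_bytes and node_filesystem_size_bytes. The alert name, mount point, threshold, and duration are hypothetical; the rule actually added in this release may use different metrics and values.

```yaml
# Illustrative only: assumes node_exporter metrics; names and thresholds are hypothetical.
groups:
  - name: example-prometheus-vm-alerts
    rules:
      - alert: Example_PrometheusPersistentDiskTooFull
        # Fires when less than 10% of the filesystem is still available.
        expr: |
          node_filesystem_avail_bytes{mountpoint="/mnt/example"}
            / node_filesystem_size_bytes{mountpoint="/mnt/example"} < 0.10
        for: 1h                             # assumed grace period
        labels:
          severity: ticket                  # assumed severity
        annotations:
          summary: "The Prometheus persistent disk is more than 90% full."
```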

Update neubot targets and Grafana dashboards

20 Nov 17:01
b030c0a
Add pusher dashboard (#572)

* Add new alert dashboard

* Add reference to pusher dashboard

Use cert-manager, update GMX and alerts

12 Nov 08:11
035d672
v2.28.0

Remove unused bqx queries (#568)

Weekly release

30 Oct 17:09
8a4fdb4
  • An updated version of GMX: v1.1.0
  • Two small fixes/improvements to a couple of Grafana dashboards.

Weekly release

30 Sep 19:40
6d337f6
  • Updates the RolloutTooSlowOrStuck alerts.

Weekly release

25 Sep 20:10
081d805
  • Updates GMX to v1.0.1.
  • Adds a new alert for stuck DaemonSet rollouts.
  • RateLimiter alert now tolerates 1 error before alerting.
  • snmp-exporter now mounts its config from a ConfigMap, not a file copied into an emptyDir.
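
The ConfigMap change in the last item follows a common Kubernetes pattern, sketched below. This is not the manifest from the repository: the ConfigMap name, labels, and mount path are assumptions chosen for illustration, and the image tag is deliberately left unpinned.

```yaml
# Illustrative only: resource names, labels, and paths are hypothetical.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: snmp-exporter
spec:
  replicas: 1
  selector:
    matchLabels:
      run: snmp-exporter
  template:
    metadata:
      labels:
        run: snmp-exporter
    spec:
      containers:
        - name: snmp-exporter
          image: prom/snmp-exporter         # pin a specific tag in practice
          args:
            - --config.file=/etc/snmp_exporter/snmp.yml
          volumeMounts:
            - name: snmp-config
              mountPath: /etc/snmp_exporter # config is read from the mounted ConfigMap
              readOnly: true
      volumes:
        - name: snmp-config
          configMap:
            name: snmp-exporter-config      # hypothetical ConfigMap holding snmp.yml
```

Mounting the ConfigMap directly means the config is managed as a Kubernetes object, with no separate step needed to copy a file into an emptyDir volume before the exporter starts.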