Releases: m-lab/prometheus-support
Fix the hard-coded data source problem for the rate limit dashboard
Remove hard-coded project name from the dashboard JSON file
Create a new dashboard that calculates how often rate limits are triggered (count and percentage)
Add SQL to calculate how many times rate limits are triggered
Alerts for platform-cluster & monitoring for six-hour mlab-ns clients
Removed Prometheus 1.8 config; new alerts and dashboards
Changes include:
- Removed Prometheus 1.8 configuration
- Updated Annotation & Gardener dashboards
- Updated gcp-service-discovery version to v1.3.1
- New alerts:
  - Epoxy server is not online
  - Nodes that fail to boot successfully
  - Too many AppEngine versions
  - Too many inactive AppEngine instances
- Updates to k8s dashboards
Updated Grafana to v5.4.3
This closes m-lab/dev-tracker#94.
Update GMX to 0.1.2
Updated GMX's version number to deploy the new version, which includes @nkinkade's bug fixes.
Rerelease snmp scraping from internal GKE cluster
Merge pull request #394 from m-lab/sandbox-roberto: Change high disk usage threshold for NPAD to 9GB
Revert "Use snmp service running on gke cluster"
Merge pull request #385 from m-lab/sandbox-roberto: Revert "Use snmp service running on gke cluster"
snmp_exporter + kubeIP as k8s deployment
Changes include:
- Added snmp_exporter and kubeIP as Kubernetes deployments
- Added rebot on EB as a scraping target
- Several improvements to monitoring and alerts
- Disabled collection of gardener traceroute metrics
- Improvements to dashboards
Dashboard improvements, jsonlint and monitoring of SSH on port 22
Changes include:
- Added GMX and lame-duck status for each node in the Ops: Pod overview dashboard
- Added monitoring of SSH running on port 22
- Fixed CPU panel & several improvements to the Pipeline Annotation Service dashboard
- Added jsonlint to .travis.ci