bombastictranz/monitoring-dashboard-samples #14

Merged
merged 35 commits into from
Mar 14, 2024
35 commits
b4c6a03
updated readme, metadata, and prometheus metadata to reflect addition…
algchoo Feb 27, 2024
faa9c1e
updated dashboard with new layout and kpi metrics
algchoo Feb 27, 2024
dfeb3b1
updated screenshot
algchoo Feb 29, 2024
e2457c0
Added reservation utilization alerts to google-gce.
ryannk Feb 29, 2024
e3d64fe
Merge branch 'GoogleCloudPlatform:master' into reservation_alerts
ryannk Mar 7, 2024
13a5c79
Removed malformed filename in metadata.yaml.
ryannk Mar 7, 2024
e72b267
updated prometheus metadata, readme, and metadata with new metrics
algchoo Mar 7, 2024
4517f81
updated jetty prometheus dashboard json
algchoo Mar 7, 2024
90d67e0
updated jetty prometheus dashboard screenshot
algchoo Mar 7, 2024
2104cba
Add SAP HANA and NetWeaver Availability Monitoring dashboards with sc…
dmivor Mar 7, 2024
d8527f8
Merge branch 'GoogleCloudPlatform:master' into master
dmivor Mar 7, 2024
c528857
updated the default metrics, docs, and metadata
algchoo Feb 23, 2024
14ea8c6
updated dashboard screenshot
algchoo Feb 23, 2024
b23043d
updated dashboard json
algchoo Feb 23, 2024
cd59acb
opted for percentile metrics, updated header colors
algchoo Feb 23, 2024
d603150
updated the default metrics, docs, and metadata
algchoo Feb 23, 2024
b3f60f5
updated dashboard screenshot
algchoo Feb 23, 2024
0414168
removed computeId from dashboard json
algchoo Mar 7, 2024
d9cd665
updated readme, metadata, and prometheus metadata to reflect addition…
algchoo Feb 26, 2024
0d6a7c5
updated dashboard json to reflect new layout and metrics
algchoo Feb 26, 2024
85e0035
updated screenshot
algchoo Feb 29, 2024
67f1746
updated job filters to use regex
algchoo Mar 8, 2024
e2b2b46
scorecards are summed and bucket queries have aggregation
algchoo Mar 8, 2024
91a405e
updated histogram chart titles
algchoo Mar 8, 2024
4f040e4
fix: import.sh replace grep -P with GNU param expansion for MacOS sup…
bwplotka Mar 12, 2024
0a040ab
removed computedId fields
algchoo Mar 12, 2024
002375f
Merge pull request #714 from observIQ/dashboard/consul-kpi-layout-update
EvanSimpson Mar 12, 2024
4abeda2
Update HBase Prometheus dashboard KPIs and layout (#727)
mrsillydog Mar 13, 2024
a1e37f4
Merge pull request #750 from bwplotka/fix-grep
stevezease Mar 13, 2024
596b2e3
Change documentation URL from Planning to All Guides
dmivor Mar 13, 2024
323c22e
Merge pull request #746 from dmivor/master
varun-c Mar 13, 2024
0ff653e
Merge pull request #728 from ryannk/reservation_alerts
cocosheng Mar 13, 2024
97d308c
Merge pull request #724 from observIQ/dashboard/argo-workflows-kpi-la…
johnbryan Mar 14, 2024
a0d1636
Merge pull request #745 from observIQ/dashboard/jetty-kpi-layout-update
johnbryan Mar 14, 2024
2d901a1
Merge pull request #721 from observIQ/dashboard/jenkins-kpi-layout-up…
johnbryan Mar 14, 2024
14 changes: 14 additions & 0 deletions alerts/google-gce/metadata.yaml
@@ -55,3 +55,17 @@ alert_policy_templates:
     related_integrations:
       - id: gce
         platform: GCP
+  -
+    id: reservation-utilization-too-high
+    description: "Monitors reservation utilization across all GCE Reservations in the current project and will notify you if the utilization rises above 90%. Reservation utilization is (in use count / reserved count)."
+    version: 1
+    related_integrations:
+      - id: gce
+        platform: GCP
+  -
+    id: reservation-utilization-too-low
+    description: "Monitors reservation utilization across all GCE Reservations in the current project and will notify you if the utilization falls below 10% for 20 of the past 23 hours. Reservation utilization is (in use count / reserved count)."
+    version: 1
+    related_integrations:
+      - id: gce
+        platform: GCP
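The two new entries line up with the two JSON files added in this PR: each `id` plus its `version` appears to give the template filename (`<id>.v<version>.json`). A minimal sketch of that mapping, assuming the convention holds; the helper name `template_filename` is hypothetical and not part of the repository:

```python
def template_filename(template_id: str, version: int) -> str:
    """Build '<id>.v<version>.json' from a metadata.yaml entry.

    Assumption: the naming convention is inferred from the files in this PR.
    """
    return f"{template_id}.v{version}.json"


# The two entries added to alerts/google-gce/metadata.yaml in this PR.
entries = [
    {"id": "reservation-utilization-too-high", "version": 1},
    {"id": "reservation-utilization-too-low", "version": 1},
]

filenames = [template_filename(e["id"], e["version"]) for e in entries]
```

A check like this could flag a metadata entry whose JSON file is missing or misnamed.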
21 changes: 21 additions & 0 deletions alerts/google-gce/reservation-utilization-too-high.v1.json
@@ -0,0 +1,21 @@
+{
+  "displayName": "Reservation - High Utilization",
+  "userLabels": {},
+  "conditions": [
+    {
+      "displayName": "High Reservation Utilization",
+      "conditionMonitoringQueryLanguage": {
+        "duration": "0s",
+        "query": "fetch compute.googleapis.com/Reservation\n|\n{ metric 'compute.googleapis.com/reservation/used'\n| align next_older(5m) | every 5m ;\nmetric 'compute.googleapis.com/reservation/reserved'\n| align next_older(5m) | every 5m\n}\n| ratio\n| condition val() >= 0.9",
+        "trigger": {
+          "count": 1
+        }
+      }
+    }
+  ],
+  "alertStrategy": {
+    "autoClose": "604800s"
+  },
+  "combiner": "OR",
+  "enabled": true
+}
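The query above takes the ratio of `reservation/used` to `reservation/reserved` at each 5-minute point and fires when it reaches 0.9. A rough Python sketch of that per-point check follows. It is not how Cloud Monitoring evaluates the query, and the function name and the handling of a zero reserved count are assumptions:

```python
def high_utilization(used: float, reserved: float, threshold: float = 0.9) -> bool:
    """Return True when used / reserved is at or above the threshold."""
    if reserved <= 0:
        # Assumption: with nothing reserved there is no ratio to alert on.
        return False
    return used / reserved >= threshold


# 9 of 10 reserved instances in use sits exactly on the 0.9 boundary.
at_boundary = high_utilization(9, 10)
below = high_utilization(8, 10)
```

The comparison is inclusive (`>=`), so utilization of exactly 90% already satisfies the condition.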
21 changes: 21 additions & 0 deletions alerts/google-gce/reservation-utilization-too-low.v1.json
@@ -0,0 +1,21 @@
+{
+  "displayName": "Reservation - Low Utilization",
+  "userLabels": {},
+  "conditions": [
+    {
+      "displayName": "Low Usage for 20 hours out of 23 hours",
+      "conditionMonitoringQueryLanguage": {
+        "duration": "0s",
+        "query": "fetch compute.googleapis.com/Reservation\n|\n{ metric 'compute.googleapis.com/reservation/used'\n| align next_older(5m) | every 5m ;\nmetric 'compute.googleapis.com/reservation/reserved'\n| align next_older(5m) | every 5m\n}\n| ratio\n| value val() <= 0.1\n| count_true_aligner(23h)\n| condition val() > 20 * 12 # 20 hours * (12 5 min intervals in hour)",
+        "trigger": {
+          "count": 1
+        }
+      }
+    }
+  ],
+  "alertStrategy": {
+    "autoClose": "604800s"
+  },
+  "combiner": "OR",
+  "enabled": true
+}
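The low-utilization condition counts how many 5-minute points in the past 23 hours have a ratio at or below 0.1, then compares that count with 20 * 12 = 240. The arithmetic can be sketched in Python as below. This only illustrates the counting logic, not how the aligner is implemented, and the function and constant names are hypothetical:

```python
INTERVAL_MINUTES = 5   # alignment period used in the query
WINDOW_HOURS = 23      # window passed to count_true_aligner
REQUIRED_HOURS = 20    # hours of low utilization named in the condition


def low_utilization_alert(ratios: list[float], threshold: float = 0.1) -> bool:
    """Return True when more than 20 hours' worth of samples are low.

    `ratios` holds used/reserved values, one per 5-minute interval, oldest first.
    """
    samples_per_hour = 60 // INTERVAL_MINUTES            # 12
    window = ratios[-WINDOW_HOURS * samples_per_hour:]   # last 276 samples
    low_count = sum(1 for r in window if r <= threshold)
    return low_count > REQUIRED_HOURS * samples_per_hour  # strictly more than 240


# 241 low samples out of 276 fire; exactly 240 do not.
fires = low_utilization_alert([0.05] * 241 + [0.5] * 35)
does_not_fire = low_utilization_alert([0.05] * 240 + [0.5] * 36)
```

Because the comparison is strict (`>`), exactly 20 hours of low samples does not trigger the alert; at least one more low 5-minute interval is needed.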
2 changes: 1 addition & 1 deletion dashboards/argo-workflows/README.md
@@ -7,4 +7,4 @@
 |Argo Workflows Prometheus|
 |:------------------|
 |Filename: [argo-workflows-prometheus.json](argo-workflows-prometheus.json)|
-|This dashboard has charts displaying: `Running Workflows`, `Pending Workflows`, `Skipped Workflows`, `Succeeded Workflows`, `Failed Workflows`, `Errors`, `Operation Duration (seconds)`, `Queue Adds`, `Queue Depth`, and `Queue Latency`|
+|This dashboard has charts displaying: `Running Workflows`, `Pending Workflows`, `Skipped Workflows`, `Succeeded Workflows`, `Workflows With Pods Not Running`, `Failed Workflows`, `Errors`, `Operation Duration Seconds`, `Kubernetes Request Rates`, `Queue Adds`, `Queue Depth`, and `Queue Latency`|