Commit
Signed-off-by: Kamesh Akella <kamesh.asp@gmail.com>
Showing 4 changed files with 40 additions and 2 deletions.
36 changes: 36 additions & 0 deletions
...netes/modules/ROOT/pages/running/metrics/keycloak_service_level_indicators.adoc
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,36 @@ | ||
= {project_name} Service Level Indicators | ||
:description: This document describes the service level indicators (SLIs) for monitoring the performance of your {project_name} deployment. | ||
|
||
To run {project_name} confidently in production, customers need an overview of key metrics from both {project_name} and {jdgserver_name}. These metrics let them assess the health and performance of their system, and they give anyone supporting the deployment a common basis for requesting and analyzing the necessary data. | ||
|
||
The SLOs and SLIs below are defined against the following user scenario. | ||
|
||
==== | ||
As a {project_name} user, I want to be able to: | ||
* log in, | ||
* refresh my token, | ||
* access and use the admin console, and | ||
* manage my profile through the account console, | ||
so that I can interact with the {project_name} system effectively and perform the necessary tasks without interruption. | ||
==== | ||
|
||
|
||
[cols="1,1,1,1,2,2,2", options="header"] | ||
|=== | ||
| SLO | SLO definition | Single Site SLO Target | Multi-Site SLO Target | SLI Metric | Metric Details | Dashboard | ||
|
||
| Availability | {project_name} should be available XX.XX% of the time. | 99.9% | 99.99% | Uptime percentage, measured as the ratio of successful authentication requests to total authentication requests. | https://github.com/keycloak/keycloak/blob/main/docs/guides/high-availability/health-checks-multi-site.adoc[Health checks] | ||
and the `up` metric, which indicates whether the Prometheus server can scrape metrics from the {project_name} instance. It has a value of 1 when the {project_name} service is available and responding to Prometheus scrape requests, and 0 when the service is down or unreachable. | ||
|
||
| NA | ||
|
||
| Authentication Latency | XX% of {project_name} authentication requests should have a latency below 200ms. | 99% | 95% | {project_name} server-side metrics that track latency for specific endpoints, along with the response time distribution. | `http_server_requests_seconds_count`, `http_server_requests_seconds_sum`. | ||
|
||
https://www.keycloak.org/keycloak-benchmark/kubernetes-guide/latest/running/metrics/keycloak_cluster#processing-time[More details about the metrics are captured here.] | https://github.com/keycloak/keycloak-benchmark/blob/main/provision/minikube/monitoring/dashboards/authentication-code.json[Example Grafana dashboard] | ||
|
||
| Error Rate "during login" | The error rate should be less than X.X%. | 0.1% | 0.05% | The ratio of failed authentication requests to total requests. | Failed requests can be identified by the `5xx` HTTP status codes returned by the {project_name} server, and can be broken down further per URL. | ||
|https://grafana.com/grafana/dashboards/10441-keycloak-metrics-dashboard/[Example Grafana dashboard] | ||
|=== | ||
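
The ratios in the table above can be illustrated with a small calculation. The following is only a sketch, not part of {project_name} or of this commit: it assumes the counter deltas for one evaluation window have already been collected (for example from Prometheus), and the names `SliSample`, `evaluate`, and all of their fields are hypothetical. The number of requests faster than 200ms is taken as a plain input, because the `_count` and `_sum` series alone yield only a mean latency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SliSample:
    """Counter deltas over one evaluation window (hypothetical input shape)."""
    total_requests: int    # e.g. increase of http_server_requests_seconds_count
    failed_requests: int   # requests answered with a 5xx status code
    fast_requests: int     # requests that completed in under 200 ms (assumed input)
    total_seconds: float   # e.g. increase of http_server_requests_seconds_sum


def evaluate(sample: SliSample,
             availability_target: float = 0.999,  # single-site target from the table
             latency_target: float = 0.99,        # share of requests below 200 ms
             error_target: float = 0.001) -> dict:
    """Compute the three SLIs and compare each one with its SLO target."""
    if sample.total_requests <= 0:
        raise ValueError("no requests in the evaluation window")

    total = sample.total_requests
    availability = (total - sample.failed_requests) / total
    error_rate = sample.failed_requests / total
    fast_share = sample.fast_requests / total
    mean_latency = sample.total_seconds / total  # seconds per request

    return {
        "availability": availability,
        "availability_ok": availability >= availability_target,
        "error_rate": error_rate,
        "error_rate_ok": error_rate < error_target,
        "fast_share": fast_share,
        "latency_ok": fast_share >= latency_target,
        "mean_latency_seconds": mean_latency,
    }


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    window = SliSample(total_requests=10_000, failed_requests=5,
                       fast_requests=9_950, total_seconds=400.0)
    for name, value in evaluate(window).items():
        print(f"{name}: {value}")
```

The default targets are the single-site values from the table; passing `availability_target=0.9999`, `latency_target=0.95`, and `error_target=0.0005` would apply the multi-site values instead.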
|
File renamed without changes.