Commit fa38a90

bonclay7 and elamaran11 authored
Move all dashboards to GitOps (#175)
* Typo
* Remove Grafana provider
* Temp: move dashbaords to gitOps
* Move external labels to resource attributes
* Avoid DDoS with using 0.0.0.0
* Pre-commit
* Transition in two steps
  Will need to remove provider in a separate version to provide a transition path as removing this will break terraform and leave orphans in the state
* Move patterns' dashboards creation to gitOps
  Standardize config objects for patterns as well
* Pre-commit
* Create AMP dashboard from external source with Grafana provider
* Fix deprecated option
* Fix Flux requirements
* Run pre-commit
* Update example with operator
* Cleanup examples
* Update multicluster example
* Update multicluster example
* Drop dead variable
* Update docs
* Change GitOps branch name
* Update docs
* Replacing Secrets Manager to SSM to store Grafana API Key (#178)
* Fixing SSM
* Fixing SSM
* Replacing Secrets Manager with SSM
* Replacing Secrets Manager with SSM
* Update architecture diagram
* Update architecture diagram
* Update README.md
* Update index.md
* Fixing Grafana Operator Version
* Fix multicluster example
* Update docs

---------

Co-authored-by: Ela AWS <51791117+elamaran11@users.noreply.github.com>
Co-authored-by: Elamaran Shanmugam <elamaran.shan@gmail.com>
1 parent c5e4c0c commit fa38a90

46 files changed: +361 -4460 lines changed

README.md

Lines changed: 4 additions & 22 deletions
@@ -17,7 +17,8 @@ your custom applications.
1717
You also can monitor your Amazon Managed Service for Prometheus workspaces ingestion,
1818
costs, active series with [this module](./modules/managed-prometheus-monitoring).
1919

20-
<img width="1501" alt="image" src="docs/images/dark-o11y-accelerator-amp-xray.png">
20+
![image](https://github.com/aws-observability/terraform-aws-observability-accelerator/assets/10175027/e83f8709-f754-4192-90f2-e3de96d2e26c)
21+
2122

2223
## Documentation
2324

@@ -33,15 +34,6 @@ visit the [Amazon EKS cluster monitoring documentation](https://aws-observabilit
3334
The sections below demonstrate how you can leverage AWS Observability Accelerator
3435
to enable monitoring to an existing EKS cluster.
3536

36-
### v2.x changes
37-
38-
v2+ releases introduces couple of breaking changes compared to previous versions:
39-
40-
- `modules/workloads/infra` module moves to `modules/eks-monitoring`
41-
- All EKS configuration options moves from the base module to the `eks-monitoring` module
42-
- All EKS workload modules `modules/workloads/{java,nginx}` merge into `eks-monitoring` as configuration options (patterns), see [examples](./examples) to provide a more complete visibility
43-
- All examples have been updated to reflect these changes
44-
- Introducing GitOps for Grafana contents (Dashboards, Folders and Data sources) with [Grafana Operator](https://github.com/grafana-operator/grafana-operator) and [Flux CD](https://fluxcd.io/)
4537

4638
### Base Module
4739

@@ -161,14 +153,13 @@ If you are interested in contributing, see the
161153
| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.1.0 |
162154
| <a name="requirement_aws"></a> [aws](#requirement\_aws) | >= 4.0.0 |
163155
| <a name="requirement_awscc"></a> [awscc](#requirement\_awscc) | >= 0.24.0 |
164-
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | 1.25.0 |
156+
| <a name="requirement_grafana"></a> [grafana](#requirement\_grafana) | >= 1.25.0 |
165157

166158
## Providers
167159

168160
| Name | Version |
169161
|------|---------|
170162
| <a name="provider_aws"></a> [aws](#provider\_aws) | >= 4.0.0 |
171-
| <a name="provider_grafana"></a> [grafana](#provider\_grafana) | 1.25.0 |
172163

173164
## Modules
174165

@@ -180,8 +171,6 @@ No modules.
180171
|------|------|
181172
| [aws_prometheus_alert_manager_definition.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_alert_manager_definition) | resource |
182173
| [aws_prometheus_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/resources/prometheus_workspace) | resource |
183-
| [grafana_data_source.amp](https://registry.terraform.io/providers/grafana/grafana/1.25.0/docs/resources/data_source) | resource |
184-
| [grafana_folder.this](https://registry.terraform.io/providers/grafana/grafana/1.25.0/docs/resources/folder) | resource |
185174
| [aws_grafana_workspace.this](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/grafana_workspace) | data source |
186175
| [aws_region.current](https://registry.terraform.io/providers/hashicorp/aws/latest/docs/data-sources/region) | data source |
187176

@@ -190,12 +179,10 @@ No modules.
190179
| Name | Description | Type | Default | Required |
191180
|------|-------------|------|---------|:--------:|
192181
| <a name="input_aws_region"></a> [aws\_region](#input\_aws\_region) | AWS Region | `string` | n/a | yes |
193-
| <a name="input_create_dashboard_folder"></a> [create\_dashboard\_folder](#input\_create\_dashboard\_folder) | Boolean flag to enable Amazon Managed Grafana folder and dashboards | `bool` | `true` | no |
194-
| <a name="input_create_prometheus_data_source"></a> [create\_prometheus\_data\_source](#input\_create\_prometheus\_data\_source) | Boolean flag to enable Amazon Managed Grafana datasource | `bool` | `true` | no |
195182
| <a name="input_enable_alertmanager"></a> [enable\_alertmanager](#input\_enable\_alertmanager) | Creates Amazon Managed Service for Prometheus AlertManager for all workloads | `bool` | `false` | no |
196183
| <a name="input_enable_managed_prometheus"></a> [enable\_managed\_prometheus](#input\_enable\_managed\_prometheus) | Creates a new Amazon Managed Service for Prometheus Workspace | `bool` | `true` | no |
197184
| <a name="input_grafana_api_key"></a> [grafana\_api\_key](#input\_grafana\_api\_key) | Grafana API key for the Amazon Managed Grafana workspace | `string` | n/a | yes |
198-
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | `""` | no |
185+
| <a name="input_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#input\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana Workspace ID | `string` | n/a | yes |
199186
| <a name="input_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#input\_managed\_prometheus\_workspace\_id) | Amazon Managed Service for Prometheus Workspace ID | `string` | `""` | no |
200187
| <a name="input_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#input\_managed\_prometheus\_workspace\_region) | Region where Amazon Managed Service for Prometheus is deployed | `string` | `null` | no |
201188
| <a name="input_tags"></a> [tags](#input\_tags) | Additional tags (e.g. `map('BusinessUnit`,`XYZ`) | `map(string)` | `{}` | no |
@@ -205,15 +192,10 @@ No modules.
205192
| Name | Description |
206193
|------|-------------|
207194
| <a name="output_aws_region"></a> [aws\_region](#output\_aws\_region) | AWS Region |
208-
| <a name="output_grafana_dashboard_folder_created"></a> [grafana\_dashboard\_folder\_created](#output\_grafana\_dashboard\_folder\_created) | Boolean value indicating if the module created a dashboard folder in Amazon Managed Grafana |
209-
| <a name="output_grafana_dashboards_folder_id"></a> [grafana\_dashboards\_folder\_id](#output\_grafana\_dashboards\_folder\_id) | Grafana folder ID for automatic dashboards. Required by workload modules |
210-
| <a name="output_grafana_prometheus_datasource_test"></a> [grafana\_prometheus\_datasource\_test](#output\_grafana\_prometheus\_datasource\_test) | Grafana save & test URL for Amazon Managed Prometheus workspace |
211195
| <a name="output_managed_grafana_workspace_endpoint"></a> [managed\_grafana\_workspace\_endpoint](#output\_managed\_grafana\_workspace\_endpoint) | Amazon Managed Grafana workspace endpoint |
212-
| <a name="output_managed_grafana_workspace_id"></a> [managed\_grafana\_workspace\_id](#output\_managed\_grafana\_workspace\_id) | Amazon Managed Grafana workspace ID |
213196
| <a name="output_managed_prometheus_workspace_endpoint"></a> [managed\_prometheus\_workspace\_endpoint](#output\_managed\_prometheus\_workspace\_endpoint) | Amazon Managed Prometheus workspace endpoint |
214197
| <a name="output_managed_prometheus_workspace_id"></a> [managed\_prometheus\_workspace\_id](#output\_managed\_prometheus\_workspace\_id) | Amazon Managed Prometheus workspace ID |
215198
| <a name="output_managed_prometheus_workspace_region"></a> [managed\_prometheus\_workspace\_region](#output\_managed\_prometheus\_workspace\_region) | Amazon Managed Prometheus workspace region |
216-
| <a name="output_prometheus_data_source_created"></a> [prometheus\_data\_source\_created](#output\_prometheus\_data\_source\_created) | Boolean value indicating if the module created a prometheus data source in Amazon Managed Grafana |
217199
<!-- END OF PRE-COMMIT-TERRAFORM DOCS HOOK -->
218200

219201
## Contributing

docs/concepts.md

Lines changed: 13 additions & 11 deletions
@@ -39,22 +39,24 @@ The grafana-operator is a Kubernetes operator built to help you manage your Graf
3939

4040
GitOps is a way of managing application and infrastructure deployment so that the whole system is described declaratively in a Git repository. It is an operational model that offers you the ability to manage the state of multiple Kubernetes clusters leveraging the best practices of version control, immutable artifacts, and automation. Flux is a declarative, GitOps-based continuous delivery tool that can be integrated into any CI/CD pipeline. It gives users the flexibility of choosing their Git provider (GitHub, GitLab, BitBucket). Now, with grafana-operator supporting the management of external Grafana instances such as Amazon Managed Grafana, operations personas can use GitOps mechanisms using CNCF projects such as Flux to create and manage the lifecycle of resources in Amazon Managed Grafana.
4141

42-
We have setup a [GitRepository](https://fluxcd.io/flux/components/source/gitrepositories/) and [Kustomization](https://fluxcd.io/flux/components/kustomize/kustomization/) using flux to sync our GitHub Repository to add Grafana Datasources, folder and Dashboards to Amazon Managed Grafana using Grafana Operator. GitRepository defines a Source to produce an Artifact for a Git repository revision. Kustomization defines a pipeline for fetching, decrypting, building, validating and applying Kustomize overlays or plain Kubernetes manifests. we are also using [Flux Post build variable substitution](https://fluxcd.io/flux/components/kustomize/kustomization/#post-build-variable-substitution) to dynamically render variables such as AMG_AWS_REGION, AMP_ENDPOINT_URL, AMG_ENDPOINT_URL,GRAFANA_NODEEXP_DASH_URL on the YAML manifests during deployment time to avoid hardcoding on the YAML manifests stored in Git repo.
42+
We have set up a [GitRepository](https://fluxcd.io/flux/components/source/gitrepositories/) and [Kustomization](https://fluxcd.io/flux/components/kustomize/kustomization/) using Flux to sync our GitHub repository and add Grafana data sources, folders and dashboards to Amazon Managed Grafana using Grafana Operator. GitRepository defines a Source to produce an Artifact for a Git repository revision. Kustomization defines a pipeline for fetching, decrypting, building, validating and applying Kustomize overlays or plain Kubernetes manifests. We are also using [Flux post-build variable substitution](https://fluxcd.io/flux/components/kustomize/kustomization/#post-build-variable-substitution) to dynamically render variables such as AMG_AWS_REGION, AMP_ENDPOINT_URL, AMG_ENDPOINT_URL and GRAFANA_NODEEXP_DASH_URL in the YAML manifests at deployment time, which avoids hardcoding values in the YAML manifests stored in the Git repo.
4343

44-
We have placed our declarative code snippet to create an Amazon Managed Service For Promethes datasource and Grafana Dashboard in Amazon Managed Grafana in our [AWS Observabiity Accelerator GitHub Repository](https://github.com/aws-observability/aws-observability-accelerator/tree/main/artifacts/grafana-operator-manifests). We have setup a GitRepository to point to the AWS Observabiity Accelerator GitHub Repository and `Kustomization` for flux to sync Git Repository with artifacts in `./artifacts/grafana-operator-manifests` path in the AWS Observabiity Accelerator GitHub Repository. You can use this extension of our solution to point your own Kubernetes manifests to create Grafana Datasources and personified Grafana Dashboards of your choice using GitOps with Grafana Operator and Flux in Kubernetes native way with altering and redeploying this solution for changes to Grafana resources.
44+
We have placed our declarative code snippet to create an Amazon Managed Service for Prometheus data source and Grafana dashboard in Amazon Managed Grafana in our [AWS Observability Accelerator GitHub Repository](https://github.com/aws-observability/aws-observability-accelerator). We have set up a `GitRepository` to point to the AWS Observability Accelerator GitHub Repository and a `Kustomization` for Flux to sync the Git repository with artifacts in the `./artifacts/grafana-operator-manifests/*` path in the AWS Observability Accelerator GitHub Repository. You can use this extension of our solution to point to your own Kubernetes manifests to create Grafana data sources and personalized Grafana dashboards of your choice using GitOps with Grafana Operator and Flux in a Kubernetes-native way with altering and redeploying this solution for changes to Grafana resources.
4545

4646

4747

48-
## v2.x changes
48+
## Release notes
4949

50-
v2.x [releases](https://github.com/aws-observability/terraform-aws-observability-accelerator/releases) introduce
51-
couple of breaking changes compared to previous versions:
50+
We encourage you to use our [release versions](https://github.com/aws-observability/terraform-aws-observability-accelerator/releases)
51+
as much as possible to avoid breaking changes when deploying Terraform modules. You can
52+
also read our change log on the releases page. Here's an example of using a fixed version:
53+
54+
```hcl
55+
module "eks_monitoring" {
56+
source = "github.com/aws-observability/terraform-aws-observability-accelerator//modules/managed-prometheus-monitoring?ref=v2.5.0"
57+
}
58+
```
5259

53-
- `modules/workloads/infra` module moves to `modules/eks-monitoring`
54-
- EKS configuration options moves from the base module to the `eks-monitoring` module
55-
- EKS workload modules **java,nginx** merge into `eks-monitoring` as configuration options (patterns),
56-
see [examples](https://github.com/aws-observability/terraform-aws-observability-accelerator/tree/main/examples)
57-
- Examples have been updated to reflect these changes
5860

5961
## Base module
6062

@@ -138,4 +140,4 @@ classDiagram
138140

139141
If you are new to AWS Observability services, or want to dive deeper into them,
140142
check our [One Observability Workshop](https://catalog.workshops.aws/observability/)
141-
for a hands-on experience in a self-paced environement or at an AWS venue.
143+
for a hands-on experience in a self-paced environment or at an AWS venue.
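
The post-build variable substitution mentioned in this file replaces `${NAME}` placeholders in the manifests with values supplied at deployment time. As a rough illustration of the effect only, the sketch below renders a placeholder in plain Bash. The `render_vars` function and the sample manifest line are hypothetical; Flux performs the real substitution itself, and this is not how it is implemented.

```bash
#!/usr/bin/env bash
# Illustration only: mimics the *effect* of post-build substitution.
# render_vars is a hypothetical helper, not part of Flux or this repository.

# Usage: render_vars TEMPLATE NAME=VALUE [NAME=VALUE ...]
# Replaces every ${NAME} placeholder in TEMPLATE with VALUE.
render_vars() {
  local text=$1
  shift
  local pair name value placeholder
  for pair in "$@"; do
    name=${pair%%=*}              # text before the first '='
    value=${pair#*=}              # text after the first '='
    placeholder='${'"$name"'}'    # literal ${NAME}
    text=${text//"$placeholder"/"$value"}
  done
  printf '%s\n' "$text"
}
```

For example, `render_vars 'url: ${AMP_ENDPOINT_URL}' 'AMP_ENDPOINT_URL=https://example.invalid/ws'` prints the line with the placeholder filled in. The endpoint value is a made-up stand-in.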

docs/eks/index.md

Lines changed: 35 additions & 19 deletions
@@ -111,25 +111,40 @@ terraform apply
111111

112112
## Visualization
113113

114-
#### 1. Prometheus data source on Grafana
115114

116-
Make sure to open the link in the output. After a successful deployment, this will open
117-
the Prometheus data source configuration on Grafana.
118-
Click `Save & test` and you should see a notification confirming that the Amazon Managed Service for Prometheus workspace is ready to be used on Grafana.
115+
#### 1. Grafana dashboards
119116

120-
```bash
121-
terraform output grafana_prometheus_datasource_test
122-
```
117+
Log in to your Grafana workspace and navigate to the Dashboards panel. You should see a list of dashboards under the `Observability Accelerator Dashboards` folder.
118+
<img width="1540" alt="image" src="https://user-images.githubusercontent.com/10175027/190000716-29e16698-7c90-49d6-8c37-79ca1790e2cc.png">
123119

124-
#### 2. Grafana dashboards
120+
Open a specific dashboard and you should be able to view its visualization.
121+
<img width="2056" alt="cluster headlines" src="https://user-images.githubusercontent.com/10175027/199110753-9bc7a9b7-1b45-4598-89d3-32980154080e.png">
125122

126-
Go to the Dashboards panel of your Grafana workspace. You should see a list of dashboards under the `Observability Accelerator Dashboards`
123+
With v2.5 and above, the dashboards are managed with a Grafana Operator running in your cluster.
124+
To view all dashboards as Kubernetes objects from the cluster, run:
127125

128-
<img width="1540" alt="image" src="https://user-images.githubusercontent.com/10175027/190000716-29e16698-7c90-49d6-8c37-79ca1790e2cc.png">
126+
```console
127+
kubectl get grafanadashboards -A
128+
NAMESPACE NAME AGE
129+
grafana-operator cluster-grafanadashboard 138m
130+
grafana-operator java-grafanadashboard 143m
131+
grafana-operator kubelet-grafanadashboard 13h
132+
grafana-operator namespace-workloads-grafanadashboard 13h
133+
grafana-operator nginx-grafanadashboard 134m
134+
grafana-operator node-exporter-grafanadashboard 13h
135+
grafana-operator nodes-grafanadashboard 13h
136+
grafana-operator workloads-grafanadashboard 13h
137+
```
129138

130-
Open a specific dashboard and you should be able to view its visualization
139+
You can inspect more details per dashboard using this command:
140+
141+
```console
142+
kubectl describe grafanadashboards cluster-grafanadashboard -n grafana-operator
143+
```
144+
145+
Grafana Operator and Flux always work together to synchronize your dashboards with Git.
146+
If you delete your dashboards by accident, they will be re-provisioned automatically.
131147

132-
<img width="2056" alt="cluster headlines" src="https://user-images.githubusercontent.com/10175027/199110753-9bc7a9b7-1b45-4598-89d3-32980154080e.png">
133148

134149
#### 3. Amazon Managed Service for Prometheus rules and alerts
135150

@@ -216,19 +231,20 @@ export GO_AMG_API_KEY=$(aws grafana create-workspace-api-key \
216231
--output text)
217232
```
218233

219-
- Next, lets grab the Grafana API key secret name from AWS Secrets Manager. The keyname should start with `terraform-..`
234+
- Finally, update the Grafana API key parameter in AWS Systems Manager (SSM) Parameter Store using the new Grafana API key created above:
220235

221236
```bash
222-
aws secretsmanager list-secrets
237+
aws ssm put-parameter \
238+
--name "/terraform-accelerator/grafana-api-key" \
239+
--type "SecureString" \
240+
--value "{\"GF_SECURITY_ADMIN_APIKEY\": \"${GO_AMG_API_KEY}\"}" \
241+
--region <Your AWS Region>
223242
```
224243

225-
- Finally, update the Grafana API key secret in AWS Secrets Manager using the above new Grafana API key:
244+
- If the issue persists, you can force the synchronization by deleting the `externalsecret` Kubernetes object.
226245

227246
```bash
228-
aws secretsmanager update-secret \
229-
--secret-id <Your Secret Name> \
230-
--secret-string "{\"GF_SECURITY_ADMIN_APIKEY\": \"${GO_AMG_API_KEY}\"}" \
231-
--region <Your AWS Region>
247+
kubectl delete externalsecret/external-secrets-sm -n grafana-operator
232248
```
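
The key refresh steps above can be wrapped in two small functions. This is a minimal sketch, assuming the `/terraform-accelerator/grafana-api-key` parameter already exists (for example, created by the Terraform module), in which case `put-parameter` needs `--overwrite` to replace its value. The function names `grafana_key_payload` and `put_grafana_key` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: refresh the Grafana API key stored in SSM Parameter Store.
# Function names are illustrative, not part of the repository.

# Build the JSON document stored in the parameter.
grafana_key_payload() {
  printf '{"GF_SECURITY_ADMIN_APIKEY": "%s"}' "$1"
}

# Store the payload as a SecureString.
# Assumption: the parameter already exists, so --overwrite is required.
put_grafana_key() {
  local api_key=$1 region=$2
  aws ssm put-parameter \
    --name "/terraform-accelerator/grafana-api-key" \
    --type "SecureString" \
    --value "$(grafana_key_payload "$api_key")" \
    --overwrite \
    --region "$region"
}
```

With `GO_AMG_API_KEY` exported as shown earlier, a call would look like `put_grafana_key "$GO_AMG_API_KEY" "<Your AWS Region>"`.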
233249

234250
### 2. Upgrade from 2.1.0 or earlier

docs/eks/java.md

Lines changed: 1 addition & 1 deletion
@@ -32,7 +32,7 @@ Make sure to refresh your temporary Grafana API key
3232

3333
```bash
3434
export TF_VAR_managed_grafana_workspace_id=g-xxx
35-
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
35+
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 7200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
3636
```
3737

3838
## Deploy
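
This change raises the key lifetime from 1200 to 7200 seconds. If you prefer to keep the lifetime configurable, the one-liner can be wrapped as below. This is a sketch only; `new_grafana_api_key` is a hypothetical helper, and the default of 7200 simply mirrors the value used in this commit.

```bash
#!/usr/bin/env bash
# Sketch: create a temporary ADMIN API key with a configurable lifetime.
# new_grafana_api_key is an illustrative name, not part of the repository.

# Usage: new_grafana_api_key WORKSPACE_ID [TTL_SECONDS]
new_grafana_api_key() {
  local workspace_id=$1 ttl=${2:-7200}
  # Reject anything that is not a whole number before calling AWS.
  case $ttl in
    ''|*[!0-9]*)
      echo "ttl must be a whole number of seconds" >&2
      return 1
      ;;
  esac
  aws grafana create-workspace-api-key \
    --key-name "observability-accelerator-$(date +%s)" \
    --key-role ADMIN \
    --seconds-to-live "$ttl" \
    --workspace-id "$workspace_id" \
    --query key --output text
}
```

Usage would then be `export TF_VAR_grafana_api_key=$(new_grafana_api_key "$TF_VAR_managed_grafana_workspace_id")`.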

docs/eks/multicluster.md

Lines changed: 4 additions & 4 deletions
@@ -11,7 +11,7 @@ Using the example [eks-cluster-with-vpc](https://aws-observability.github.io/ter
1111
1. `eks-cluster-1`
1212
2. `eks-cluster-2`
1313

14-
#### 2. Amazon Managed Serivce for Prometheus (AMP) workspace
14+
#### 2. Amazon Managed Service for Prometheus (AMP) workspace
1515

1616
We recommend that you create a new AMP workspace. To do that you can run the following command.
1717

@@ -48,7 +48,7 @@ Ensure you have the following necessary IAM permissions
4848
* `grafana.DeleteWorkspaceApiKey`
4949

5050
```sh
51-
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 1200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
51+
export TF_VAR_grafana_api_key=`aws grafana create-workspace-api-key --key-name "observability-accelerator-$(date +%s)" --key-role ADMIN --seconds-to-live 7200 --workspace-id $TF_VAR_managed_grafana_workspace_id --query key --output text`
5252
```
5353

5454
## Setup
@@ -70,8 +70,8 @@ Verify by looking at the file `variables.tf` that there are two EKS clusters tar
7070

7171
The difference in deployment between these clusters is that Terraform, when setting up the EKS cluster behind variable `eks_cluster_1_id` for observability, also sets up:
7272

73-
* Dashboard folder and files in `AMG`
74-
* Prometheus and Java, alerting and recording rules in `AMP`
73+
* Dashboard folder and files in Amazon Managed Grafana
74+
* Prometheus and Java, alerting and recording rules in Amazon Managed Service for Prometheus
7575

7676
!!! warning
7777
To override the defaults, create a `terraform.tfvars` and change the default values of the variables.
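
One way to create the override file from the shell is sketched below. The variable `eks_cluster_1_id` appears in the text above; `eks_cluster_2_id` is an assumption based on the two-cluster setup, so check `variables.tf` for the actual names. `write_tfvars` is a hypothetical helper.

```bash
#!/usr/bin/env bash
# Sketch: generate a terraform.tfvars with the two cluster names.
# Assumption: the second variable is named eks_cluster_2_id; verify
# against variables.tf before using.

# Usage: write_tfvars OUTPUT_FILE CLUSTER_1 CLUSTER_2
write_tfvars() {
  local out=$1 cluster_1=$2 cluster_2=$3
  cat > "$out" <<EOF
eks_cluster_1_id = "${cluster_1}"
eks_cluster_2_id = "${cluster_2}"
EOF
}
```

For the clusters named in this guide, that would be `write_tfvars terraform.tfvars eks-cluster-1 eks-cluster-2`.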
