Design/Thought exercise - Metrics & Query dashboard #16

Open
ciaransweet opened this issue Apr 15, 2021 · 4 comments
Comments

@ciaransweet (Contributor)

In conversation with @freitagb it was noted it'd be nice to have a similar dashboard to the HLS one: https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#dashboards:name=hls-production

This would collate not only metrics from CloudWatch, but ideally also the results of some pre-determined SQL queries run against the RDS instance, to get an idea of stats such as downloads per day.

If we can't work out how to get the results of a query onto a dashboard, my other suggestion is that we set up a Lambda to periodically run the SQL queries and serialise the results to a very simple HTML file hosted in a public bucket; that way it's just a static webpage we can navigate to.
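
A minimal sketch of that fallback, assuming a hypothetical run_download_stats_query() helper and a placeholder bucket name, might look like:

import datetime

import boto3


def run_download_stats_query():
    # Placeholder: in reality this would query the RDS instance (e.g. via
    # psycopg2) and return rows of (date, download_count).
    return [(datetime.date.today().isoformat(), 1234)]


def handler(event, context):
    rows = run_download_stats_query()
    table_rows = "".join(
        f"<tr><td>{date}</td><td>{count}</td></tr>" for date, count in rows
    )
    html = (
        "<html><body><h1>Downloader stats</h1>"
        "<table><tr><th>Date</th><th>Downloads</th></tr>"
        f"{table_rows}</table></body></html>"
    )

    # Overwrite the static page in a public bucket (bucket name is a placeholder)
    boto3.client("s3").put_object(
        Bucket="downloader-stats-public",
        Key="index.html",
        Body=html.encode("utf-8"),
        ContentType="text/html",
    )

The Lambda could then be scheduled with an EventBridge rule, much like the metric-upload function in the example further down.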

@ciaransweet (Contributor, Author)

Update

An approach that @sharkinsspatial and I have conjured up, which might be nice, is to have any stack we deploy expose its metrics as a CloudWatch Dashboard Widget, for an overarching dashboard to consume and visualise.

It might look something like:

[Diagram: cloud_watch_dashboard]

What does this diagram mean?

Simply, each stack is opinionated as to how its metrics should be visualised.

Stacks can have any number of metrics; each stack will create a CloudWatch Dashboard Widget JSON object that specifies how to visualise the metrics it deems useful. This JSON object will then be set as a CloudFormation Output of the stack under a deterministic name (let's say dashboardwidget).

The 'stats' dashboard stack will then, on deploy, scrape the known list of HLS stacks, gather their dashboardwidget outputs and build a JSON body of a CloudWatch Dashboard, which will result in a dashboard of metric visualisations from several projects.

Got an example?

I'll try to provide some example code below to explain what I mean.

The first example is how a Stack might create a custom metric (i.e. like how we want to run SQL queries for the downloader) and then expose it as a widget:

import json

from aws_cdk import (
    aws_cloudwatch,
    aws_events,
    aws_events_targets,
    aws_iam,
    aws_lambda,
    core,
)


class CdkStack(core.Stack):
    def __init__(
        self, scope: core.Construct, construct_id: str, identifier: str, **kwargs
    ) -> None:
        super().__init__(scope, construct_id, **kwargs)

        metric_1 = aws_cloudwatch.Metric(
            metric_name="ciarans-metric-1", namespace="ciarans-metrics"
        )

        metric_function = aws_lambda.Function(
            self,
            id=f"metric-function-{identifier}",
            code=aws_lambda.Code.from_inline(
                f"""
import boto3

def handler(event, context):
    # Placeholder: run a pre-determined SQL query against the RDS instance,
    # e.g. the number of downloads for yesterday.
    metric_data_1 = get_result_of_sql_query()
    metric_1 = {{
        "MetricName": "{metric_1.metric_name}",
        "Dimensions": [],
        "Unit": "None",
        "Value": metric_data_1
    }}
    print(f"Logging: {{metric_1}}")
    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace="{metric_1.namespace}",
        MetricData=[metric_1]
    )
"""
            ),
            handler="index.handler",
            runtime=aws_lambda.Runtime.PYTHON_3_8,
            timeout=core.Duration.seconds(10),
            retry_attempts=0,
        )

        metric_function.role.add_to_policy(
            aws_iam.PolicyStatement(
                effect=aws_iam.Effect.ALLOW,
                resources=["*"],
                actions=["cloudwatch:PutMetricData"],
            )
        )

        aws_events.Rule(
            self,
            id=f"metric-upload-rule-{identifier}",
            schedule=aws_events.Schedule.expression("cron(* * * * ? *)"),
        ).add_target(aws_events_targets.LambdaFunction(metric_function))

        widget = aws_cloudwatch.GraphWidget(
            left=[metric_1],
            period=core.Duration.minutes(1),
            title="Ciarans Test Widget",
        )

        # to_json() gives the widget's dashboard JSON (a one-element list);
        # resolve() materialises any CDK tokens it contains before serialising.
        core.CfnOutput(
            self,
            "dashboardwidget",
            value=json.dumps(self.resolve(widget.to_json())),
        )

The above stack creates a metric, a function that runs every minute to query a data source and put the value into the metric, and a widget specifying that metric_1 should be displayed on a GraphWidget with a specific title and period.

This is then set as a CfnOutput with the id dashboardwidget.
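
For illustration, that output value ends up being one or more widget definitions in CloudWatch Dashboard Body syntax, something along these lines (the title, size, and metric names are just placeholders):

[
    {
        "type": "metric",
        "width": 12,
        "height": 6,
        "properties": {
            "view": "timeSeries",
            "region": "us-west-2",
            "title": "Downloader - downloads per day",
            "period": 60,
            "metrics": [
                ["ciarans-metrics", "ciarans-metric-1"]
            ]
        }
    }
]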

Below is how we might then access this widget and construct a dashboard with it (Full disclosure, this is likely not a truly working example, but you get the idea):

import json

import boto3
from aws_cdk import (
    aws_cloudwatch,
    core,
)


STACKS_TO_DISPLAY = [
    "stack-name-1",
    "stack-name-2"
]


class CdkStack(core.Stack):
    def __init__(
        self, scope: core.Construct, construct_id: str, identifier: str, **kwargs
    ) -> None:
        super().__init__(scope, construct_id, **kwargs)

        widgets = []

        cloudformation = boto3.resource("cloudformation")
        for stack_name in STACKS_TO_DISPLAY:
            stack = cloudformation.Stack(stack_name)
            for output in stack.outputs:
                if output["OutputKey"] == "dashboardwidget":
                    # The output value is a JSON-encoded list of widget objects,
                    # so parse it and add its widgets to the dashboard body.
                    widgets.extend(json.loads(output["OutputValue"]))
                    break

        aws_cloudwatch.CfnDashboard(
            self,
            id=f"dashboard-{identifier}",
            dashboard_body=json.dumps(
                {
                    "start": "-PT6H",
                    "periodOverride": "inherit",
                    "widgets": widgets
                }
            )
        )

The above stack takes a list of stack names, iterates over them to extract their dashboardwidget outputs, and then creates a CfnDashboard whose dashboard body is built from the list of widgets.

This then will be a dashboard containing all of the opinionated widgets of metrics that each stack exposes.

@ciaransweet (Contributor, Author)

Alternative

@alukach has been playing around with https://aws.amazon.com/quicksight/ which looks pretty swish. This might remove the need to create specific dashboards, allowing users to spin up BI dashboards when and if they need them.

@sharkinsspatial (Collaborator)

@ciaranevans This looks great. I would like to investigate @alukach's QuickSight experiments, but I'm guessing that we'll still need to build out some custom CloudWatch metrics to support this.

One more vexing question I have is how to automatically generate widgets at stack creation time which reference service-level metrics that have not yet been created. A prime example of this is the S3 NumberOfObjects metric, which is only populated every 24 hours (and thus doesn't exist at stack creation time). There are a host of other metrics as well whose ARNs are not available at the time the widget would be created. Do we have some potential ideas on how we might tackle this? Also curious if @alukach is encountering this with CSDAP or @abarciauskas-bgse with MAAP?

@ciaransweet (Contributor, Author)

@sharkinsspatial I think for the case of metrics not yet existing, it doesn't matter. You don't get an ARN out of a metric, so it's not really a problem if it doesn't exist yet. The worst-case scenario is that you've generated a dashboard referring to metric names and namespaces that don't yet exist, and you'll just have blank graphs until they come into play.
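
For example, a widget can be declared at synth time against a metric that has no data yet; a sketch of the NumberOfObjects case (bucket name is a placeholder) might be:

from aws_cdk import aws_cloudwatch, core

# Dashboards reference metrics purely by namespace/name/dimensions, so this
# widget can be created before S3 has ever published NumberOfObjects for the
# bucket; the graph just stays empty until the daily datapoints appear.
objects_metric = aws_cloudwatch.Metric(
    namespace="AWS/S3",
    metric_name="NumberOfObjects",
    dimensions={
        "BucketName": "my-downloader-bucket",  # placeholder bucket name
        "StorageType": "AllStorageTypes",
    },
    period=core.Duration.days(1),
)

objects_widget = aws_cloudwatch.GraphWidget(
    left=[objects_metric],
    title="Bucket object count",
)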

Looking at what @alukach demoed, we might not actually need to generate custom metrics; QuickSight provides a means to run queries on your datasets, which eliminates the need to do that as a custom metric.
