Conversation

humanzz
Contributor

@humanzz humanzz commented Sep 29, 2025

Summary

Changes

  • introduce Metrics.flushMetrics as a more powerful version of flushSingleMetrics to allow
    • using defaults by inheriting state e.g. namespace, dimensions and properties
    • emitting multiple metrics in one metrics context
  • update EmfMetricsLogger.addMetadata to use emf's putProperty
  • refactor flushSingleMetrics and captureColdStartMetric to use flushMetrics
  • consolidate request id and trace id setting into a newly introduced internal MetricsUtils
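
A minimal sketch of the inheritance idea described in the list above. `MetricsSketch`, its methods, and the consumer-based `flushMetrics` signature are hypothetical stand-ins for illustration and are not taken from the PR diff: a child context copies the parent's namespace, dimensions and metadata, collects several metrics, and is emitted as one record.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-in for a metrics logger; this is not the Powertools API.
class MetricsSketch {
    private final String namespace;
    private final Map<String, String> dimensions = new LinkedHashMap<>();
    private final Map<String, Object> metadata = new LinkedHashMap<>();
    private final Map<String, Double> metrics = new LinkedHashMap<>();
    private final List<String> sink; // collects emitted records instead of writing to stdout

    MetricsSketch(String namespace, List<String> sink) {
        this.namespace = namespace;
        this.sink = sink;
    }

    void addDimension(String key, String value) { dimensions.put(key, value); }
    void addMetadata(String key, Object value) { metadata.put(key, value); }
    void addMetric(String name, double value) { metrics.put(name, value); }

    // Runs the block against a fresh context that inherits namespace, dimensions
    // and metadata from this instance, then emits that context as a single record.
    // Metrics already buffered on the parent are not copied.
    void flushMetrics(Consumer<MetricsSketch> block) {
        MetricsSketch child = new MetricsSketch(namespace, sink);
        child.dimensions.putAll(dimensions);
        child.metadata.putAll(metadata);
        block.accept(child);
        child.flush();
    }

    // Emits the current state as one record and clears the buffered metrics.
    void flush() {
        sink.add(namespace + " " + dimensions + " " + metadata + " " + metrics);
        metrics.clear();
    }
}
```

In this sketch, a call such as `parent.flushMetrics(m -> { m.addMetric("A", 1); m.addMetric("B", 2); })` would emit one record carrying both metrics plus the parent's defaults, while leaving the parent's own buffered metrics untouched.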

Issue number: #2153


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Disclaimer: We value your time and bandwidth. As such, any pull requests created on non-triaged issues might not be successful.

- introduce `Metrics.flushMetrics` as a more powerful version of `flushSingleMetrics` to allow
  - using defaults by inheriting state e.g. namespace, dimensions and metadata
  - emitting multiple metrics in one metrics context
- refactor `flushSingleMetrics` to use `flushMetrics`
- move namespace/service setting from `MetricsFactory` to `EmfMetricsLogger`
Contributor

@phipag phipag left a comment

Thanks @humanzz for sending this PR. Overall, it looks good and I left some comments that we need to address before moving forward.

Primarily, can you go back to the issue #2153 and let us know your use-case? We still need to decide whether we want to move forward with inheriting the default configuration of the metrics logger for flushMetrics and flushSingleMetric.

Comment on lines 49 to 60

// Apply default configuration from environment variables
String envNamespace = System.getenv("POWERTOOLS_METRICS_NAMESPACE");
if (envNamespace != null) {
    metrics.setNamespace(envNamespace);
}

// Only set Service dimension if it's not the default undefined value
String serviceName = LambdaHandlerProcessor.serviceName();
if (!LambdaConstants.SERVICE_UNDEFINED.equals(serviceName)) {
    metrics.setDefaultDimensions(DimensionSet.of("Service", serviceName));
}
Contributor

I see in your PR description you moved this logic to EmfMetricsLogger to reduce the number of places that can set the namespace. I don't see how this reduces the number of places. In fact, it will increase the number of places if we add another metrics backend in addition to the EMF metrics logger.

Let's keep this in MetricsFactory. The idea of this architecture is that EmfMetricsLogger is a pluggable metrics backend that is only to be instantiated with an initial configuration by MetricsFactory or MetricsBuilder. This is also why it is in the internal package.
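
To illustrate the argument for keeping defaults in the factory, here is a rough sketch. `BackendSketch` and `FactorySketch` are hypothetical names, the `POWERTOOLS_SERVICE_NAME` lookup and the `service_undefined` sentinel are assumptions standing in for `LambdaHandlerProcessor.serviceName()` and `LambdaConstants.SERVICE_UNDEFINED`, and the environment is passed in as a function so the logic can be exercised without real environment variables. The point is that the defaults are applied in one place regardless of which backend is plugged in.

```java
import java.util.function.Function;

// Hypothetical backend contract; the real provider and Metrics interfaces may differ.
interface BackendSketch {
    void setNamespace(String namespace);
    void setDefaultDimension(String key, String value);
}

final class FactorySketch {
    // Assumed sentinel for "no service name configured".
    static final String SERVICE_UNDEFINED = "service_undefined";

    private FactorySketch() { }

    // Applies environment-derived defaults to whichever backend is supplied,
    // so no backend has to duplicate this logic.
    static <T extends BackendSketch> T configure(T backend, Function<String, String> env) {
        String namespace = env.apply("POWERTOOLS_METRICS_NAMESPACE");
        if (namespace != null) {
            backend.setNamespace(namespace);
        }
        // Assumed variable name; only set the Service dimension for a real value.
        String service = env.apply("POWERTOOLS_SERVICE_NAME");
        if (service != null && !SERVICE_UNDEFINED.equals(service)) {
            backend.setDefaultDimension("Service", service);
        }
        return backend;
    }
}
```

In production-style code the second argument could be `System::getenv`; a second backend would only need to implement `BackendSketch` to receive the same defaults.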

Contributor Author

I did get the idea behind MetricsFactory and pluggable backends, and my initial impression was that it might benefit from a bit more tightening. Now that I've had another look at the factory and the builder, I think you're right, but let me share a couple of thoughts:

  • Initially, I thought that EmfMetricsLogger should use either the MetricsProvider or MetricsFactory to initialize a new instance of Metrics, so that it picks up whichever default configuration applies (in this case the environment variable namespace logic and the service name). That contributed to moving more of the logic into EmfMetricsLogger.
  • The other motivation is that I see properties getting set on Metrics spread across LambdaMetricsAspect, EmfMetricsLogger and MetricsFactory.
  • MetricsFactory is a bit odd in that it allows setting/overriding the MetricsProvider at any point in the lifetime of the class. This can lead to issues if any piece of code decides to override the provider while other pieces rely on getMetricsInstance, assuming it returns the global metrics instance that has been configured.
  • One now creates a Metrics instance in the handler (https://docs.powertools.aws.dev/lambda/java/latest/core/metrics/#creating-metrics) via MetricsFactory or MetricsBuilder, which still sets up/uses MetricsFactory under the hood. Other non-handler code that wants to use that instance relies on MetricsFactory.getMetricsInstance() and hopes/assumes it's the same one defined in the Lambda handler. With setMetricsProvider leaving the door open to overriding the provider, this is a bit problematic.

I know some of the points above are not necessarily directly related to this PR but the thoughts came as I was trying to work on this.

I will return this piece to the MetricsFactory, and maybe for later, we can think about the points raised above

Contributor

@phipag phipag Sep 30, 2025

Thanks for this feedback about the implementation design. Let me explain some thoughts that went into it:

  • The idea for MetricsFactory is to return a well-configured (opinionated) Metrics backend singleton. This should work independently of LambdaMetricsAspect, i.e. without forcing the user to use AspectJ
  • For users not using AspectJ, the MetricsBuilder can be used as an easy way to set the configuration on a default initialized Metrics backend returned by MetricsFactory
  • For users using AspectJ with the @FlushMetrics annotation, the LambdaMetricsAspect only applies the configuration that is specific to the Aspect.

All of this is designed in a way that is agnostic to the metrics backend and to achieve a consistent configuration precedence (https://docs.powertools.aws.dev/lambda/java/latest/core/metrics/#order-of-precedence-of-metrics-configuration). So, it is actually intentional that these different components all set configuration on the Metrics backend because the user is given different ways of configuring the metrics backend (Env vars, builder pattern, aspectj).

The idea of having the setMetricsProvider method was to allow users to:

  1. Set another metrics backend programmatically
  2. Bring their own metrics provider implementation

Both of these scenarios are not documented yet.

I agree with your point that exposing this as a public method is not 100% ideal. Do you have a suggestion for a better way of achieving this?

Contributor Author

A couple of potential ideas

  1. It's kept public, but it can only be set once, and trying to override it throws an exception
  2. Using an approach like SPI to select the provider and completely dropping the ability to configure the provider via code - either for MetricsFactory or MetricsBuilder
  3. Only allow MetricsBuilder to configure the provider, and it sets it on MetricsFactory using setMetricsProvider but make it package-private... but this still suffers from the issue that if someone initializes another Metrics using MetricsBuilder they might override the provider on the factory again - but we can prevent the overrides in a way similar to 1
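
Idea 1 could look roughly like the sketch below. `ProviderSketch` and `SetOnceProviderHolder` are hypothetical names, and the exception types are an assumption rather than a decision made in this thread. An atomic compare-and-set means that only the first caller can install a provider, even under concurrent initialization.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical marker type standing in for the real provider interface.
interface ProviderSketch { }

final class SetOnceProviderHolder {
    private final AtomicReference<ProviderSketch> provider = new AtomicReference<>();

    // The first call wins; any later call fails loudly instead of silently
    // swapping the backend underneath code that already obtained an instance.
    void setProvider(ProviderSketch candidate) {
        if (candidate == null) {
            throw new IllegalArgumentException("provider must not be null");
        }
        if (!provider.compareAndSet(null, candidate)) {
            throw new IllegalStateException("metrics provider has already been set");
        }
    }

    // Returns the installed provider, or null if none has been set yet.
    ProviderSketch getProvider() {
        return provider.get();
    }
}
```

Unit tests that need to swap providers would require some reset hook in a design like this, which is a trade-off the sketch does not address.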

Contributor

I like these ideas. We use SPIs for e.g. idempotency or logging. In the current monolithic module setup for metrics I am not a fan of SPI though. Usually, we provide modules that auto-register. For example: powertools-logging will be used with powertools-logging-log4j2 or powertools-logging-logback.

Introducing SPIs would be a major version update, and I will consider this when adding an additional, larger metrics backend. Great idea!

I really like option 1. It is simple enough and makes a lot of sense for such singleton classes – we don't want people to accidentally propagate a different metrics provider throughout the code base outside of initial configuration.

Would you be up for adding this in this PR including a small unit test?

Comment on lines 250 to 254
if (namespace != null) {
    metrics.setNamespace(this.namespace);
}
defaultDimensions.forEach(metrics::addDimension);
metadata.forEach(metrics::addMetadata);
Contributor

We haven't decided yet in the issue if we want to auto-inherit the default metrics logger configuration. I see in v1 this is the behavior we have.

@humanzz
Contributor Author

humanzz commented Sep 30, 2025

I've addressed the metricsContext comments in 4730be4, and will leave details about the use case on the feature request.

Contributor

@phipag phipag left a comment

Thanks for the quick turnaround @humanzz. Just one more thing that I discovered.

@phipag
Contributor

phipag commented Sep 30, 2025

@phipag another observation/inconsistency is the usage of putProperty vs. putMetadata. See

* https://github.com/aws-powertools/powertools-lambda-java/blob/main/powertools-metrics/src/main/java/software/amazon/lambda/powertools/metrics/internal/EmfMetricsLogger.java#L212

* https://github.com/aws-powertools/powertools-lambda-java/blob/main/powertools-metrics/src/main/java/software/amazon/lambda/powertools/metrics/internal/LambdaMetricsAspect.java#L95

Good catch! It seems that a "property" is simply a key added to the log line while "metadata" becomes part of the _aws object. These are logs from the official example:

Cold start metric (putProperty)

{
    "_aws": {
        "Timestamp": 1751977621635,
        "CloudWatchMetrics": [
            {
                "Namespace": "ServerlessAirline",
                "Metrics": [
                    {
                        "Name": "ColdStart",
                        "Unit": "Count"
                    }
                ],
                "Dimensions": [
                    [
                        "Service",
                        "FunctionName"
                    ]
                ]
            }
        ]
    },
    "function_request_id": "b8628984-b745-44bb-8025-247c2da76da1",
    "xray_trace_id": "1-686d0e92-396456d519ba8c28174120a4",
    "ColdStart": 1,
    "Service": "payment"
}

Regular metric (putMetadata)

{
    "_aws": {
        "Timestamp": 1751977946689,
        "CloudWatchMetrics": [
            {
                "Namespace": "ServerlessAirline",
                "Metrics": [
                    {
                        "Name": "CustomMetric1",
                        "Unit": "Count"
                    },
                    {
                        "Name": "CustomMetric3",
                        "Unit": "Count",
                        "StorageResolution": 1
                    }
                ],
                "Dimensions": [
                    [
                        "Service"
                    ]
                ]
            }
        ],
        "function_request_id": "feccb848-47a9-4b32-a16b-73d45d7ad308",
        "xray_trace_id": "1-686d0fdb-651f4fc05b57221e726890c0"
    },
    "CustomMetric1": 1,
    "Service": "payment",
    "CustomMetric3": 1
}

I need to find out the semantic difference – those nuances don't exist in the other runtimes. None of it is searchable in CloudWatch Metrics. Besides the location in the JSON output, I don't see any difference.


Update:

It appears that putMetadata adds key-value pairs to the _aws (metadata) object. Adding keys there has no effect: Timestamp and CloudWatchMetrics are the only officially documented metadata keys (https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/CloudWatch_Embedded_Metric_Format_Specification.html#CloudWatch_Embedded_Metric_Format_Specification_structure_metadata). We shouldn't use putMetadata since it is not documented anywhere; we should favor putProperty instead: https://github.com/awslabs/aws-embedded-metrics-java

We should update the addMetadata method to call putProperty on the EMF logger since this is the officially documented method (putMetadata is not documented). This is consistent with e.g. Python (see https://docs.powertools.aws.dev/lambda/python/latest/core/metrics/#add_metadata_outputjson)
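
The placement difference seen in the two log samples can be modelled with a toy record, sketched below. `EmfRecordSketch` is a hypothetical class that only mirrors where keys land in the JSON; it is not the EMF library's implementation, and its `addMetadata` method simply illustrates the proposed delegation to `putProperty`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model of an EMF record; it mirrors the key placement seen in the logs only.
final class EmfRecordSketch {
    private final Map<String, Object> root = new LinkedHashMap<>();
    private final Map<String, Object> aws = new LinkedHashMap<>();

    EmfRecordSketch() {
        root.put("_aws", aws);
    }

    // Property: a top-level key on the log line.
    void putProperty(String key, Object value) {
        root.put(key, value);
    }

    // Metadata: a key nested inside the "_aws" object.
    void putMetadata(String key, Object value) {
        aws.put(key, value);
    }

    // Proposed direction: the facade's addMetadata delegates to putProperty.
    void addMetadata(String key, Object value) {
        putProperty(key, value);
    }

    Map<String, Object> asMap() {
        return root;
    }
}
```

With this model, a key added through `addMetadata` ends up next to the metric values at the top level, matching the cold start sample, rather than inside `_aws`.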


@Override
public void addMetadata(String key, Object value) {
    emfLogger.putMetadata(key, value);
Contributor

Following this comment we should not call putMetadata but putProperty. It is the documented way and consistent with the other language runtimes. This will make the usage consistent again.

Thanks for catching this detail!

Contributor Author

Yes. Was about to write this but you beat me to it :)

Contributor Author

One thought though: Metrics.addMetadata and the new Metrics.addProperty, which I'll introduce, are likely EMF-only features, so there's some bleeding of the backend metrics engine into the abstraction.

Contributor

@phipag phipag Sep 30, 2025

I think we misunderstood each other. My suggestion was not to add addProperty but instead to replace the code within addMetadata with a call to putProperty. In the other Powertools runtimes, we treat "metadata" as what the Java EMF library exposes via putProperty.

Users should not call putMetadata on the EMF library at all.

I also agree with your comment about bleeding. It is no longer possible to work around this without introducing a breaking change – in other backends this method will likely not have any effect.

Contributor Author

I missed the part about using putProperty in addMetadata

change coming!


2 participants