
What are the Best Practices for Providing Instrumentation for Spring AI? #12878

Open
@Cirilla-zmh

Description


Background

Hi, we are currently working on providing automatic instrumentation capabilities for applications built using the Spring AI framework. Our goal is to enable users to obtain various observability data (mainly traces) without needing to modify their code after installing opentelemetry-java-instrumentation in their Spring AI application.

This sounds like a requirement for plugin support. However, we have found that Spring AI already supports a rich set of observability features, and its trace attributes adhere as closely as possible to the OTel semconv. Therefore, we believe that simply exporting this observability data through opentelemetry-java-instrumentation is the more elegant solution. We made some modifications to the demo application and successfully exported this data to Jaeger. Here is the result:
(Screenshot: Spring AI traces displayed in Jaeger, 2024-12-11)

Making some necessary adaptations in the demo application can indeed achieve this effect, but we think there might be better ways to do so. We have also come across a memory-leak issue, and below are the points we are particularly concerned about.

List of Issues

Initialization of OpenTelemetry Sdk

Like other Spring applications, Spring AI's underlying tracing capability is based on the Micrometer framework, which requires adding these dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-core</artifactId>
</dependency>
<dependency>
    <groupId>io.micrometer</groupId>
    <artifactId>micrometer-tracing-bridge-otel</artifactId>
</dependency>

However, the implementation of spring-boot-actuator-autoconfigure does not detect the presence of opentelemetry-java-instrumentation; instead, it checks whether a bean of type OpenTelemetry exists in the application context and creates a new one if not. As a result, the OpenTelemetrySdk provided by the Java agent is not used by Micrometer:

@Bean
@ConditionalOnMissingBean(OpenTelemetry.class)
OpenTelemetrySdk openTelemetry(ObjectProvider<SdkTracerProvider> tracerProvider,
        ObjectProvider<ContextPropagators> propagators,
        ObjectProvider<SdkLoggerProvider> loggerProvider,
        ObjectProvider<SdkMeterProvider> meterProvider) {
    OpenTelemetrySdkBuilder builder = OpenTelemetrySdk.builder();
    tracerProvider.ifAvailable(builder::setTracerProvider);
    propagators.ifAvailable(builder::setPropagators);
    loggerProvider.ifAvailable(builder::setLoggerProvider);
    meterProvider.ifAvailable(builder::setMeterProvider);
    return builder.build();
}

One way to solve this issue is to explicitly declare a bean of type OpenTelemetry in the application's configuration class:

@Bean
public OpenTelemetry getOpenTelemetry() {
    return GlobalOpenTelemetry.get();
}

Of course, we could add an auto-configuration strategy in Spring AI (or one of its distros, such as Spring AI Alibaba) to handle this logic for the user. However, we believe it would be better if this behavior were managed by opentelemetry-java-instrumentation: the framework should implement its observability logic entirely against the OpenTelemetry API, and should not need to be aware of the presence of a Java agent.
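To illustrate what such an auto-configuration could look like, here is a minimal sketch. The class name is hypothetical and this is not an existing Spring AI or Spring Boot API; it simply packages the workaround bean above so that Spring Boot's own @ConditionalOnMissingBean(OpenTelemetry.class) check finds the agent-provided instance instead of building a second SDK:

```java
import io.opentelemetry.api.GlobalOpenTelemetry;
import io.opentelemetry.api.OpenTelemetry;
import org.springframework.boot.autoconfigure.AutoConfiguration;
import org.springframework.boot.autoconfigure.condition.ConditionalOnClass;
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.context.annotation.Bean;

// Hypothetical auto-configuration (name is an assumption): expose the
// agent-installed SDK as a bean so spring-boot-actuator-autoconfigure
// does not create a second, unused OpenTelemetrySdk.
@AutoConfiguration
@ConditionalOnClass(GlobalOpenTelemetry.class)
public class AgentOpenTelemetryAutoConfiguration {

    @Bean
    @ConditionalOnMissingBean(OpenTelemetry.class)
    public OpenTelemetry openTelemetry() {
        // Returns the instance registered by opentelemetry-java-instrumentation,
        // or a no-op implementation when no agent is attached.
        return GlobalOpenTelemetry.get();
    }
}
```

Note that such a class would also have to be ordered before Spring Boot's own OpenTelemetry auto-configuration for the conditional check to take effect, which is one more reason we think this is better handled inside opentelemetry-java-instrumentation itself.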

Potential Memory Leak

For most applications that depend on spring-actuator, Micrometer generates many metrics by default, and these often face high-cardinality issues (e.g., the uri of a RestTemplate call is recorded as a tag by default). opentelemetry-java-instrumentation ships an auto-instrumentation for spring-actuator that adds a registry to Micrometer, and this registry seems to bypass Micrometer's high-cardinality controls, leading to dimension explosion and memory leaks.

This might not seem like a bug, because the risk of high-cardinality tags should ultimately be borne by the user. However, the current behavior is that without opentelemetry-java-instrumentation, Micrometer's memory consumption is normal (bounded by the default configuration maximumAllowableTags=100), while with opentelemetry-java-instrumentation attached, memory leaks occur. This may mislead users into thinking that opentelemetry-java-instrumentation itself is causing the leak. We are still investigating the details of this issue and would like to know whether the community has encountered similar problems. (I apologize for not finding a similar issue in this project.)
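For reference, the cardinality control mentioned above is implemented in Micrometer as a MeterFilter; a registry added without such a filter accepts unbounded tag values. A minimal sketch of applying the cap explicitly (the meter name is only an example, and the limit of 100 mirrors the maximumAllowableTags default discussed above):

```java
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.config.MeterFilter;
import io.micrometer.core.instrument.simple.SimpleMeterRegistry;

public class CardinalityCapExample {
    public static void main(String[] args) {
        MeterRegistry registry = new SimpleMeterRegistry();
        // Allow at most 100 distinct values of the "uri" tag on meters
        // whose name starts with "http.client.requests"; once the limit
        // is reached, further time series are denied instead of created.
        registry.config().meterFilter(
                MeterFilter.maximumAllowableTags(
                        "http.client.requests", "uri", 100, MeterFilter.deny()));
    }
}
```

If the registry added by the agent's spring-actuator instrumentation does not inherit filters like this from the user's configuration, that would explain the unbounded growth we observed.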

About the Support Plan of Spring AI

Currently, in the OpenTelemetry Java projects (opentelemetry-java-instrumentation, opentelemetry-java, and opentelemetry-java-contrib), I have seen no discussion of the Spring AI framework. In the long term, does the community plan to support this framework? Will a new instrumentation be provided, or will it rely on the library instrumentation within Spring AI, as we have done, despite it being based on micrometer-tracing?

Demo

To give you a little more context, I have created a repository that contains a simple Spring AI application demo.

Note that before running the demo, you need to obtain an API key from an LLM provider, such as OpenAI or DashScope.

Additional Context

Spring AI and its Observability: https://docs.spring.io/spring-ai/reference/observability/index.html
