Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[elasticsearchexporter]: Add dynamic document id support for logs #37065

Merged
merged 24 commits into from
Jan 23, 2025

Conversation

mauri870
Copy link
Contributor

@mauri870 mauri870 commented Jan 7, 2025

Description

This PR adds a new config option logs_dynamic_id that when set to true reads the elasticsearch.document_id attribute from each log record and uses it as the final document id in Elasticsearch. This is only implemented for logs but I can open subsequent PRs supporting metrics and traces akin to the *_dynamic_index options.

Fixes #36882

Testing

Added tests to verify that the document ID attribute can be read from the log record and that the _id is properly forwarded to Elasticsearch. Also asserted that when there is no doc id attribute the current behavior is retained.

Documentation

Updated the readme to mention the new logs_dynamic_id config option.

@mauri870 mauri870 requested a review from a team as a code owner January 7, 2025 13:45
@mauri870 mauri870 requested a review from djaglowski January 7, 2025 13:45
@mauri870 mauri870 changed the title [elasticsearchexporter]: Add support for for setting a document id for logs [elasticsearchexporter]: Add dynamic document id support for logs Jan 7, 2025
@mauri870 mauri870 marked this pull request as draft January 8, 2025 11:10
@mauri870 mauri870 marked this pull request as ready for review January 8, 2025 12:13
Copy link
Contributor

@carsonip carsonip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good at a high level, thanks!

.chloggen/elasticsearchexporter_logs_dynamic_id.yaml Outdated Show resolved Hide resolved
exporter/elasticsearchexporter/exporter.go Outdated Show resolved Hide resolved
Copy link
Contributor

@carsonip carsonip left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Q: do you expect the elasticsearch.document_id to be indexed under attributes as well, or should it be ignored during serialization?

@mauri870
Copy link
Contributor Author

mauri870 commented Jan 9, 2025

Q: do you expect the elasticsearch.document_id to be indexed under attributes as well, or should it be ignored during serialization?

Haven't thought of that, but I think it makes sense to remove the field, as it only has meaning for the exporter.

Also, elasticsearch.document_id is overly specific, I'm happy to hear suggestions on the naming.

@mauri870
Copy link
Contributor Author

mauri870 commented Jan 9, 2025

I updated the code to remove the id field from the final document.

Regarding the attribute name, I wonder if we should stick to some kind of semantic convention such as https://opentelemetry.io/docs/specs/semconv/database/elasticsearch/. I scrolled throught it but couldn't find an existing attribute for a document id.

exporter/elasticsearchexporter/exporter.go Outdated Show resolved Hide resolved
exporter/elasticsearchexporter/config.go Outdated Show resolved Hide resolved
exporter/elasticsearchexporter/config.go Outdated Show resolved Hide resolved
exporter/elasticsearchexporter/README.md Outdated Show resolved Hide resolved
.chloggen/elasticsearchexporter_logs_dynamic_id.yaml Outdated Show resolved Hide resolved
mauri870 and others added 3 commits January 10, 2025 13:48
Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
Co-authored-by: Carson Ip <carsonip@users.noreply.github.com>
@mauri870 mauri870 requested review from VihasMakwana and axw January 14, 2025 14:00
Copy link
Contributor

@axw axw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, just a couple of nits

exporter/elasticsearchexporter/bulkindexer.go Outdated Show resolved Hide resolved
exporter/elasticsearchexporter/bulkindexer.go Outdated Show resolved Hide resolved
Copy link

codecov bot commented Jan 16, 2025

Codecov Report

Attention: Patch coverage is 91.66667% with 2 lines in your changes missing coverage. Please review.

Project coverage is 79.57%. Comparing base (37c8044) to head (490e747).
Report is 32 commits behind head on main.

Files with missing lines Patch % Lines
exporter/elasticsearchexporter/exporter.go 87.50% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main   #37065   +/-   ##
=======================================
  Coverage   79.57%   79.57%           
=======================================
  Files        2271     2271           
  Lines      212855   212872   +17     
=======================================
+ Hits       169372   169392   +20     
- Misses      37794    37797    +3     
+ Partials     5689     5683    -6     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@ChrsMark ChrsMark added the ready to merge Code review completed; ready to merge by maintainers label Jan 21, 2025
@songy23
Copy link
Member

songy23 commented Jan 21, 2025

needs a rebase

@songy23 songy23 removed the ready to merge Code review completed; ready to merge by maintainers label Jan 21, 2025
@mauri870
Copy link
Contributor Author

Interesting, I fixed the conflicts but now the tests fail with panic: invalid access to shared data. This seems to be due to m.RemoveIf here.

I believe this is because a PR yesterday changed MutatesData to false #37234.

What should be the course of action here? The idea of the attribute defined here is that we remove it from the final document.

@ChrsMark
Copy link
Member

What should be the course of action here? The idea of the attribute defined here is that we remove it from the final document.

@felixbarny any suggestion here?

@felixbarny
Copy link
Contributor

Instead of removing the attribute from the original LogRecord, just skip the attribute when serializing it.

attributes.Range(func(k string, val pcommon.Value) bool {
switch k {
case dataStreamType, dataStreamDataset, dataStreamNamespace, mappingHintsAttrKey:
return true
}
if isGeoAttribute(k, val) {
return true
}
_ = v.OnKey(k)
writeValue(v, val, stringifyMapValues)
return true
})

IIUC, this is mostly relevant for the bodymap mapping mode which doesn't add attributes to the document anyway.

@mauri870 mauri870 force-pushed the elasticsearchexporter-id branch from 90a3f3c to 0a34926 Compare January 22, 2025 17:30
@mauri870
Copy link
Contributor Author

Instead of removing the attribute from the original LogRecord, just skip the attribute when serializing it.

That is clever, worked like a charm. Thanks!

@mauri870
Copy link
Contributor Author

CI failure seems to be unrelated with my changes.

@songy23 songy23 merged commit 4542bbf into open-telemetry:main Jan 23, 2025
164 checks passed
@github-actions github-actions bot added this to the next release milestone Jan 23, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[exporter/elasticsearch] Ability to specify the document ID for logs
8 participants