Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[receiver/awsfirehose] Implement encoding extension framework #37113

Closed
axw opened this issue Jan 9, 2025 · 11 comments · Fixed by #37262
Closed

[receiver/awsfirehose] Implement encoding extension framework #37113

axw opened this issue Jan 9, 2025 · 11 comments · Fixed by #37262
Labels

Comments

@axw
Copy link
Contributor

axw commented Jan 9, 2025

Component(s)

receiver/awsfirehose

Is your feature request related to a problem? Please describe.

awsfirehosereceiver has a limited set of baked-in unmarshallers: cwlogs (CloudWatch logs), cwmetrics (CloudWatch metric streams JSON format), otlp_v1 (CloudWatch metric streams OTLP 1.0 format).

Firehose itself does not know or care about the format of data records. In theory you could write OTLP logs to a delivery stream using https://docs.aws.amazon.com/firehose/latest/APIReference/API_PutRecord.html, but there is no way for the receiver to unmarshal this. OTLP is just an example: one might produce arbitrary observability data (or other types of data, but out of scope here) to Firehose, which also cannot be decoded without additional unmarshallers.

The unmarshallers within the awsfirehosereceiver module may also be useful for other receivers, e.g. in the S3 receiver for decoding data sent by Firehose to an S3 bucket. Since they are internal to the module, they cannot be reused.

Describe the solution you'd like

I would like to enhance the receiver to implement the encoding extension framework, so that users may inject additional encodings. I have created an untested PoC of this at axw@1904aa5. In that PoC I have left the existing unmarshallers in the receiver, but I would envisage them eventually moving to dedicated extension components.

Describe alternatives you've considered

No response

Additional context

No response

@axw axw added enhancement New feature or request needs triage New item requiring triage labels Jan 9, 2025
Copy link
Contributor

github-actions bot commented Jan 9, 2025

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@axw
Copy link
Contributor Author

axw commented Jan 9, 2025

If this direction is acceptable, I would be happy to polish and contribute the branch I referenced above.

@axw axw changed the title Update unmarshalers to implement [receiver/awsfirehose] Implement encoding extension framework Jan 9, 2025
@VihasMakwana
Copy link
Contributor

This sounds like a valid request to me.

FWIW, there are many components that use custom unmarshalers/marshalers. Now that we have encoding extension, we should do this for all such components.

@VihasMakwana VihasMakwana removed the needs triage New item requiring triage label Jan 9, 2025
@axw
Copy link
Contributor Author

axw commented Jan 13, 2025

/label waiting-for-code-owners

@MovieStoreGuy
Copy link
Contributor

Not sure for Firehose strictly, however the extension paths makes sense to me and embracing that adoption.

@Aneurysm9
Copy link
Member

This seems reasonable to me. I might have some concerns about the implementation, but I'll defer those until it's out of draft.

@axw
Copy link
Contributor Author

axw commented Jan 16, 2025

Thanks for the feedback, I'll clean up the branch and open a PR soon.

@axw
Copy link
Contributor Author

axw commented Jan 16, 2025

/label -waiting-for-code-owners

@axw
Copy link
Contributor Author

axw commented Jan 16, 2025

@Aneurysm9 seeing as the receiver has alpha stability, there's technically an option to break config straight away. Would you prefer:

  1. Wait for Add cloudwatch encoding #37222 and then immediately remove the built-in unmarshallers and record_type config, adding an encoding config that supports only extensions
  2. Provide backwards compatibility and remove the built-in unmarshallers later on once 37222 has baked: [receiver/awsfirehose] Add support for encoding extensions #37262

@Aneurysm9
Copy link
Member

Regardless of stability level my preference is to not make unnecessary breaking changes and to do as much as possible to both ease the transition and provide as much notice and time for users to adapt as possible. I don't think there's any reason not to follow the standard breaking changes process here.

@axw
Copy link
Contributor Author

axw commented Jan 17, 2025

@Aneurysm9 sounds good, thanks - that's my preference too. #37262 maintains backwards compatibility but deprecates the old config, so it's ready for review.

mx-psi pushed a commit that referenced this issue Feb 10, 2025
…37361)

#### Description

Refactor unmarshallers to fit into the encoding framework.
    
The internal unmarshallers now implement `plog.Unmarshaler` and
`pmetric.Unmarshaler`. This will enable us to use encoding extensions in
a followup, and will enable us to extract the unmarshallers later as
encoding extensions.
    
As a result of the interface change, the unmarshallers now unmarshal a
single record at a time, which means we cannot merge resources/metrics
as we go, and only within each record. This impacts performance, so to
offset that we implement various optimisations:

  - Use json-iterator for decoding JSON
  - Use klauspost/compress for decompressing gzip
  - Pool gzip readers
  - Remove pointer type from cwMetricValue to avoid allocation
  - Don't read the whole request body into memory
  - Reuse buffer for decoding base64; decode as we go

There are more optimisations we can make to reduce memory allocations,
e.g. avoid reflection when decoding JSON.

There's a fix for a subtle bug in the cwmetrics unmarshaller where the
unit of a metric was not considered part of its identity, and so two
metrics that differed only by unit would be merged.

#### Link to tracking issue

Preparation for #37113

#### Testing

- Added tests for consuming requests containing multiple records.
- Added benchmarks for the full request/response flow, for cwlogs and
cwmetrics record types. There's an increase in memory usage for
cwmetrics unmarshalling, and decrease for cwlogs unmarshalling. CPU
usage is better across the board.

```
goos: linux
goarch: amd64
pkg: github.com/open-telemetry/opentelemetry-collector-contrib/receiver/awsfirehosereceiver
cpu: AMD Ryzen 7 PRO 5850U with Radeon Graphics     
                                                              │ /tmp/old.txt  │            /tmp/new.txt             │
                                                              │    sec/op     │    sec/op     vs base               │
LogsConsumer_cwlogs/10resources_10records_1logs-16              142.03µ ±  8%   75.85µ ± 16%  -46.60% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_10records_10logs-16              413.1µ ±  5%   208.0µ ± 23%  -49.65% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_1logs-16             1435.8µ ±  8%   728.0µ ± 11%  -49.30% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_10logs-16             4.200m ±  2%   1.875m ±  4%  -55.34% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_1metrics-16     104.30µ ± 16%   70.26µ ± 12%  -32.64% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_10metrics-16     821.7µ ±  7%   505.5µ ±  5%  -38.48% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_1metrics-16     780.2µ ±  5%   647.0µ ±  8%  -17.07% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_10metrics-16    7.823m ±  5%   5.297m ± 11%  -32.29% (p=0.002 n=6)
geomean                                                          809.9µ         475.7µ        -41.26%

                                                              │ /tmp/old.txt  │             /tmp/new.txt             │
                                                              │     B/op      │     B/op      vs base                │
LogsConsumer_cwlogs/10resources_10records_1logs-16              437.78Ki ± 0%   61.70Ki ± 1%   -85.91% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_10records_10logs-16              533.7Ki ± 0%   124.9Ki ± 2%   -76.60% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_1logs-16             4319.1Ki ± 0%   550.1Ki ± 0%   -87.26% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_10logs-16             5.167Mi ± 0%   1.172Mi ± 1%   -77.31% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_1metrics-16      47.88Ki ± 5%   83.71Ki ± 0%   +74.84% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_10metrics-16     390.0Ki ± 1%   415.4Ki ± 1%    +6.51% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_1metrics-16     358.9Ki ± 0%   772.9Ki ± 0%  +115.37% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_10metrics-16    3.730Mi ± 0%   3.962Mi ± 0%    +6.23% (p=0.002 n=6)
geomean                                                          779.7Ki        391.8Ki        -49.76%

                                                              │ /tmp/old.txt │            /tmp/new.txt            │
                                                              │  allocs/op   │  allocs/op   vs base               │
LogsConsumer_cwlogs/10resources_10records_1logs-16                406.0 ± 2%    348.0 ± 2%  -14.29% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_10records_10logs-16              2.025k ± 0%   1.971k ± 1%   -2.64% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_1logs-16              2.922k ± 0%   3.094k ± 0%   +5.89% (p=0.002 n=6)
LogsConsumer_cwlogs/10resources_100records_10logs-16             18.46k ± 0%   19.46k ± 1%   +5.45% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_1metrics-16       548.5 ± 8%    653.0 ± 0%  +19.05% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_10records_10metrics-16     3.795k ± 3%   4.729k ± 1%  +24.63% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_1metrics-16     3.281k ± 0%   6.161k ± 0%  +87.78% (p=0.002 n=6)
MetricsConsumer_cwmetrics/10resources_100records_10metrics-16    28.02k ± 0%   46.90k ± 0%  +67.39% (p=0.002 n=6)
geomean                                                          3.098k        3.722k       +20.16%
```

#### Documentation

N/A

---------

Co-authored-by: Anthony Mirabella <a9@aneurysm9.com>
@mx-psi mx-psi closed this as completed in 81c5eee Feb 27, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants