OpenTelemetry Protocol with Apache Arrow Receiver initial skeleton (#30766)

First PR to introduce the OpenTelemetry Protocol with Apache Arrow receiver. From the upstream repository: https://github.com/open-telemetry/otel-arrow/blob/main/collector/receiver/otelarrowreceiver. Similar to #30619 for the corresponding receiver.

**Link to tracking Issue:** #26491

**Testing:** This is a skeleton PR, therefore only the skeleton contains tests. Compared with the upstream repository, the factory_test.go and config_test.go files have been kept, the implementation tests in otelarrow_test.go were excluded in this PR.

**Documentation:** New README, [copied from the upstream repository](https://github.com/open-telemetry/otel-arrow/blob/main/collector/receiver/otelarrowreceiver/README.md).

---------

Signed-off-by: Joshua MacDonald <josh.macdonald@gmail.com>
Showing 30 changed files with 1,849 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,6 @@ | ||
change_type: new_component | ||
component: otelarrow | ||
note: Skeleton of new OpenTelemetry Protocol with Apache Arrow Receiver | ||
issues: [26491] | ||
subtext: | ||
change_logs: [user] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1 @@ | ||
include ../../Makefile.Common |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,198 @@ | ||
# OpenTelemetry Protocol with Apache Arrow Receiver | ||
|
||
<!-- status autogenerated section --> | ||
| Status | | | ||
| ------------- |-----------| | ||
| Stability | [development]: traces, metrics, logs | | ||
| Distributions | [contrib] | | ||
| Issues | [![Open issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aopen%20label%3Areceiver%2Fotelarrow%20&label=open&color=orange&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aopen+is%3Aissue+label%3Areceiver%2Fotelarrow) [![Closed issues](https://img.shields.io/github/issues-search/open-telemetry/opentelemetry-collector-contrib?query=is%3Aissue%20is%3Aclosed%20label%3Areceiver%2Fotelarrow%20&label=closed&color=blue&logo=opentelemetry)](https://github.com/open-telemetry/opentelemetry-collector-contrib/issues?q=is%3Aclosed+is%3Aissue+label%3Areceiver%2Fotelarrow) | | ||
| [Code Owners](https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/CONTRIBUTING.md#becoming-a-code-owner) | [@jmacd](https://www.github.com/jmacd), [@moh-osman3](https://www.github.com/moh-osman3) | | ||
|
||
[development]: https://github.com/open-telemetry/opentelemetry-collector#development | ||
[contrib]: https://github.com/open-telemetry/opentelemetry-collector-releases/tree/main/distributions/otelcol-contrib | ||
<!-- end autogenerated section --> | ||
|
||
Receives telemetry data using [OpenTelemetry Protocol with Apache | ||
Arrow](https://github.com/open-telemetry/otel-arrow) and standard | ||
[OTLP]( | ||
https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md) | ||
protocol via gRPC. | ||
|
||
## Getting Started | ||
|
||
The OpenTelemetry Protocol with Apache Arrow receiver is an extension | ||
of the core OpenTelemetry Collector [OTLP | ||
receiver](https://github.com/open-telemetry/opentelemetry-collector/tree/main/receiver/otlpreceiver) | ||
component with additional support for [OpenTelemetry Protocol with | ||
Apache Arrow](https://github.com/open-telemetry/otel-arrow). | ||
|
||
OpenTelemetry Protocol with Apache Arrow supports column-oriented data | ||
transport using the Apache Arrow data format. The [OpenTelemetry | ||
Protocol with Apache Arrow | ||
exporter](../../exporter/otelarrowexporter/README.md) | ||
converts OTLP data into an optimized representation and then sends | ||
batches of data using Apache Arrow to encode the stream. This | ||
component contains logic to reverse the process used in the | ||
OpenTelemetry Protocol with Apache Arrow exporter. | ||
|
||
The use of an OpenTelemetry Protocol with Apache Arrow | ||
exporter-receiver pair is recommended when the network is expensive. | ||
Typically, expect to see a 50% reduction in bandwidth compared with | ||
the same data being sent using standard OTLP/gRPC and gzip | ||
compression. | ||
|
||
This component includes all the features and configuration of the core | ||
OTLP receiver, making it possible to upgrade from the core component | ||
simply by replacing "otlp" with "otelarrow" as the component name in | ||
the collector configuration. | ||
|
||
To enable the OpenTelemetry Protocol with Apache Arrow receiver, | ||
include it in the list of receivers for a pipeline. No further | ||
configuration is needed. This receiver listens on the standard | ||
OTLP/gRPC port 4317 and serves standard OTLP over gRPC out of the box. | ||
|
||
```yaml | ||
receivers: | |
  otelarrow: | |
``` | ||
## Advanced Configuration | ||
Users may wish to configure gRPC settings, for example: | ||
``` | ||
receivers: | |
  otelarrow: | |
    protocols: | |
      grpc: | |
        ... | |
``` | ||
- `endpoint` (default = 0.0.0.0:4317 for grpc protocol): | ||
host:port on which the receiver listens for data. The valid syntax is | |
described at https://github.com/grpc/grpc/blob/master/doc/naming.md. | ||
|
||
Several common configuration structures provide additional capabilities automatically: | ||
|
||
- [gRPC settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configgrpc/README.md) | ||
- [TLS and mTLS settings](https://github.com/open-telemetry/opentelemetry-collector/blob/main/config/configtls/README.md) | ||
|
||
### Arrow-specific Configuration | ||
|
||
In the `arrow` configuration block, the following settings are available: | ||
|
||
- `memory_limit_mib` (default: 128): limits the amount of concurrent memory used by Arrow data buffers. | ||
|
||
When the limit is reached, the receiver will return RESOURCE_EXHAUSTED | |
error codes to the exporter, which are [conditionally retryable, see | |
exporter retry configuration](https://github.com/open-telemetry/opentelemetry-collector/blob/main/exporter/exporterhelper/README.md). | ||
|
||
### Compression Configuration | ||
|
||
In the `arrow` configuration block, the `zstd` sub-section applies to all | |
compression levels used by exporters: | ||
|
||
- `memory_limit_mib`: limits memory dedicated to Zstd decompression, per stream (default 128) | |
- `max_window_size_mib`: maximum size of the Zstd window in MiB, 0 indicates to determine based on level (default 32) | ||
- `concurrency`: controls background CPU used for decompression, 0 indicates to let `zstd` library decide (default 1) | ||
|
||
### Keepalive configuration | ||
|
||
As a gRPC streaming service, the OTel Arrow receiver is able to limit | ||
stream lifetime through configuration of the underlying http/2 | ||
connection via keepalive settings. | ||
|
||
Keepalive settings are vital to the operation of OTel Arrow, because | ||
longer-lived streams use more memory and streams are fixed to a single | ||
host. Since every stream of data is different, we recommend | ||
experimenting to find a good balance between memory usage, stream | ||
lifetime, and load balance. | ||
|
||
gRPC libraries do not have a built-in facility for long-lived RPCs to learn | |
about impending http/2 connection state changes, including the event | ||
that initiates connection reset. While the receiver knows its own | ||
keepalive settings, a shorter maximum connection lifetime can be | ||
imposed by intermediate http/2 proxies, and therefore the receiver and | ||
exporter are expected to independently configure these limits. | ||
|
||
``` | ||
receivers: | |
  otelarrow: | |
    protocols: | |
      grpc: | |
        keepalive: | |
          server_parameters: | |
            max_connection_age: 1m | |
            max_connection_age_grace: 10m | |
``` | ||
In the example configuration above, OpenTelemetry Protocol with Apache | ||
Arrow streams will have reset initiated after 10 minutes. Note that | ||
`max_connection_age` is set to a small value and we recommend tuning | ||
`max_connection_age_grace`. | ||
OTel Arrow exporters are expected to configure their | ||
`max_stream_lifetime` property to a value that is slightly smaller | ||
than the receiver's `max_connection_age_grace` setting, which causes | ||
the exporter to cleanly shut down streams, allowing requests to | ||
complete before the http/2 connection is forcibly closed. While the | ||
exporter will retry data that was in-flight during an unexpected | ||
stream shutdown, instrumentation about the telemetry pipeline will show | |
RPC errors when the exporter's `max_stream_lifetime` is not configured | ||
correctly. | ||
[See the exporter README for more | ||
guidance](../../exporter/otelarrowexporter/README.md). For the | ||
example where `max_connection_age_grace` is set to 10 minutes, the | ||
exporter's `max_stream_lifetime` should be set to the same number | ||
minus a reasonable timeout to allow in-flight requests to complete. | ||
For example, an exporter with `9m30s` stream lifetime: | ||
``` | ||
exporters: | |
  otelarrow: | |
    timeout: 30s | |
    arrow: | |
      max_stream_lifetime: 9m30s | |
    endpoint: ... | |
    tls: ... | |
``` | ||
### Receiver metrics | ||
In addition to the standard | |
[obsreport](https://pkg.go.dev/go.opentelemetry.io/collector/obsreport) | ||
metrics, this component provides network-level measurement instruments | ||
which we anticipate will become part of `obsreport` in the future. At | ||
the `normal` level of metrics detail: | ||
- `receiver_recv`: uncompressed bytes received, prior to compression | ||
- `receiver_recv_wire`: compressed bytes received, on the wire. | ||
Arrow's compression performance can be derived by dividing the average | ||
`receiver_recv` value by the average `receiver_recv_wire` value. | ||
At the `detailed` metrics detail level, information about the stream | ||
of data being returned from the receiver will be instrumented: | ||
- `receiver_sent`: uncompressed bytes sent, prior to compression | ||
- `receiver_sent_wire`: compressed bytes sent, on the wire. | ||
There are several OpenTelemetry Protocol with Apache Arrow-consumer | |
related metrics available to help diagnose internal performance. | ||
These are disabled at the basic level of detail. At the normal level, | ||
these metrics are introduced: | ||
- `arrow_batch_records`: Counter of Arrow-IPC records processed | ||
- `arrow_memory_inuse`: UpDownCounter of memory in use by current streams | ||
- `arrow_schema_resets`: Counter of times the schema was adjusted, by data type. | ||
``` | ||
service: | |
  ... | |
  telemetry: | |
    ... | |
    metrics: | |
      ... | |
      level: detailed | |
``` |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,52 @@ | ||
// Copyright The OpenTelemetry Authors | ||
// SPDX-License-Identifier: Apache-2.0 | ||
|
||
package otelarrowreceiver // import "github.com/open-telemetry/opentelemetry-collector-contrib/receiver/otelarrowreceiver" | ||
|
||
import ( | ||
	"fmt" | |
|
||
	"github.com/open-telemetry/otel-arrow/collector/compression/zstd" | |
	"go.opentelemetry.io/collector/component" | |
	"go.opentelemetry.io/collector/config/configgrpc" | |
) | ||
|
||
// Protocols is the configuration for the supported protocols. | ||
type Protocols struct { | ||
	GRPC  configgrpc.GRPCServerSettings `mapstructure:"grpc"` | |
	Arrow ArrowSettings                 `mapstructure:"arrow"` | |
} | ||
|
||
// ArrowSettings support configuring the Arrow receiver. | ||
type ArrowSettings struct { | ||
	// MemoryLimitMiB is the size of a shared memory region used | |
	// by all Arrow streams, in MiB. When too much load is | |
	// passing through, they will see ResourceExhausted errors. | |
	MemoryLimitMiB uint64 `mapstructure:"memory_limit_mib"` | |
|
||
	// Zstd settings apply to OTel-Arrow use of gRPC specifically. | |
	Zstd zstd.DecoderConfig `mapstructure:"zstd"` | |
} | ||
|
||
// Config defines configuration for OTel Arrow receiver. | ||
type Config struct { | ||
	// Protocols is the configuration for gRPC and Arrow. | |
	Protocols `mapstructure:"protocols"` | |
} | ||
|
||
var _ component.Config = (*Config)(nil) | ||
|
||
// Validate checks the receiver configuration is valid | ||
func (cfg *Config) Validate() error { | ||
	if err := cfg.Arrow.Validate(); err != nil { | |
		return err | |
	} | |
	return nil | |
} | ||
|
||
// Validate checks the Arrow settings, including the Zstd decoder configuration. | |
func (cfg *ArrowSettings) Validate() error { | |
	if err := cfg.Zstd.Validate(); err != nil { | |
		return fmt.Errorf("zstd decoder: invalid configuration: %w", err) | |
	} | |
	return nil | |
} |