fix: update docs for clickhouse exporter #37101

Closed
wants to merge 1 commit into from


garysassano

I've built a fresh app with the following collector confmap:

receivers:
  otlp:
    protocols:
      http:
        endpoint: 0.0.0.0:4318

exporters:
  clickhouse:
    endpoint: https://XXX.eu-central-1.aws.clickhouse.cloud:8443
    username: default
    password: XXX

extensions:
  health_check:
    endpoint: 0.0.0.0:13133

service:
  extensions: [health_check]
  pipelines:
    traces:
      receivers: [otlp]
      exporters: [clickhouse]

Using the table inspector on ClickHouse Cloud returned the following schema for the autogenerated otel_traces table:

CREATE TABLE default.otel_traces
(
    `Timestamp` DateTime64(9) CODEC(Delta(8), ZSTD(1)),
    `TraceId` String CODEC(ZSTD(1)),
    `SpanId` String CODEC(ZSTD(1)),
    `ParentSpanId` String CODEC(ZSTD(1)),
    `TraceState` String CODEC(ZSTD(1)),
    `SpanName` LowCardinality(String) CODEC(ZSTD(1)),
    `SpanKind` LowCardinality(String) CODEC(ZSTD(1)),
    `ServiceName` LowCardinality(String) CODEC(ZSTD(1)),
    `ResourceAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `ScopeName` String CODEC(ZSTD(1)),
    `ScopeVersion` String CODEC(ZSTD(1)),
    `SpanAttributes` Map(LowCardinality(String), String) CODEC(ZSTD(1)),
    `Duration` UInt64 CODEC(ZSTD(1)),
    `StatusCode` LowCardinality(String) CODEC(ZSTD(1)),
    `StatusMessage` String CODEC(ZSTD(1)),
    `Events.Timestamp` Array(DateTime64(9)) CODEC(ZSTD(1)),
    `Events.Name` Array(LowCardinality(String)) CODEC(ZSTD(1)),
    `Events.Attributes` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)),
    `Links.TraceId` Array(String) CODEC(ZSTD(1)),
    `Links.SpanId` Array(String) CODEC(ZSTD(1)),
    `Links.TraceState` Array(String) CODEC(ZSTD(1)),
    `Links.Attributes` Array(Map(LowCardinality(String), String)) CODEC(ZSTD(1)),
    INDEX idx_trace_id TraceId TYPE bloom_filter(0.001) GRANULARITY 1,
    INDEX idx_res_attr_key mapKeys(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_res_attr_value mapValues(ResourceAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_span_attr_key mapKeys(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_span_attr_value mapValues(SpanAttributes) TYPE bloom_filter(0.01) GRANULARITY 1,
    INDEX idx_duration Duration TYPE minmax GRANULARITY 1
)
ENGINE = SharedMergeTree('/clickhouse/tables/{uuid}/{shard}', '{replica}')
PARTITION BY toDate(Timestamp)
ORDER BY (ServiceName, SpanName, toDateTime(Timestamp))
SETTINGS index_granularity = 8192, ttl_only_drop_parts = 1

As you can see, by default the clickhouse exporter uses zstd as the compression algorithm when the compress key isn't set.

@SpencerTorres
Member

Hey thanks for submitting this PR! I think there may be a misunderstanding of this setting.

In this case there are two contexts for compression: the first is client compression and the second is column compression.

Client Compression

Client compression refers to the compression that is used when transferring data over the client/server connection. This is the setting that is referenced in your PR. In the ClickHouse exporter code, this is in fact set by default to be lz4 as seen here:

if !queryParams.Has("compress") && (cfg.Compress == "" || cfg.Compress == "true") {
	queryParams.Set("compress", "lz4")
} else if !queryParams.Has("compress") {
	queryParams.Set("compress", cfg.Compress)
}

This is what the exporter sets for the underlying clickhouse-go client's parseDSN function. Inside parseDSN there is even more logic to control this parameter as seen here:

case "compress":
	if on, _ := strconv.ParseBool(params.Get(v)); on {
		if o.Compression == nil {
			o.Compression = &Compression{}
		}

		o.Compression.Method = CompressionLZ4
		continue
	}
	if compressMethod, ok := compressionMap[params.Get(v)]; ok {
		if o.Compression == nil {
			o.Compression = &Compression{
				// default for now same as Clickhouse - https://clickhouse.com/docs/en/operations/settings/settings#settings-http_zlib_compression_level
				Level: 3,
			}
		}

		o.Compression.Method = compressMethod
	}

This logic allows enabling the default compression by simply passing compress=1, and also allows selecting a compression method by name (such as lz4 or zstd). The best compression method depends on what format you're returning from the server. In our docs these are called "compression modes". As you can see on this page, LZ4 is recommended as the default.
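
The two branches of that parsing logic can be sketched with the standard library alone. This is a simplified illustration, not the client's code: `resolveCompress` is a hypothetical helper, and the `compressionMap` below is a stand-in that lists only the two method names mentioned in this thread.

```go
package main

import (
	"fmt"
	"strconv"
)

// compressionMap is a stand-in for the client's name-to-method table.
// Only the two names mentioned in this thread are listed (assumption).
var compressionMap = map[string]string{
	"lz4":  "lz4",
	"zstd": "zstd",
}

// resolveCompress is a hypothetical helper that follows the same order as
// the excerpt: a boolean-true value selects LZ4, otherwise the value is
// looked up as a method name. The second result reports whether any
// compression method was selected.
func resolveCompress(value string) (string, bool) {
	if on, _ := strconv.ParseBool(value); on {
		return "lz4", true
	}
	if method, ok := compressionMap[value]; ok {
		return method, true
	}
	return "", false
}

func main() {
	for _, v := range []string{"1", "true", "zstd", "0"} {
		method, enabled := resolveCompress(v)
		fmt.Printf("compress=%s -> method=%q enabled=%v\n", v, method, enabled)
	}
}
```

Note that `strconv.ParseBool` returns false for a name like `zstd`, so named methods always fall through to the map lookup.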

Column Compression

The other type of compression is column compression. This refers to ClickHouse's ability to compress individual columns separately with different algorithms. This is defined in the DDL for the table. In the exporter, it is defined here:

createTracesTableSQL = `
	CREATE TABLE IF NOT EXISTS %s %s (
		Timestamp DateTime64(9) CODEC(Delta, ZSTD(1)),
		TraceId String CODEC(ZSTD(1)),
		SpanId String CODEC(ZSTD(1)),

. . .

As you can see, the default trace table uses ZSTD. This is not correlated with the client's transfer compression setting. For more details on column compression, you can read our docs page for Compression in ClickHouse. I recommend modifying these default tables to fit your data.
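
As a rough illustration of where column codecs live, the sketch below builds column definitions with a per-column codec, independent of any connection setting. The `column` type and `columnsDDL` helper are hypothetical and not part of the exporter, and switching `SpanId` to `LZ4` is only an example of a per-column change, not a recommendation.

```go
package main

import (
	"fmt"
	"strings"
)

// column describes one table column and its storage codec.
// This type is hypothetical and exists only for illustration.
type column struct {
	Name  string
	Type  string
	Codec string
}

// ddl renders a single column definition with its CODEC clause.
func (c column) ddl() string {
	return fmt.Sprintf("%s %s CODEC(%s)", c.Name, c.Type, c.Codec)
}

// columnsDDL joins the column definitions the way they would appear
// inside a CREATE TABLE statement.
func columnsDDL(cols []column) string {
	parts := make([]string, len(cols))
	for i, c := range cols {
		parts[i] = c.ddl()
	}
	return strings.Join(parts, ",\n")
}

func main() {
	cols := []column{
		{Name: "Timestamp", Type: "DateTime64(9)", Codec: "Delta, ZSTD(1)"},
		{Name: "TraceId", Type: "String", Codec: "ZSTD(1)"},
		{Name: "SpanId", Type: "String", Codec: "LZ4"}, // example change only
	}
	fmt.Println(columnsDDL(cols))
}
```

Nothing in this DDL depends on the `compress` value used for the client connection, which is the distinction drawn above.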

The docs can be a bit confusing since there's not much distinction between the two; both are just called "compression". With all of this said though, I believe the documentation is correct as-is.

@garysassano garysassano closed this Jan 8, 2025