Releases: cortexproject/cortex
Cortex 1.14.0
This release contains 115 contributions from 28 contributors. Thank you!
Some notable changes in this release are:
- Remove support for chunks storage
- Experimental support for vertical query sharding
- Always enable the PromQL `@` modifier and negative offset
- Added configurations for Azure MSI in blocks-storage
- New limits (Querier/QueryFrontend)
- OpenTelemetry Bridge for Tracing
- Multiple performance improvements and bug fixes
Cortex
- [CHANGE] Remove support for chunks storage entirely. If you are using chunks storage on a previous version, you must migrate your data on version 1.12 or earlier. Before upgrading to this release, you should also remove any deprecated chunks-related configuration, as this release will no longer accept that. The following flags are gone:
  - `-dynamodb.*`
  - `-metrics.*`
  - `-s3.*`
  - `-azure.*`
  - `-bigtable.*`
  - `-gcs.*`
  - `-cassandra.*`
  - `-boltdb.*`
  - `-local.*`
  - some `-ingester` flags:
    - `-ingester.wal-enabled`
    - `-ingester.checkpoint-enabled`
    - `-ingester.recover-from-wal`
    - `-ingester.wal-dir`
    - `-ingester.checkpoint-duration`
    - `-ingester.flush-on-shutdown-with-wal-enabled`
    - `-ingester.max-transfer-retries`
    - `-ingester.max-samples-per-query`
    - `-ingester.min-chunk-length`
    - `-ingester.flush-period`
    - `-ingester.retain-period`
    - `-ingester.max-chunk-idle`
    - `-ingester.max-stale-chunk-idle`
    - `-ingester.flush-op-timeout`
    - `-ingester.max-chunk-age`
    - `-ingester.chunk-age-jitter`
    - `-ingester.concurrent-flushes`
    - `-ingester.spread-flushes`
  - `-store.*` except `-store.engine` and `-store.max-query-length`
  - `-store.query-chunk-limit` was deprecated and replaced by `-querier.max-fetched-chunks-per-query`
  - `-deletes.*`
  - `-grpc-store.*`
  - `-flusher.wal-dir`, `-flusher.concurrent-flushes`, `-flusher.flush-op-timeout`
- [CHANGE] Remove support for alertmanager and ruler legacy store configuration. Before upgrading, you need to convert your configuration to use the `alertmanager-storage` and `ruler-storage` configuration on the version that you're already running, then upgrade.
- [CHANGE] Disables TSDB isolation. #4825
- [CHANGE] Drops support for Prometheus 1.x rule format on configdb. #4826
- [CHANGE] Removes the `-ingester.stream-chunks-when-using-blocks` experimental flag and streams chunks by default when `querier.ingester-streaming` is enabled. #4864
- [CHANGE] Compactor: Added `cortex_compactor_runs_interrupted_total` to separate compaction interruptions from failures
- [CHANGE] Enable PromQL `@` modifier, negative offset always. #4927
- [CHANGE] Store-gateway: Add user label to `cortex_bucket_store_blocks_loaded` metric. #4918
- [CHANGE] AlertManager: include `status` label in `cortex_alertmanager_alerts_received_total`. #4907
- [FEATURE] Compactor: Added `-compactor.block-files-concurrency` allowing to configure number of go routines for download/upload block files during compaction. #4784
- [FEATURE] Compactor: Added `-compactor.blocks-fetch-concurrency` allowing to configure number of go routines for blocks during compaction. #4787
- [FEATURE] Compactor: Added configurations for Azure MSI in blocks-storage, ruler-storage and alertmanager-storage. #4818
- [FEATURE] Ruler: Add support to pass custom implementations of queryable and pusher. #4782
- [FEATURE] Create OpenTelemetry Bridge for Tracing. Now cortex can send traces to multiple destinations using OTEL Collectors. #4834
- [FEATURE] Added `-api.http-request-headers-to-log` allowing for the addition of HTTP Headers to logs #4803
- [FEATURE] Distributor: Added a new limit `-validation.max-labels-size-bytes` allowing to limit the combined size of labels for each timeseries. #4848
- [FEATURE] Storage/Bucket: Added `-*.s3.bucket-lookup-type` allowing to configure the s3 bucket lookup type. #4794
- [FEATURE] QueryFrontend: Implement experimental vertical sharding at query frontend for range/instant queries. #4863
- [FEATURE] QueryFrontend: Support vertical sharding for subqueries. #4955
- [FEATURE] Querier: Added a new limit `-querier.max-fetched-data-bytes-per-query` allowing to limit the maximum size of all data in bytes that a query can fetch from each ingester and storage. #4854
- [FEATURE] Added 2 flags `-alertmanager.alertmanager-client.grpc-compression` and `-querier.store-gateway-client.grpc-compression` to configure compression methods for grpc clients. #4889
- [ENHANCEMENT] AlertManager: Retrying AlertManager Get Requests (Get Alertmanager status, Get Alertmanager Receivers) on next replica on error #4840
- [ENHANCEMENT] Querier/Ruler: Retry store-gateway in case of unexpected failure, instead of failing the query. #4532 #4839
- [ENHANCEMENT] Ring: DoBatch prioritize 4xx errors when failing. #4783
- [ENHANCEMENT] Cortex now built with Go 1.18. #4829
- [ENHANCEMENT] Ingester: Prevent ingesters from becoming unhealthy during WAL replay. #4847
- [ENHANCEMENT] Compactor: Introduced visit marker file for blocks so blocks under compaction will not be picked up by another compactor. #4805
- [ENHANCEMENT] Distributor: Add label name to labelValueTooLongError. #4855
- [ENHANCEMENT] Enhance traces with hostname information. #4898
- [ENHANCEMENT] Improve the documentation around limits. #4905
- [ENHANCEMENT] Distributor: cache user overrides to reduce lock contention. #4904
- [BUGFIX] Storage/Bucket: Enable AWS SDK for go authentication for s3 to fix IMDSv1 authentication. #4897
- [BUGFIX] Memberlist: Add join with no retrying when starting service. #4804
- [BUGFIX] Ruler: Fix /ruler/rule_groups returns YAML with extra fields. #4767
- [BUGFIX] Respecting `-tracing.otel.sample-ratio` configuration when enabling OpenTelemetry tracing with X-ray. #4862
- [BUGFIX] QueryFrontend: fixed query_range requests when query has `start` equal to `end`. #4877
- [BUGFIX] AlertManager: fixed issue introduced by #4495 where templates files were being deleted when using alertmanager local store. #4890
- [BUGFIX] Ingester: fixed incorrect logging at the start of ingester block shipping logic. #4934
- [BUGFIX] Storage/Bucket: fixed global mark missing on deletion. #4949
- [BUGFIX] QueryFrontend/Querier: fixed regression added by #4863 where we stopped compressing the response between querier and query frontend. #4960
- [BUGFIX] QueryFrontend/Querier: fixed response error to be ungzipped when status code is not 2xx. #4975
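Because 1.14.0 no longer accepts the chunks-storage flags listed above, it can help to scan the arguments a deployment passes to Cortex before upgrading. The following is a minimal sketch, not part of Cortex: the helper names (`find_removed_flags`, `flag_name`) and the constant names are hypothetical, and the flag and prefix lists are simply copied from the 1.14.0 notes. It assumes flags are passed as `-name=value` or `--name=value`.

```python
# Hypothetical pre-upgrade check (not shipped with Cortex): report arguments
# that use flags removed in 1.14.0. Lists are copied from the release notes.

# Whole flag families that were removed.
REMOVED_PREFIXES = (
    "-dynamodb.", "-metrics.", "-s3.", "-azure.", "-bigtable.", "-gcs.",
    "-cassandra.", "-boltdb.", "-local.", "-store.", "-deletes.",
    "-grpc-store.",
)

# -store.* is gone except for these two flags.
KEPT_FLAGS = {"-store.engine", "-store.max-query-length"}

# Individually removed -ingester and -flusher flags.
REMOVED_FLAGS = {
    "-ingester.wal-enabled", "-ingester.checkpoint-enabled",
    "-ingester.recover-from-wal", "-ingester.wal-dir",
    "-ingester.checkpoint-duration",
    "-ingester.flush-on-shutdown-with-wal-enabled",
    "-ingester.max-transfer-retries", "-ingester.max-samples-per-query",
    "-ingester.min-chunk-length", "-ingester.flush-period",
    "-ingester.retain-period", "-ingester.max-chunk-idle",
    "-ingester.max-stale-chunk-idle", "-ingester.flush-op-timeout",
    "-ingester.max-chunk-age", "-ingester.chunk-age-jitter",
    "-ingester.concurrent-flushes", "-ingester.spread-flushes",
    "-flusher.wal-dir", "-flusher.concurrent-flushes",
    "-flusher.flush-op-timeout",
}


def flag_name(arg):
    """Return the flag name of '-a.b=1' or '--a.b=1' as '-a.b'."""
    name = arg.split("=", 1)[0]
    if name.startswith("--"):
        name = name[1:]  # normalise the double-dash form to a single dash
    return name


def find_removed_flags(args):
    """Return the names of arguments that match a flag removed in 1.14.0."""
    removed = []
    for arg in args:
        if not arg.startswith("-"):
            continue  # positional value, not a flag
        name = flag_name(arg)
        if name in KEPT_FLAGS:
            continue
        if name in REMOVED_FLAGS or name.startswith(REMOVED_PREFIXES):
            removed.append(name)
    return removed
```

Matching on the leading prefix means that blocks-storage flags such as `-blocks-storage.s3.*` are not reported, since only the top-level `-s3.*` family was removed.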
Cortex 1.14.0-rc.1
Updates v1.14.0-rc.0 to include the bugfix where query responses with a status code other than 2xx were not being ungzipped.
- [BUGFIX] QueryFrontend/Querier: fixed response error to be ungzipped when status code is not 2xx. #4975
Cortex 1.14.0-rc.0
This release contains 114 contributions from 28 contributors. Thank you!
Some notable changes in this release are:
- Remove support for chunks storage
- Experimental support for vertical query sharding
- Always enable the PromQL `@` modifier and negative offset
- Added configurations for Azure MSI in blocks-storage
- New limits (Querier/QueryFrontend)
- OpenTelemetry Bridge for Tracing
- Multiple performance improvements and bug fixes
Cortex
- [CHANGE] Remove support for chunks storage entirely. If you are using chunks storage on a previous version, you must migrate your data on version 1.12 or earlier. Before upgrading to this release, you should also remove any deprecated chunks-related configuration, as this release will no longer accept that. The following flags are gone:
  - `-dynamodb.*`
  - `-metrics.*`
  - `-s3.*`
  - `-azure.*`
  - `-bigtable.*`
  - `-gcs.*`
  - `-cassandra.*`
  - `-boltdb.*`
  - `-local.*`
  - some `-ingester` flags:
    - `-ingester.wal-enabled`
    - `-ingester.checkpoint-enabled`
    - `-ingester.recover-from-wal`
    - `-ingester.wal-dir`
    - `-ingester.checkpoint-duration`
    - `-ingester.flush-on-shutdown-with-wal-enabled`
    - `-ingester.max-transfer-retries`
    - `-ingester.max-samples-per-query`
    - `-ingester.min-chunk-length`
    - `-ingester.flush-period`
    - `-ingester.retain-period`
    - `-ingester.max-chunk-idle`
    - `-ingester.max-stale-chunk-idle`
    - `-ingester.flush-op-timeout`
    - `-ingester.max-chunk-age`
    - `-ingester.chunk-age-jitter`
    - `-ingester.concurrent-flushes`
    - `-ingester.spread-flushes`
  - `-store.*` except `-store.engine` and `-store.max-query-length`
  - `-store.query-chunk-limit` was deprecated and replaced by `-querier.max-fetched-chunks-per-query`
  - `-deletes.*`
  - `-grpc-store.*`
  - `-flusher.wal-dir`, `-flusher.concurrent-flushes`, `-flusher.flush-op-timeout`
- [CHANGE] Remove support for alertmanager and ruler legacy store configuration. Before upgrading, you need to convert your configuration to use the `alertmanager-storage` and `ruler-storage` configuration on the version that you're already running, then upgrade.
- [CHANGE] Disables TSDB isolation. #4825
- [CHANGE] Drops support for Prometheus 1.x rule format on configdb. #4826
- [CHANGE] Removes the `-ingester.stream-chunks-when-using-blocks` experimental flag and streams chunks by default when `querier.ingester-streaming` is enabled. #4864
- [CHANGE] Compactor: Added `cortex_compactor_runs_interrupted_total` to separate compaction interruptions from failures
- [CHANGE] Enable PromQL `@` modifier, negative offset always. #4927
- [CHANGE] Store-gateway: Add user label to `cortex_bucket_store_blocks_loaded` metric. #4918
- [CHANGE] AlertManager: include `status` label in `cortex_alertmanager_alerts_received_total`. #4907
- [FEATURE] Compactor: Added `-compactor.block-files-concurrency` allowing to configure number of go routines for download/upload block files during compaction. #4784
- [FEATURE] Compactor: Added `-compactor.blocks-fetch-concurrency` allowing to configure number of go routines for blocks during compaction. #4787
- [FEATURE] Compactor: Added configurations for Azure MSI in blocks-storage, ruler-storage and alertmanager-storage. #4818
- [FEATURE] Ruler: Add support to pass custom implementations of queryable and pusher. #4782
- [FEATURE] Create OpenTelemetry Bridge for Tracing. Now cortex can send traces to multiple destinations using OTEL Collectors. #4834
- [FEATURE] Added `-api.http-request-headers-to-log` allowing for the addition of HTTP Headers to logs #4803
- [FEATURE] Distributor: Added a new limit `-validation.max-labels-size-bytes` allowing to limit the combined size of labels for each timeseries. #4848
- [FEATURE] Storage/Bucket: Added `-*.s3.bucket-lookup-type` allowing to configure the s3 bucket lookup type. #4794
- [FEATURE] QueryFrontend: Implement experimental vertical sharding at query frontend for range/instant queries. #4863
- [FEATURE] QueryFrontend: Support vertical sharding for subqueries. #4955
- [FEATURE] Querier: Added a new limit `-querier.max-fetched-data-bytes-per-query` allowing to limit the maximum size of all data in bytes that a query can fetch from each ingester and storage. #4854
- [FEATURE] Added 2 flags `-alertmanager.alertmanager-client.grpc-compression` and `-querier.store-gateway-client.grpc-compression` to configure compression methods for grpc clients. #4889
- [ENHANCEMENT] AlertManager: Retrying AlertManager Get Requests (Get Alertmanager status, Get Alertmanager Receivers) on next replica on error #4840
- [ENHANCEMENT] Querier/Ruler: Retry store-gateway in case of unexpected failure, instead of failing the query. #4532 #4839
- [ENHANCEMENT] Ring: DoBatch prioritize 4xx errors when failing. #4783
- [ENHANCEMENT] Cortex now built with Go 1.18. #4829
- [ENHANCEMENT] Ingester: Prevent ingesters from becoming unhealthy during WAL replay. #4847
- [ENHANCEMENT] Compactor: Introduced visit marker file for blocks so blocks under compaction will not be picked up by another compactor. #4805
- [ENHANCEMENT] Distributor: Add label name to labelValueTooLongError. #4855
- [ENHANCEMENT] Enhance traces with hostname information. #4898
- [ENHANCEMENT] Improve the documentation around limits. #4905
- [ENHANCEMENT] Distributor: cache user overrides to reduce lock contention. #4904
- [BUGFIX] Storage/Bucket: Enable AWS SDK for go authentication for s3 to fix IMDSv1 authentication. #4897
- [BUGFIX] Memberlist: Add join with no retrying when starting service. #4804
- [BUGFIX] Ruler: Fix /ruler/rule_groups returns YAML with extra fields. #4767
- [BUGFIX] Respecting `-tracing.otel.sample-ratio` configuration when enabling OpenTelemetry tracing with X-ray. #4862
- [BUGFIX] QueryFrontend: fixed query_range requests when query has `start` equal to `end`. #4877
- [BUGFIX] AlertManager: fixed issue introduced by #4495 where templates files were being deleted when using alertmanager local store. #4890
- [BUGFIX] Ingester: fixed incorrect logging at the start of ingester block shipping logic. #4934
- [BUGFIX] Storage/Bucket: fixed global mark missing on deletion. #4949
- [BUGFIX] QueryFrontend/Querier: fixed regression added by #4863 where we stopped compressing the response between querier and query frontend. #4960
Cortex 1.13.1
This is a bug fix release to address #4888.
The release is signed with @alvinlin123's GPG key
Cortex 1.13.0
This release contains 112 contributions from 51 contributors. Thank you!
Some notable new features in this release are:
- Streaming capabilities in Querier for metadata APIs.
- Experimental shuffle sharding support for compactor, which enables parallel compaction.
Some notable enhancements and bug fixes in this release are:
- New block storage configurations for Azure that allow reduction in memory usage.
- Memory leak fix in Distributor and Ruler.
- Jitter in Memberlist rejoin interval that reduces CPU utilization during rejoin.
Cortex
- [CHANGE] Changed default for `-ingester.min-ready-duration` from 1 minute to 15 seconds. #4539
- [CHANGE] query-frontend: Do not print anything in the logs of `query-frontend` if an in-progress query has been canceled (context canceled) to avoid spam. #4562
- [CHANGE] Compactor block deletion mark migration, needed when upgrading from v1.7, is now disabled by default. #4597
- [CHANGE] The `status_code` label on gRPC client metrics has changed from '200' and '500' to '2xx', '5xx', '4xx', 'cancel' or 'error'. #4601
- [CHANGE] Memberlist: changed probe interval from `1s` to `5s` and probe timeout from `500ms` to `2s`. #4601
- [CHANGE] Fix incorrectly named `cortex_cache_fetched_keys` and `cortex_cache_hits` metrics. Renamed to `cortex_cache_fetched_keys_total` and `cortex_cache_hits_total` respectively. #4686
- [CHANGE] Enable Thanos series limiter in store-gateway. #4702
- [CHANGE] Distributor: Apply `max_fetched_series_per_query` limit for `/series` API. #4683
- [CHANGE] Re-enable the `proxy_url` option for receiver configuration. #4741
- [FEATURE] Ruler: Add `external_labels` option to tag all alerts with a given set of labels. #4499
- [FEATURE] Compactor: Add `-compactor.skip-blocks-with-out-of-order-chunks-enabled` configuration to mark blocks containing index with out-of-order chunks for no compact instead of halting the compaction. #4707
- [FEATURE] Querier/Query-Frontend: Add `-querier.per-step-stats-enabled` and `-frontend.cache-queryable-samples-stats` configurations to enable query sample statistics. #4708
- [FEATURE] Add shuffle sharding for the compactor #4433
- [FEATURE] Querier: Use streaming for ingester metadata APIs. #4725
- [ENHANCEMENT] Update Go version to 1.17.8. #4602 #4604 #4658
- [ENHANCEMENT] Keep track of discarded samples due to bad relabel configuration in `cortex_discarded_samples_total`. #4503
- [ENHANCEMENT] Ruler: Add `-ruler.disable-rule-group-label` to disable the `rule_group` label on exported metrics. #4571
- [ENHANCEMENT] Query federation: improve performance in MergeQueryable by memoizing labels. #4502
- [ENHANCEMENT] Added new ring related config `-ingester.readiness-check-ring-health`. When enabled, the readiness probe will succeed only after all instances are ACTIVE and healthy in the ring. This is enabled by default. #4539
- [ENHANCEMENT] Added new ring related config `-distributor.excluded-zones`. When set, this will exclude the comma-separated zones from the ring. Default is "". #4539
- [ENHANCEMENT] Upgraded Docker base images to `alpine:3.14`. #4514
- [ENHANCEMENT] Updated Prometheus to latest. Includes changes from prometheus#9239, adding 15 new functions. Multiple TSDB bugfixes prometheus#9438 & prometheus#9381. #4524
- [ENHANCEMENT] Query Frontend: Add setting `-frontend.forward-headers-list` in frontend to configure the set of headers from the requests to be forwarded to downstream requests. #4486
- [ENHANCEMENT] Blocks storage: Add `-blocks-storage.azure.http.*`, `-alertmanager-storage.azure.http.*`, and `-ruler-storage.azure.http.*` to configure the Azure storage client. #4581
- [ENHANCEMENT] Optimise memberlist receive path when used as a backing store for rings with a large number of members. #4601
- [ENHANCEMENT] Add length and limit to labelNameTooLongError and labelValueTooLongError #4595
- [ENHANCEMENT] Add jitter to rejoinInterval. #4747
- [ENHANCEMENT] Compactor: uploading blocks no compaction marks to the global location and introduce a new metric #4729
  - `cortex_bucket_blocks_marked_for_no_compaction_count`: Total number of blocks marked for no compaction in the bucket.
- [ENHANCEMENT] Querier: Reduce the number of series that are kept in memory while streaming from ingesters. #4745
- [BUGFIX] AlertManager: remove stale template files. #4495
- [BUGFIX] Distributor: fix bug in query-exemplar where some results would get dropped. #4583
- [BUGFIX] Update Thanos dependency: compactor tracing support, azure blocks storage memory fix. #4585
- [BUGFIX] Set appropriate `Content-Type` header for /services endpoint, which previously hard-coded `text/plain`. #4596
- [BUGFIX] Querier: Disable query scheduler SRV DNS lookup, which removes noisy log messages about "failed DNS SRV record lookup". #4601
- [BUGFIX] Memberlist: fixed corrupted packets when sending compound messages with more than 255 messages or messages bigger than 64KB. #4601
- [BUGFIX] Query Frontend: If 'LogQueriesLongerThan' is set to < 0, log all queries as described in the docs. #4633
- [BUGFIX] Distributor: update defaultReplicationStrategy to not fail with extend-write when a single instance is unhealthy. #4636
- [BUGFIX] Distributor: Fix race condition on `/series` introduced by #4683. #4716
- [BUGFIX] Ruler: Fixed leaking notifiers after users are removed #4718
- [BUGFIX] Distributor: Fix a memory leak in distributor due to the cluster label. #4739
- [BUGFIX] Memberlist: Avoid clock skew by limiting the timestamp accepted on gossip. #4750
- [BUGFIX] Compactor: skip compaction if there is only 1 block available for shuffle-sharding compactor. #4756
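The cache metric renames in 1.13.0 (#4686) mean that dashboards and alert rules querying the old names stop returning data after the upgrade. As a rough illustration, the sketch below rewrites the two old metric names inside a PromQL expression string. The function name `migrate_expr` and the constant names are hypothetical; only the metric names come from the notes above.

```python
import re

# Old name -> new name, as listed in the 1.13.0 notes (#4686).
RENAMED_METRICS = {
    "cortex_cache_fetched_keys": "cortex_cache_fetched_keys_total",
    "cortex_cache_hits": "cortex_cache_hits_total",
}

# Match the old names only as whole metric names, so names that already
# end in _total (or merely share a prefix) are left untouched.
_OLD_NAME = re.compile(
    r"(?<![A-Za-z0-9_:])("
    + "|".join(re.escape(name) for name in RENAMED_METRICS)
    + r")(?![A-Za-z0-9_:])"
)


def migrate_expr(expr):
    """Return expr with the renamed cache metrics replaced by their new names."""
    return _OLD_NAME.sub(lambda m: RENAMED_METRICS[m.group(1)], expr)
```

Running the function on an already-migrated expression leaves it unchanged, so it can be applied repeatedly to the same rule files.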
Cortex 1.13.0-rc.1
Some bug fixes and improvements over 1.13.0-rc.0.
Cortex 1.13.0-rc.0
This release contains 112 contributions from 51 contributors. Thank you!
Some notable new features in this release are:
- Streaming capabilities in Querier for metadata APIs.
- Experimental shuffle sharding support for compactor, which enables parallel compaction.
Some notable enhancements and bug fixes in this release are:
- New block storage configurations for Azure that allow reduction in memory usage.
- Memory leak fix in Distributor and Ruler.
- Jitter in Memberlist rejoin interval that reduces CPU utilization during rejoin.
Cortex
- [CHANGE] Changed default for `-ingester.min-ready-duration` from 1 minute to 15 seconds. #4539
- [CHANGE] query-frontend: Do not print anything in the logs of `query-frontend` if an in-progress query has been canceled (context canceled) to avoid spam. #4562
- [CHANGE] Compactor block deletion mark migration, needed when upgrading from v1.7, is now disabled by default. #4597
- [CHANGE] The `status_code` label on gRPC client metrics has changed from '200' and '500' to '2xx', '5xx', '4xx', 'cancel' or 'error'. #4601
- [CHANGE] Memberlist: changed probe interval from `1s` to `5s` and probe timeout from `500ms` to `2s`. #4601
- [CHANGE] Fix incorrectly named `cortex_cache_fetched_keys` and `cortex_cache_hits` metrics. Renamed to `cortex_cache_fetched_keys_total` and `cortex_cache_hits_total` respectively. #4686
- [CHANGE] Enable Thanos series limiter in store-gateway. #4702
- [CHANGE] Distributor: Apply `max_fetched_series_per_query` limit for `/series` API. #4683
- [CHANGE] Re-enable the `proxy_url` option for receiver configuration. #4741
- [FEATURE] Ruler: Add `external_labels` option to tag all alerts with a given set of labels. #4499
- [FEATURE] Compactor: Add `-compactor.skip-blocks-with-out-of-order-chunks-enabled` configuration to mark blocks containing index with out-of-order chunks for no compact instead of halting the compaction. #4707
- [FEATURE] Querier/Query-Frontend: Add `-querier.per-step-stats-enabled` and `-frontend.cache-queryable-samples-stats` configurations to enable query sample statistics. #4708
- [FEATURE] Add shuffle sharding for the compactor #4433
- [FEATURE] Querier: Use streaming for ingester metadata APIs. #4725
- [ENHANCEMENT] Update Go version to 1.17.8. #4602 #4604 #4658
- [ENHANCEMENT] Keep track of discarded samples due to bad relabel configuration in `cortex_discarded_samples_total`. #4503
- [ENHANCEMENT] Ruler: Add `-ruler.disable-rule-group-label` to disable the `rule_group` label on exported metrics. #4571
- [ENHANCEMENT] Query federation: improve performance in MergeQueryable by memoizing labels. #4502
- [ENHANCEMENT] Added new ring related config `-ingester.readiness-check-ring-health`. When enabled, the readiness probe will succeed only after all instances are ACTIVE and healthy in the ring. This is enabled by default. #4539
- [ENHANCEMENT] Added new ring related config `-distributor.excluded-zones`. When set, this will exclude the comma-separated zones from the ring. Default is "". #4539
- [ENHANCEMENT] Upgraded Docker base images to `alpine:3.14`. #4514
- [ENHANCEMENT] Updated Prometheus to latest. Includes changes from prometheus#9239, adding 15 new functions. Multiple TSDB bugfixes prometheus#9438 & prometheus#9381. #4524
- [ENHANCEMENT] Query Frontend: Add setting `-frontend.forward-headers-list` in frontend to configure the set of headers from the requests to be forwarded to downstream requests. #4486
- [ENHANCEMENT] Blocks storage: Add `-blocks-storage.azure.http.*`, `-alertmanager-storage.azure.http.*`, and `-ruler-storage.azure.http.*` to configure the Azure storage client. #4581
- [ENHANCEMENT] Optimise memberlist receive path when used as a backing store for rings with a large number of members. #4601
- [ENHANCEMENT] Add length and limit to labelNameTooLongError and labelValueTooLongError #4595
- [ENHANCEMENT] Add jitter to rejoinInterval. #4747
- [ENHANCEMENT] Compactor: uploading blocks no compaction marks to the global location and introduce a new metric #4729
  - `cortex_bucket_blocks_marked_for_no_compaction_count`: Total number of blocks marked for no compaction in the bucket.
- [ENHANCEMENT] Querier: Reduce the number of series that are kept in memory while streaming from ingesters. #4745
- [BUGFIX] AlertManager: remove stale template files. #4495
- [BUGFIX] Distributor: fix bug in query-exemplar where some results would get dropped. #4583
- [BUGFIX] Update Thanos dependency: compactor tracing support, azure blocks storage memory fix. #4585
- [BUGFIX] Set appropriate `Content-Type` header for /services endpoint, which previously hard-coded `text/plain`. #4596
- [BUGFIX] Querier: Disable query scheduler SRV DNS lookup, which removes noisy log messages about "failed DNS SRV record lookup". #4601
- [BUGFIX] Memberlist: fixed corrupted packets when sending compound messages with more than 255 messages or messages bigger than 64KB. #4601
- [BUGFIX] Query Frontend: If 'LogQueriesLongerThan' is set to < 0, log all queries as described in the docs. #4633
- [BUGFIX] Distributor: update defaultReplicationStrategy to not fail with extend-write when a single instance is unhealthy. #4636
- [BUGFIX] Distributor: Fix race condition on `/series` introduced by #4683. #4716
- [BUGFIX] Ruler: Fixed leaking notifiers after users are removed #4718
- [BUGFIX] Distributor: Fix a memory leak in distributor due to the cluster label. #4739
- [BUGFIX] Memberlist: Avoid clock skew by limiting the timestamp accepted on gossip. #4750
- [BUGFIX] Compactor: skip compaction if there is only 1 block available for shuffle-sharding compactor. #4756
Cortex 1.11.1
This is a security release to include the fix for CVE-2022-24921 "stack exhaustion via a deeply nested expression".
The fix was to rebuild with Go v1.16.15, at #4663.
Cortex 1.11.0
This release contains 76 contributions from 31 authors. Thank you!
A broad range of improvements, including support for cloud services such as Memcached auto-discovery and Amazon SNS.
Cortex
- [CHANGE] Memberlist: Expose default configuration values to the command line options. Note that setting these explicitly to zero will no longer cause the default to be used. If the default is desired, then do not set the option. The following are affected: #4276
  - `-memberlist.stream-timeout`
  - `-memberlist.retransmit-factor`
  - `-memberlist.pull-push-interval`
  - `-memberlist.gossip-interval`
  - `-memberlist.gossip-nodes`
  - `-memberlist.gossip-to-dead-nodes-time`
  - `-memberlist.dead-node-reclaim-time`
- [CHANGE] `-querier.max-fetched-chunks-per-query` previously applied to chunks from ingesters and store separately; now the two combined should not exceed the limit. #4260
- [CHANGE] Memberlist: the metric `memberlist_kv_store_value_bytes` has been removed due to values no longer being stored in-memory as encoded bytes. #4345
- [CHANGE] Some files and directories created by Cortex components on local disk now have stricter permissions, and are only readable by owner, but not group or others. #4394
- [CHANGE] The metric `cortex_deprecated_flags_inuse_total` has been renamed to `deprecated_flags_inuse_total` as part of using grafana/dskit functionality. #4443
- [FEATURE] Ruler: Add new `-ruler.query-stats-enabled` which when enabled will report the `cortex_ruler_query_seconds_total` as a per-user metric that tracks the sum of the wall time of executing queries in the ruler in seconds. #4317
- [FEATURE] Query Frontend: Add `cortex_query_fetched_series_total` and `cortex_query_fetched_chunks_bytes_total` per-user counters to expose the number of series and bytes fetched as part of queries. These metrics can be enabled with the `-frontend.query-stats-enabled` flag (or its respective YAML config option `query_stats_enabled`). #4343
- [FEATURE] AlertManager: Add support for SNS Receiver. #4382
- [FEATURE] Distributor: Add label `status` to metric `cortex_distributor_ingester_append_failures_total` #4442
- [FEATURE] Queries: Added `present_over_time` PromQL function, also some TSDB optimisations. #4505
- [ENHANCEMENT] Add timeout for waiting on compactor to become ACTIVE in the ring. #4262
- [ENHANCEMENT] Reduce memory used by streaming queries, particularly in ruler. #4341
- [ENHANCEMENT] Ring: allow experimental configuration of disabling of heartbeat timeouts by setting the relevant configuration value to zero. Applies to the following: #4342
  - `-distributor.ring.heartbeat-timeout`
  - `-ring.heartbeat-timeout`
  - `-ruler.ring.heartbeat-timeout`
  - `-alertmanager.sharding-ring.heartbeat-timeout`
  - `-compactor.ring.heartbeat-timeout`
  - `-store-gateway.sharding-ring.heartbeat-timeout`
- [ENHANCEMENT] Ring: allow heartbeats to be explicitly disabled by setting the interval to zero. This is considered experimental. This applies to the following configuration options: #4344
  - `-distributor.ring.heartbeat-period`
  - `-ingester.heartbeat-period`
  - `-ruler.ring.heartbeat-period`
  - `-alertmanager.sharding-ring.heartbeat-period`
  - `-compactor.ring.heartbeat-period`
  - `-store-gateway.sharding-ring.heartbeat-period`
- [ENHANCEMENT] Memberlist: optimized receive path for processing ring state updates, to help reduce CPU utilization in large clusters. #4345
- [ENHANCEMENT] Memberlist: expose configuration of memberlist packet compression via `-memberlist.compression=enabled`. #4346
- [ENHANCEMENT] Update Go version to 1.16.6. #4362
- [ENHANCEMENT] Updated Prometheus to include changes from prometheus/prometheus#9083. Now whenever `/labels` API calls include matchers, blocks store is queried for `LabelNames` with matchers instead of `Series` calls which was inefficient. #4380
- [ENHANCEMENT] Querier: performance improvements in socket and memory handling. #4429 #4377
- [ENHANCEMENT] Exemplars are now emitted for all gRPC calls and many operations tracked by histograms. #4462
- [ENHANCEMENT] New options `-server.http-listen-network` and `-server.grpc-listen-network` allow binding as 'tcp4' or 'tcp6'. #4462
- [ENHANCEMENT] Rulers: Using shuffle sharding subring on GetRules API. #4466
- [ENHANCEMENT] Support memcached auto-discovery via `auto-discovery` flag, introduced by thanos in thanos-io/thanos#4487. Both AWS and Google Cloud memcached service support auto-discovery, which returns a list of nodes of the memcached cluster. #4412
- [BUGFIX] Fixes a panic in the query-tee when comparing result. #4465
- [BUGFIX] Frontend: Fixes @ modifier functions (start/end) when splitting queries by time. #4464
- [BUGFIX] Compactor: compactor will no longer try to compact blocks that are already marked for deletion. Previously compactor would consider blocks marked for deletion within `-compactor.deletion-delay / 2` period as eligible for compaction. #4328
- [BUGFIX] HA Tracker: when cleaning up obsolete elected replicas from KV store, tracker didn't update number of cluster per user correctly. #4336
- [BUGFIX] Ruler: fixed counting of PromQL evaluation errors as user-errors when updating `cortex_ruler_queries_failed_total`. #4335
- [BUGFIX] Ingester: When using block storage, prevent any reads or writes while the ingester is stopping. This will prevent accessing TSDB blocks once they have been already closed. #4304
- [BUGFIX] Ingester: fixed ingester stuck on start up (LEAVING ring state) when `-ingester.heartbeat-period=0` and `-ingester.unregister-on-shutdown=false`. #4366
- [BUGFIX] Ingester: panic during shutdown while fetching batches from cache. #4397
- [BUGFIX] Querier: After query-frontend restart, querier may have lower than configured concurrency. #4417
- [BUGFIX] Memberlist: forward only changes, not entire original message. #4419
- [BUGFIX] Memberlist: don't accept old tombstones as incoming change, and don't forward such messages to other gossip members. #4420
- [BUGFIX] Querier: fixed panic when querying exemplars and using `-distributor.shard-by-all-labels=false`. #4473
- [BUGFIX] Querier: honor querier minT,maxT if `nil` SelectHints are passed to Select(). #4413
- [BUGFIX] Compactor: fixed panic while collecting Prometheus metrics. #4483
- [BUGFIX] Update go-kit package to fix spurious log messages #4544
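The change to `-querier.max-fetched-chunks-per-query` in 1.11.0 (#4260) can make a query that previously passed start failing with the same limit value. The sketch below only illustrates the arithmetic described in the note. It is not Cortex's implementation, and both function names are hypothetical.

```python
def within_limit_separate(ingester_chunks, store_chunks, limit):
    """Before 1.11.0: each source is checked against the limit on its own."""
    return ingester_chunks <= limit and store_chunks <= limit


def within_limit_combined(ingester_chunks, store_chunks, limit):
    """From 1.11.0: chunks from both sources together must fit the limit."""
    return ingester_chunks + store_chunks <= limit
```

For example, with a limit of 1,000,000, a query fetching 600,000 chunks from ingesters and 700,000 from the store passes the first check but fails the second, so the limit may need revisiting when upgrading.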
Cortex 1.11.0-rc.1
Over v1.11.0-rc.0, this fixes a problem whereby some debug logs would be output when they were supposed to be filtered out.
This update required a couple more dependencies to be updated, but should not have any visible change.
See #4544