
Commit 5a1294d

Add tuning documentation

This commit is a follow-up of PR #9149, where we discussed the idea of centralizing the known tunings together with their use cases.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

1 parent 96c92c9

File tree

2 files changed: +49 -0 lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
@@ -121,6 +121,8 @@

### Documentation

+* [FEATURE] Add tuning documentation. #9978
+
### Tools

* [FEATURE] `splitblocks`: add new tool to split blocks larger than a specified duration into multiple blocks. #9517, #9779
New file

Lines changed: 47 additions & 0 deletions
---
description: Tune Grafana Mimir according to your use cases.
menuTitle: Tuning
title: Tune Grafana Mimir according to your use cases
weight: 110
---

# Tune Grafana Mimir according to your use cases

For most use cases, you can use the default settings that come with Mimir.
However, sometimes you need to tune Mimir to reach optimal performance. Use the following guidance when tuning settings in Mimir.
## Heavy multi-tenancy

For each tenant, Mimir opens and maintains a TSDB in memory. If you have a significant number of tenants, the memory overhead might become prohibitive.
To reduce the associated overhead, consider the following (a configuration sketch follows the list):

- Reduce `-blocks-storage.tsdb.head-chunks-write-buffer-size-bytes`, default `4MB`. For example, try `1MB` or `128KB`.
- Reduce `-blocks-storage.tsdb.stripe-size`, default `16384`. For example, try `256`, or even `64`.
- Configure [shuffle sharding](https://grafana.com/docs/mimir/latest/configure/configure-shuffle-sharding/).
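For reference, the first two settings map onto the `blocks_storage` YAML section. A minimal sketch, assuming the usual flag-to-YAML naming (verify against the configuration reference for your Mimir version; the values are only starting points):

```yaml
# Sketch: shrink per-tenant TSDB memory overhead.
# Equivalent flags: -blocks-storage.tsdb.head-chunks-write-buffer-size-bytes
# and -blocks-storage.tsdb.stripe-size.
blocks_storage:
  tsdb:
    head_chunks_write_buffer_size_bytes: 1048576 # 1MB instead of the 4MB default
    stripe_size: 256 # down from the 16384 default
```

Both values trade some write-path efficiency for a lower per-tenant memory floor, which is the point when the tenant count is high.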
## Compression

Depending on the CPU model used in the underlying infrastructure, the compression for both WALs and gRPC communication might consume a significant portion of the available CPU resources.
To identify this case, you can use profiling with tools like [Grafana Pyroscope](https://grafana.com/docs/pyroscope/latest/).

To reduce resource consumption, consider the following (a combined sketch follows the list):

- Make sure `wal_compression_enabled` is not enabled.
- Make sure `grpc_compression` is either off, which is the default, or configured to `snappy`. `gzip` consumes more CPU than `snappy`. However, disabling `grpc_compression` implies more network traffic and, in turn, might increase the total cost of ownership (TCO) of running Mimir.
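A sketch of both settings together, assuming `wal_compression_enabled` sits under `blocks_storage.tsdb` and `grpc_compression` under the ingester client's gRPC options; check the exact field locations in the configuration reference for your Mimir version:

```yaml
blocks_storage:
  tsdb:
    wal_compression_enabled: false # default; leaving it off saves CPU at the cost of disk space

ingester_client:
  grpc_client_config:
    grpc_compression: "" # off, the default; "snappy" uses less CPU than "gzip"
```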
If you must use compression, for example, to fit within your available network bandwidth, consider using nodes with more powerful CPUs. This implies an increase in TCO.
## Cache size

Grafana Mimir relies on Memcached for its caches. By default, Memcached stores everything in memory.
The Memcached [extstore](https://docs.memcached.org/features/flashstorage/) feature lets you extend Memcached's memory space onto flash (or similar) storage.

Refer to [how we scaled Grafana Cloud Logs' Memcached cluster to 50TB and improved reliability](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/).
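For illustration, extstore is turned on with the `ext_path` server option when starting Memcached; the path and sizes below are placeholders to adapt to your hardware:

```bash
# Sketch: Memcached with extstore. Keys and hot values stay in RAM;
# cold values spill over to the 64GB flash-backed file.
memcached -m 4096 -o ext_path=/var/lib/memcached/extstore:64G
```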
## Periodic latency spikes when cutting blocks

Depending on the workload, you might witness latency spikes when Mimir cuts blocks.
To reduce the impact of this behavior, consider the following (a sketch of the second option follows the list):

- Upgrade to Mimir `2.15` or later. Refer to <https://github.com/grafana/mimir/commit/03f2f06e1247e997a0246d72f5c2c1fd9bd386df>.
- Reduce `-blocks-storage.tsdb.block-ranges-period`, default `2h`. For example, try `1h`.
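A minimal sketch of the second option, assuming the usual flag-to-YAML mapping and that the setting takes a list of durations:

```yaml
# Sketch: cut blocks every hour instead of every two hours, so each cut
# does less work at once.
# Equivalent flag: -blocks-storage.tsdb.block-ranges-period=1h
blocks_storage:
  tsdb:
    block_ranges_period: [1h] # default is [2h]
```

Smaller block ranges spread the same work across more, smaller cuts; the trade-off is more blocks for the compactor to merge later.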
