| 1 | +--- |
| 2 | +description: Tune Grafana Mimir according to your use cases. |
| 3 | +menuTitle: Tuning |
| 4 | +title: Tune Grafana Mimir according to your use cases |
| 5 | +weight: 110 |
| 6 | +--- |
| 7 | + |
| 8 | +# Tune Grafana Mimir according to your use cases |
| 9 | + |
| 10 | +For most use cases, you can use the default settings that come with Mimir. |
| 11 | +However, sometimes you need to tune Mimir to reach optimal performance. Use the following guidance when tuning settings in Mimir. |
| 12 | + |
| 13 | +## Heavy multi-tenancy |
| 14 | + |
| 15 | +For each tenant, Mimir opens and maintains a TSDB in memory. If you have a significant number of tenants, the memory overhead might become prohibitive. |
| 16 | +To reduce the associated overhead, consider the following: |
| 17 | + |
| 18 | +- Reduce `-blocks-storage.tsdb.head-chunks-write-buffer-size-bytes`, default `4MB`. For example, try `1MB` or `128KB`. |
| 19 | +- Reduce `-blocks-storage.tsdb.stripe-size`, default `16384`. For example, try `256`, or even `64`. |
| 20 | +- Configure [shuffle sharding](https://grafana.com/docs/mimir/latest/configure/configure-shuffle-sharding/). |
| 21 | + |
| 22 | +## Compression |
| 23 | + |
| 24 | +Depending on the CPU model used in the underlying infrastructure, compression for both WALs and gRPC communication might consume a significant portion of the available CPU resources. |
| 25 | +To identify this case, you can use profiling with tools like [Grafana Pyroscope](https://grafana.com/docs/pyroscope/latest/). |
| 26 | + |
| 27 | +To reduce resource consumption, consider the following: |
| 28 | + |
| 29 | +- Make sure `wal_compression_enabled` is disabled. |
| 30 | +- Make sure `grpc_compression` is either off, which is the default, or configured to `snappy`. `gzip` consumes more CPU than `snappy`. However, disabling `grpc_compression` implies more network traffic, and in turn, might increase the total cost of ownership (TCO) of running Mimir. |
| 31 | + |
| 32 | +If you must use compression, for example, to fit within the available network bandwidth, consider using nodes with more powerful CPUs. This implies an increase in TCO. |
| 33 | + |
| 34 | +## Cache size |
| 35 | + |
| 36 | +Grafana Mimir relies on Memcached for its caches. By default, Memcached stores data only in memory. |
| 37 | +The Memcached [extstore](https://docs.memcached.org/features/flashstorage/) feature lets you extend Memcached’s memory space onto flash (or similar) storage. |
| 38 | + |
| 39 | +For an example, refer to [how we scaled Grafana Cloud Logs' Memcached cluster to 50TB and improved reliability](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/). |
| 40 | + |
| 41 | +## Periodic latency spikes when cutting blocks |
| 42 | + |
| 43 | +Depending on the workload, you might witness latency spikes when Mimir cuts blocks. |
| 44 | +To reduce the impact of this behavior, consider the following: |
| 45 | + |
| 46 | +- Upgrade to Mimir `2.15` or later. Refer to <https://github.com/grafana/mimir/commit/03f2f06e1247e997a0246d72f5c2c1fd9bd386df>. |
| 47 | +- Reduce `-blocks-storage.tsdb.block-ranges-period`, default `2h`. For example, try `1h`. |
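
As a rough sketch of how the TSDB-related suggestions in this page could be combined, the fragment below passes the three `-blocks-storage.tsdb.*` flags named in the page on the command line. The values are the example values from the text, not recommendations. The `mimir` binary name and the `-config.file` path are assumptions for illustration, and a real deployment needs its usual remaining configuration.

```sh
# Illustrative fragment only; the config file path is hypothetical.
# 1048576 bytes = 1MB write buffer (default 4MB).
# 256 stripes (default 16384).
# 1h block range (default 2h).
mimir \
  -config.file=/etc/mimir/mimir.yaml \
  -blocks-storage.tsdb.head-chunks-write-buffer-size-bytes=1048576 \
  -blocks-storage.tsdb.stripe-size=256 \
  -blocks-storage.tsdb.block-ranges-period=1h
```

Changing one setting at a time and comparing memory, CPU, and latency before and after makes it easier to tell which change actually helped.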