Add tuning documentation #9978

Merged on Dec 10, 2024 (2 commits)

Conversation

wilfriedroset
Collaborator

What this PR does

This PR is a follow-up to #9149, where we discussed centralizing the known tuning settings together with the use cases they apply to.

Which issue(s) this PR fixes or relates to

Relates to #9149

Checklist

  • Tests updated.
  • Documentation added.
  • CHANGELOG.md updated - the order of entries should be [CHANGE], [FEATURE], [ENHANCEMENT], [BUGFIX].
  • about-versioning.md updated with experimental features.

@wilfriedroset wilfriedroset added the type/docs Improvements or additions to documentation label Nov 21, 2024
@wilfriedroset wilfriedroset marked this pull request as ready for review November 21, 2024 18:05
@wilfriedroset wilfriedroset requested review from tacole02 and a team as code owners November 21, 2024 18:05
@wilfriedroset
Collaborator Author

Ping @56quarters @bboreham for review, since you helped with the proposal 🙏🏾

Contributor

@tacole02 tacole02 left a comment

Thanks so much for adding this! I left a few comments and suggestions from a docs perspective.

@@ -0,0 +1,47 @@
---
description: Learn how to tune Grafan Mimir according to your use cases.
Contributor

Suggested change
description: Learn how to tune Grafan Mimir according to your use cases.
description: Tune Grafana Mimir according to your use cases.


# Tune Grafana Mimir according to your use cases

Grafana Mimir comes with sensible default settings. Those settings are a good place to start for most use cases.
Contributor

Suggested change
Grafana Mimir comes with sensible default settings. Those settings are a good place to start for most use cases.
For most use cases, you can use the default settings that come with Mimir.

Contributor

We refer to it as just "Mimir" throughout the documentation, so I think it's okay to do that here, too.

# Tune Grafana Mimir according to your use cases

Grafana Mimir comes with sensible default settings. Those settings are a good place to start for most use cases.
However, for some use cases Grafana Mimir requires appropriate tuning to reach optimal performance. This page aims to centralize those known tuning.
Contributor

Suggested change
However, for some use cases Grafana Mimir requires appropriate tuning to reach optimal performance. This page aims to centralize those known tuning.
However, sometimes you need to tune Mimir to reach optimal performance. Use the following guidance when tuning settings in Mimir.

## Heavy multi-tenancy

For each tenant, Grafana Mimir opens and maintains a TSDB in memory. With a significant number of tenants the memory overhead might come prohibitive.
To reduce the associated overhead, users might consider:
Contributor

Suggested change
To reduce the associated overhead, users might consider:
To reduce the associated overhead, consider the following:

Contributor

We like to address users directly in the second person.


## Heavy multi-tenancy

For each tenant, Grafana Mimir opens and maintains a TSDB in memory. With a significant number of tenants the memory overhead might come prohibitive.
Contributor

Suggested change
For each tenant, Grafana Mimir opens and maintains a TSDB in memory. With a significant number of tenants the memory overhead might come prohibitive.
For each tenant, Mimir opens and maintains a TSDB in memory. If you have a significant number of tenants, the memory overhead might become prohibitive.

Grafana Mimir relies on Memcached for its caches. Memcached relies, by default only on the memory.
Similarly to the work of the Ops team behind Grafana Loki, one could enable the `extstore` feature of Memcached.

See: [how we scaled Grafana Cloud Logs' Memcached cluster to 50TB and improved reliability](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/)
Contributor

Suggested change
See: [how we scaled Grafana Cloud Logs' Memcached cluster to 50TB and improved reliability](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/)
Refer to [how we scaled Grafana Cloud Logs' Memcached cluster to 50TB and improved reliability](https://grafana.com/blog/2023/08/23/how-we-scaled-grafana-cloud-logs-memcached-cluster-to-50tb-and-improved-reliability/).
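
For readers unfamiliar with `extstore`, here is a minimal sketch of what enabling it could look like. It only assembles a Memcached command line and does not start anything. The helper name, memory size, file path, and disk size are placeholder assumptions, not values taken from this PR or from the linked blog post. The `-m` flag and the `-o ext_path=<file>:<size>` option are standard Memcached options.

```python
import shlex


def memcached_extstore_args(memory_mb: int, ext_path: str, ext_size: str) -> list[str]:
    """Build a Memcached command line with extstore enabled.

    Hypothetical helper for illustration only: all values passed in
    are assumptions and must be sized for the actual deployment.
    """
    return [
        "memcached",
        "-m", str(memory_mb),                      # RAM limit in megabytes
        "-o", f"ext_path={ext_path}:{ext_size}",   # extstore file and its size on disk
    ]


# Placeholder values: 1 GiB of RAM backed by a 64 GiB file on local disk.
args = memcached_extstore_args(1024, "/var/lib/memcached/extstore", "64G")
print(shlex.join(args))
```

With extstore enabled, Memcached can move item values to the file on disk and keep less of the cache in RAM, which is the approach the blog post describes.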


## Periodic latency spikes when cutting blocks

Depending on the workload, users might witness latency spikes when Grafana Mimir cuts blocks.
Contributor

Suggested change
Depending on the workload, users might witness latency spikes when Grafana Mimir cuts blocks.
Depending on the workload, you might witness latency spikes when Mimir cuts blocks.

## Periodic latency spikes when cutting blocks

Depending on the workload, users might witness latency spikes when Grafana Mimir cuts blocks.
To reduce the impacts of this behavior, users might consider:
Contributor

Suggested change
To reduce the impacts of this behavior, users might consider:
To reduce the impact of this behavior, consider the following:

Depending on the workload, users might witness latency spikes when Grafana Mimir cuts blocks.
To reduce the impacts of this behavior, users might consider:

- Upgrade to `2.15+`, See: <https://github.com/grafana/mimir/commit/03f2f06e1247e997a0246d72f5c2c1fd9bd386df>
Contributor

Suggested change
- Upgrade to `2.15+`, See: <https://github.com/grafana/mimir/commit/03f2f06e1247e997a0246d72f5c2c1fd9bd386df>
- Upgrade to `2.15+`. Refer to <https://github.com/grafana/mimir/commit/03f2f06e1247e997a0246d72f5c2c1fd9bd386df>.

To reduce the impacts of this behavior, users might consider:

- Upgrade to `2.15+`, See: <https://github.com/grafana/mimir/commit/03f2f06e1247e997a0246d72f5c2c1fd9bd386df>
- Reduce `-blocks-storage.tsdb.block-ranges-period`, default `2h`, For example try `1h`
Contributor

Suggested change
- Reduce `-blocks-storage.tsdb.block-ranges-period`, default `2h`, For example try `1h`
- Reduce `-blocks-storage.tsdb.block-ranges-period`, default `2h`. For example, try `1h`.
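
To make the trade-off behind this suggestion concrete, the sketch below counts how many block boundaries fall in one day for the default `2h` range and for the suggested `1h`. The helper name is hypothetical. The idea that each cut handles about half as much data with `1h` assumes a steady ingestion rate; the PR itself does not spell out this reasoning.

```python
from datetime import timedelta


def block_cuts_per_day(block_range: timedelta) -> int:
    """Return how many block ranges fit in 24 hours.

    Hypothetical helper: it models only the cadence of block cuts,
    not the cost of an individual cut.
    """
    return timedelta(days=1) // block_range


# Compare the default value with the value suggested in the review.
for label, block_range in (("2h (default)", timedelta(hours=2)), ("1h", timedelta(hours=1))):
    print(f"block-ranges-period={label}: {block_cuts_per_day(block_range)} cuts per day")
```

Halving the period doubles the number of cuts, so the setting trades fewer, larger cuts for more frequent, smaller ones.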

@wilfriedroset
Collaborator Author

Thank you @tacole02 for your review. I've taken your remarks into account.

This commit is a followup of the PR grafana#9149 where we discussed the idea of
centralizing the known tuning associated with their use cases.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

There's already the "Production tips" article. Does it make sense to merge this content there?

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
@wilfriedroset
Collaborator Author

Thank you @dimitarvdimitrov, there is indeed no need to create a dedicated new page. I've merged the new content into the production-tips page.

Contributor

@dimitarvdimitrov dimitarvdimitrov left a comment

Thank you for writing this ❤️

@dimitarvdimitrov dimitarvdimitrov merged commit cd801a9 into grafana:main Dec 10, 2024
29 checks passed
bjorns163 pushed a commit to bjorns163/mimir that referenced this pull request Dec 30, 2024
* Add tuning documentation

This commit is a followup of the PR grafana#9149 where we discussed the idea of
centralizing the known tuning associated with their use cases.

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

* Merge tuning documentation into production-tips

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>

---------

Signed-off-by: Wilfried Roset <wilfriedroset@users.noreply.github.com>
Labels
type/docs Improvements or additions to documentation
3 participants