Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add alertmanager client #82

Closed
wants to merge 4 commits into from
Closed

Conversation

mpenet
Copy link
Member

@mpenet mpenet commented Jan 27, 2025

Adds simple alertmanager client with a default implementation using aleph for http transport.

@mpenet mpenet force-pushed the mpenet/ch115506/alerting branch from e251016 to 0e72839 Compare January 27, 2025 14:42
@mpenet mpenet changed the title mpenet/ch115506/alerting feat: add alertmanager client Jan 28, 2025
@mpenet mpenet force-pushed the mpenet/ch115506/alerting branch from a5a064e to b9cab81 Compare January 28, 2025 11:53
@mpenet mpenet marked this pull request as ready for review January 28, 2025 11:57
Copy link

@arnaudgeiser arnaudgeiser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This approach works for me as long as you take care of doing the PRs on all the artifacts that would need to replace the Riemann client with the Alert client.

Down the line, we should aim at having those concerns directly available in Mania which should basically provide a base system and configuration to which we'll be able to augment with a subsystem from the artifact we are creating. But we are not there yet.

@mpenet
Copy link
Member Author

mpenet commented Jan 28, 2025

I am not super happy with this being part of reporter personally (it was my undestanding it had to be). I would rather have it separate and depended upon in mania down the line as you suggest. Otherwise, yes, the ultimate goal is to get out of riemann on all artifacts.

@arnaudgeiser
Copy link

I am not super happy with this being part of reporter personally (it was my undestanding it had to be). I would rather have it separate and depended upon in mania down the line as you suggest. Otherwise, yes, the ultimate goal is to get out of riemann on all artifacts.

This is a bit more nuanced than this and I would say it's two efforts:

  • A short-term one, where the ultimate goal is to move out from Riemann without effort (ideally zero) on the artifacts
  • A medium-term one, where we are shaping what we expect in terms of alerting from the orchestrators. Our stance (which still needs discussion) is that not everything can be metric-shaped and we should have alerts with enough context. The following write-up [1] is a good starting point

Regarding the short-term measure, it doesn't have to be implemented specially on reporter, but it has to be implemented somewhere centrally so it would be almost a no-op on the client side. An alternative path that could have been considered :

  • Vendor the reporter namespaces inside mania (a private, internal repository)
  • Alter the riemann-events! function to send alert to the Alertmanager API directly (and deprecate the function right away)
  • Hardcode as much as possible inside mania (URL and so on..)
  • Create a new alert! function (or send!) in a dedicated mania namespace that would eventually replace reporter
    (riemann-send client)

Then, on the consuming artifacts it would be just a matter of bumping mania and removing the reporter dependency. Maybe a bit more will be needed but you get the idea.

As you may I guessed, we don't have a strong appetite to have a special library dedicated just to this.

[1] : https://app.shortcut.com/exoscale/story/115506/orchestrators-alerting-strategy#activity-117365

@mpenet
Copy link
Member Author

mpenet commented Jan 29, 2025

closing in favor of https://github.com/exoscale/mania/pull/26 [wip]

@mpenet mpenet closed this Jan 29, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants