-
Notifications
You must be signed in to change notification settings - Fork 62
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Ability to execute only some health checks #81
Comments
Since
|
Copied from my comment on PR #142: I would also recommend to add a new annotation to specify custom health checks groups, something like |
I really like the group idea. |
In fact, |
Group idea is interesting. We could even imagine slightly change |
With Liveness and Readiness question is more about creating a custom type of HC |
So we go for |
Discussion reopened after community concerns |
I'd like to propose an alternative approach that would achieve a similar result but without increasing the API surface of the spec much. Currently a health check can have multiple checks in it like so: {
"status": "DOWN",
"checks": [
{
"name": "check1",
"status": "UP"
},
{
"name": "check2",
"status": "UP"
},
{
"name": "check3",
"status": "DOWN"
}
]
} Instead of making the application define a group name and having the observer use that group name (i.e. For example, using the above example suppose I just want the status of {
"status": "DOWN",
"checks": [
{
"name": "check2",
"status": "UP"
},
{
"name": "check3",
"status": "DOWN"
}
]
} I think this approach has several benefits:
|
@aguibert your alternative approach definitely solves the issues that was brought up, however only have one concern... The request URL might become very long if we want multiple checks in the filter parameters Maybe, we could still use the |
I don't think long URLs is necessarily an issue, and IMO it's not enough reason to have an entirely separate mechanism (i.e. health groups) |
Both approaches can be combined and we can allow users to choose. Maybe we can ask what would be the preference on the microprofile mailing list. Also couldn't it be more problematic to share concrete names of the health checks with everyone that is calling the service (instead of 3 groups I need to pass the names of 12 HC classes)? And not so easy to add or remove a check from a group without breaking all clients. My 2c. |
Unless there is a good reason, I think it's better to only give users one way to accomplish something, otherwise it can cause confusion. Sharing concrete names could be an issue, but I see 2 generic types of health checks:
Provider-defined checks will be stable due to providers not wanting to disrupt their users (changing HC name would be a breaking change). For app-defined checks, developers can already "group" multiple health checks into a single check if they want to with existing MP Health API. The most common use-case I see here is an observer wanting to observe >=1 provider-defined health checks (stable names) and 0/1 app-defined health checks. |
@aguibert I agree that one option will be better.
provider-defined HC can already be disabled by
Exactly. Readiness and Liveness are already groups. The proposal is to extend these two groups with custom app-defined groups which are not suitable for readiness or liveness checks. |
The
If someone uses the regular |
Yes, @health is not included in any of ready or live checks. The plan is to remove @health in 3.0 in June. What do you mean that it would be a solution to this problem? |
It seems to me like it's a solution to the problem because that's a health check a user can write which falls outside of the category of liveness or readiness groups. Additionally a user can give the response for an |
@aguibert , I think what you are getting at is you could...
I think we still need to settle on the prime use case for this though. Is it so that a client can decide which health checks they want to run for deeper problem determination in cases where those health checks may be too expensive to run all the time as part of liveness/readiness probes? or something else? |
... and if that is the use case, perhaps we would also need a way to query the list of health checks without running them. |
Perhaps by executing an |
@johnament Are you able to describe the primary use case you were thinking of when you opened this? I think that would help to work out where a client or server approach is better and from that whether things like a check query api is necessary. I know that Kubernetes isn’t the only valid use case, but that is a use case I understand and can get my head around. I think the usage scenario of this is less obvious. I’m not saying I doubt that a use case exists, just that I’m struggling to understand one based on this issue. |
Originally when I had this, it was actually for two reasons. The first was to differentiate liveliness and readiness checks (granted, when I first implemented this capability in the platform I was working on, it was simply described as I need a way to run these checks and those checks, and when it came down to seeing what checks they were it was the difference between checking liveliness and readiness). However, after that we actually realized something useful in our cloud monitoring capabilities, by exposing the list of checks to execute we can actually do things like narrow down nodes that were unavailable or pieces of our system that were failing due to the individual checks that would pass/fail. I think now that liveliness/readiness are effectively the same as a health check group. Having other groups may make sense, and having the application dictate "this is what it means for my service to be up and operating within reason" or "this is what it means for me to start servicing new requests" makes a ton of sense. However, when it comes to an ops team needing to identify what part of a system is failing, they sometimes need to be able to specify only a certain check to run and want to focus on that one problem area (because sometimes, even the best developer leaves an accidental dependency on two external components in their code instead of just one). Being able to list out the available checks makes sense, but honestly I was thinking the execution of a single check is by a person, not by a system, and more likely than not that person would know the name of the check they want to execute and not have to rely on calling an API (then you have to start to consider that a developer may name their check FWIW, the way we did this back in the day (when I first put in this ticket) was a query string, |
Hello @ALL, From my understanding there was some confusion in #232 regarding the use case of introducing HealthGroups, correct? I think the main reason for such confusion is the different view on technical and functional health of a service. From my point of view it is all about monitoring, service restoration and different requirements on how to handle health errors. So I'll try to explain it. In both cases my overall service health is bad, but different things need to be done to restore the service health . BY now C is not possible in MP Health (please interrupt me, if I am wrong). I am not an expert on this but from my point of view Spring Boot does a great job explaining and reasoning about the usage of health groups, and comparing them with Readiness & Liveness endpoints. Any chance to get such a feature in MP Health as well? I am absolutely sure devs out there will make use of it! Thanks for such a great job on MP. I really love this spec! |
Hi @migoldschmidt, I understand and can tell you that now I agree. We have had health groups in Quarkus/Smallrye Health for some time, and they are being used - https://smallrye.io/docs/smallrye-health/3.0.1/health-groups.html. So if you fancy doing a PR, that would be great. |
It would be useful in the wire format if we specified a way to execute only some health checks, rather than all discovered
@Health
health checks.I could see this being done from both client and server side. From client perspective, listing out a set of names to run, or a group name to run. On the server, I could see marker interfaces, annotations, or even a simple use of class name/
@Inject @Named
support.The text was updated successfully, but these errors were encountered: