Kubernetes client native integration test OOMs (GC overhead limit reached) with -Xmx5g #37142
Comments
FTR: the same test builds fine with Mandrel but still has the increased reachable types. https://github.com/graalvm/mandrel/actions/runs/6885132117/job/18729383262#step:12:201
#36312 was merged 2 days ago, so it could be related (not verified).
cc @manusa
Sorry, I'm not familiar with the graalvm/mandrel pipelines. Does this mean that even with the dependency exclusions (and the hack extension to remove the link check) GraalVM is still running out of memory?
If this is the case, there are a few other unused modules that could be excluded too (these are the larger ones):
We could bump the memory limit, but since the static analysis results differ wildly, I'd rather we do some investigation of whether or not this could be reduced. The end result will also influence image size. To back that up with some numbers: I see Compare before #36312 and after.
We (Quarkus QE) see this bug on Mandrel as well. Reproducer:
Logs output:
Thanks. It doesn't seem GraalVM CE/Mandrel related. We see it in CI sometimes passing (producing a 200MB binary!) and sometimes failing (Watchdog timeout or GC overhead limit).
This is not CI related. I just reproduced it on my very powerful workstation. There is an actual, serious issue.
Although
Sorry, I just tried it with
I had a better look at #36312 and although I don't have a solution I see the following things that are concerning and are all related to getting the CI happy:
The above changes apply only to the test, meaning that what we test is not what the users will use. Furthermore, allowing an incomplete class path opens the door to issues related to reachable but not included code going unnoticed. Removing the make-ci-happy changes, I get the following numbers, which are even worse than the ones @jerboaa reported:
which is probably closer to what Quarkus users will get, while before #36312 it was
I agree with @jerboaa that the most important thing here is not the higher resource utilization at build time, but why the same test now requires so much more data in the binary (which is probably related to the resource utilization at build time).
If they are unused only in the test, that's not the right way to go. If they are generally unused, we should find a way to exclude them in general, not only in the test. I will try to do some analysis but I don't know what the ETA will be.
Let me give a little bit of context to make things clearer. The The For example, one can easily query bareMetalHosts by doing client.bareMetalHosts().withName("the-name").edit(host -> new BareMetalHostBuilder(host).editMetadata().addToAnnotations("foo", "bar").endMetadata().build()); in a single statement. The problem is that since these methods reference classes from those modules, even if the user knows they aren't going to use them and excludes the model modules, the native image compiling process still complains about the unlinked classes.
We're tracking this at fabric8io/kubernetes-client#5592. So besides what's proposed in #37278, other options would include breaking down the openshift-client into smaller clients.
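To illustrate the coupling described above, here is a minimal, self-contained sketch. Every class in it (Client, HostOperation, HostResource, BareMetalHost, BareMetalHostBuilder) is a hypothetical stand-in written for this example, not the real fabric8 code, and the builder is simplified to drop the editMetadata()/endMetadata() nesting. It only shows the shape of the problem: the client entry point has a method whose signature mentions the model type, so the client class refers to the model class even when a given application never calls that method.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical stand-ins only; none of these are the real fabric8 classes.
public class FluentClientSketch {

    // Stand-in for a model class that would live in a separate model module.
    static final class BareMetalHost {
        final String name;
        final Map<String, String> annotations;

        BareMetalHost(String name, Map<String, String> annotations) {
            this.name = name;
            this.annotations = Map.copyOf(annotations);
        }
    }

    // Stand-in for a builder that copies a host and lets the caller modify it.
    static final class BareMetalHostBuilder {
        private final String name;
        private final Map<String, String> annotations;

        BareMetalHostBuilder(BareMetalHost host) {
            this.name = host.name;
            this.annotations = new HashMap<>(host.annotations);
        }

        BareMetalHostBuilder addToAnnotations(String key, String value) {
            annotations.put(key, value);
            return this;
        }

        BareMetalHost build() {
            return new BareMetalHost(name, annotations);
        }
    }

    // Stand-in for the handle on a single named resource.
    static final class HostResource {
        private final Map<String, BareMetalHost> store;
        private final String name;

        HostResource(Map<String, BareMetalHost> store, String name) {
            this.store = store;
            this.name = name;
        }

        // Applies the function to the stored host and saves the result.
        BareMetalHost edit(UnaryOperator<BareMetalHost> fn) {
            BareMetalHost updated = fn.apply(store.get(name));
            store.put(name, updated);
            return updated;
        }
    }

    // Stand-in for the operation object returned by the DSL entry point.
    static final class HostOperation {
        private final Map<String, BareMetalHost> store;

        HostOperation(Map<String, BareMetalHost> store) {
            this.store = store;
        }

        HostResource withName(String name) {
            return new HostResource(store, name);
        }
    }

    // Stand-in for the client. bareMetalHosts() ties this class to the
    // BareMetalHost model type whether or not an application calls it.
    static final class Client {
        private final Map<String, BareMetalHost> hosts = new HashMap<>();

        Client() {
            hosts.put("the-name", new BareMetalHost("the-name", Map.of()));
        }

        HostOperation bareMetalHosts() {
            return new HostOperation(hosts);
        }
    }

    // Mirrors the single-statement edit from the comment above.
    static String demo() {
        Client client = new Client();
        BareMetalHost edited = client.bareMetalHosts().withName("the-name")
                .edit(host -> new BareMetalHostBuilder(host).addToAnnotations("foo", "bar").build());
        return edited.annotations.get("foo");
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch, deleting the BareMetalHost class would break compilation of Client even for callers that never touch bareMetalHosts(), which is a rough analogue of the unlinked-class complaints described above when a model module is excluded.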
@manusa in the comments on the fabric8 issue you said [1] that you were planning a Quarkus-specific fix for the bug. Did it materialize? The issue still affects us as of 3.7.2.
I can't remember now what exactly I meant with that comment (I should have elaborated more). What I can remember is that I added the MiscellaneousSubstitutions and OperatorSubstitutions that allow for the With the current state of the client and the Quarkus extensions, I'm not sure there's something else that preserves the current API. The only options that I can think of are more aggressive.
This was more extensively discussed in #38683 and is now fixed by #38886 and fabric8io/kubernetes-client#5759
Describe the bug
Since today in Mandrel CI, the kubernetes-client native integration test OOMs with a GraalVM master build:
See:
https://github.com/graalvm/mandrel/actions/runs/6885132117/job/18729490442#step:12:207
It looks like the types reached and methods reached have increased significantly from a run that last worked 2 days ago here:
https://github.com/graalvm/mandrel/actions/runs/6885132117/job/18729490442#step:12:207
The specifics are as follows:
GOOD
BAD
Was there a change recently which could have caused this?
It's also concerning that we now see this (not in the passing test):