Out-of-memory handling in NoGC #1177

Open
wks opened this issue Jul 25, 2024 · 1 comment
Comments

@wks
Collaborator

wks commented Jul 25, 2024

When NoGC runs out of memory, it panics in NoGC::schedule_collection via an unreachable!() macro. A recent PR, #1175, attempts to move the panic earlier, into GCTrigger::poll.

However, we do have an out-of-memory handler, Collection::out_of_memory. It allows the VM to handle OOM events in a VM-specific way, such as throwing OutOfMemoryError. Currently, NoGC panics before reaching any call site of Collection::out_of_memory.
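To illustrate what "VM-specific" handling means here, the following is a minimal toy model, not the real mmtk-core trait. The names `ToyJavaVm`, `pending_exception`, and `allocate`, as well as the simplified `out_of_memory` signature, are assumptions made up for this sketch.

```rust
/// Simplified stand-in for the kind of allocation failure reported to the VM.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum AllocationError {
    HeapOutOfMemory,
}

/// Toy model of the VM-side collection interface (NOT the real mmtk-core trait).
pub trait Collection {
    /// Called when an allocation cannot be satisfied.
    /// The default gives up by panicking; a VM may override it.
    fn out_of_memory(&mut self, err: AllocationError) {
        panic!("Out of memory: {:?}", err);
    }
}

/// Hypothetical Java-like VM that turns OOM into a pending exception.
#[derive(Default)]
pub struct ToyJavaVm {
    pub pending_exception: Option<String>,
}

impl Collection for ToyJavaVm {
    fn out_of_memory(&mut self, err: AllocationError) {
        // Record an exception for the mutator to throw instead of aborting.
        self.pending_exception = Some(format!("java.lang.OutOfMemoryError: {:?}", err));
    }
}

/// Toy allocation path: notifies the VM and returns None when the heap is full,
/// otherwise returns the offset of the new object.
pub fn allocate<VM: Collection>(
    vm: &mut VM,
    used: usize,
    size: usize,
    heap: usize,
) -> Option<usize> {
    if used + size > heap {
        vm.out_of_memory(AllocationError::HeapOutOfMemory);
        None
    } else {
        Some(used)
    }
}
```

With this shape, a failed allocation ends in the VM's handler rather than in a panic inside the plan, which is the behaviour NoGC currently lacks.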

When running Epsilon GC in OpenJDK 22, it throws OutOfMemoryError.

$ java -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC -Xms40M -Xmx40M -jar dacapo-23.11-chopin.jar lusearch
[0.002s][warning][gc,init] Consider enabling -XX:+AlwaysPreTouch to avoid memory commit hiccups
Using scaled threading model. 32 processors detected, 32 threads used to drive the workload, in a possible range of [1,2048]
Terminating due to java.lang.OutOfMemoryError: Java heap space

When running NoGC in MMTk, it panics with "internal error: entered unreachable code: GC triggered in nogc".

$ MMTK_PLAN=NoGC ~/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java -XX:MetaspaceSize=500M -XX:+UseThirdPartyHeap -Xms40M -Xmx40M -jar dacapo-23.11-chopin.jar lusearch
Using scaled threading model. 32 processors detected, 32 threads used to drive the workload, in a possible range of [1,2048]
thread '<unnamed>' panicked at /home/wks/projects/mmtk-github/mmtk-core/src/plan/nogc/global.rs:74:9:
internal error: entered unreachable code: GC triggered in nogc
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
fatal runtime error: failed to initiate panic, error 5
fish: Job 1, 'MMTK_PLAN=NoGC ~/projects/mmtk-…' terminated by signal SIGABRT (Abort)

When running SemiSpace in MMTk with a small heap size, it throws OutOfMemoryError, too.

$ MMTK_PLAN=SemiSpace ~/projects/mmtk-github/openjdk/build/linux-x86_64-normal-server-release/jdk/bin/java -XX:MetaspaceSize=500M -XX:+UseThirdPartyHeap -Xms10M -Xmx10M -jar dacapo-23.11-chopin.jar lusearch
Using scaled threading model. 32 processors detected, 32 threads used to drive the workload, in a possible range of [1,2048]
Version: lucene 9.7.0 (use -p to print nominal benchmark stats)
===== DaCapo 23.11-chopin lusearch starting =====
java.lang.reflect.InvocationTargetException
java.lang.reflect.InvocationTargetException
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at org.dacapo.harness.Lusearch.iterate(Lusearch.java:43)
        at org.dacapo.harness.Benchmark.run(Benchmark.java:253)
        at org.dacapo.harness.TestHarness.runBenchmark(TestHarness.java:225)
        at org.dacapo.harness.TestHarness.main(TestHarness.java:170)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
        at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.base/java.lang.reflect.Method.invoke(Method.java:566)
        at Harness.main(Unknown Source)
Caused by: java.lang.OutOfMemoryError: Java heap space
        at org.dacapo.harness.LatencyReporter.initialize(LatencyReporter.java:70)
        at org.dacapo.lusearch.Search.main(Search.java:141)
        ... 13 more

But we can't simply call Collection::out_of_memory in GCTrigger. We already have dedicated code paths that call Collection::out_of_memory, and they should be used rather than bypassed.

Mock testing

Meanwhile, some of our mock tests, such as allocate_with_re_enable_collection, still depend on block_for_gc to detect whether GC is triggered. When fixing this problem, we probably need to reserve a proper hook for the MockVM to detect that GC has been triggered.
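One possible shape for such a hook is sketched below. This is only an illustration under assumptions: `GcTriggerObserver`, `CountingObserver`, and `ToyTrigger` are hypothetical names, and nothing here reflects the actual MockVM or GCTrigger code.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical observer hook that a mock VM could install.
pub trait GcTriggerObserver: Send + Sync {
    fn on_gc_triggered(&self);
}

/// Observer that only counts how many times a GC was requested.
#[derive(Default)]
pub struct CountingObserver {
    count: AtomicUsize,
}

impl CountingObserver {
    pub fn count(&self) -> usize {
        self.count.load(Ordering::SeqCst)
    }
}

impl GcTriggerObserver for CountingObserver {
    fn on_gc_triggered(&self) {
        self.count.fetch_add(1, Ordering::SeqCst);
    }
}

/// Toy trigger: notifies the observer when an allocation does not fit.
pub struct ToyTrigger {
    heap_bytes: usize,
    used_bytes: usize,
    observer: Arc<dyn GcTriggerObserver>,
}

impl ToyTrigger {
    pub fn new(heap_bytes: usize, observer: Arc<dyn GcTriggerObserver>) -> Self {
        ToyTrigger { heap_bytes, used_bytes: 0, observer }
    }

    /// Returns true if the allocation fit. Otherwise reports a GC trigger
    /// to the observer and returns false, without blocking the caller.
    pub fn allocate(&mut self, size: usize) -> bool {
        if self.used_bytes + size > self.heap_bytes {
            self.observer.on_gc_triggered();
            false
        } else {
            self.used_bytes += size;
            true
        }
    }
}
```

A test could then assert on `count()` directly instead of inferring a GC request from a call to block_for_gc.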

@k-sareen
Collaborator

This should be easy to solve. We just need to add a check here to see if the plan can even collect. If it can't, then call Collection::out_of_memory like we do above with will_oom_on_acquire.
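A rough sketch of that check, as a toy model only: `ToyPlan`, its `can_collect` field, `PollOutcome`, `poll`, and `handle_poll` are hypothetical names for illustration and are not the mmtk-core API. The OOM and GC actions are passed in as closures so the decision logic can be read on its own.

```rust
/// What the allocation slow path should do after polling.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum PollOutcome {
    /// The request fits in the heap.
    Continue,
    /// The heap is full and the plan can reclaim memory.
    ScheduleCollection,
    /// The heap is full and the plan cannot collect at all.
    OutOfMemory,
}

/// Toy plan description. `can_collect` is a hypothetical flag.
pub struct ToyPlan {
    pub can_collect: bool,
    pub total_pages: usize,
    pub reserved_pages: usize,
}

/// Decide what to do for a request of `requested_pages`.
pub fn poll(plan: &ToyPlan, requested_pages: usize) -> PollOutcome {
    if plan.reserved_pages + requested_pages <= plan.total_pages {
        PollOutcome::Continue
    } else if plan.can_collect {
        PollOutcome::ScheduleCollection
    } else {
        // A plan like NoGC ends up here instead of hitting unreachable!().
        PollOutcome::OutOfMemory
    }
}

/// Dispatch on the poll result, delegating to caller-supplied actions.
pub fn handle_poll(
    plan: &ToyPlan,
    requested_pages: usize,
    mut out_of_memory: impl FnMut(),
    mut schedule_collection: impl FnMut(),
) -> PollOutcome {
    let outcome = poll(plan, requested_pages);
    match outcome {
        PollOutcome::Continue => {}
        PollOutcome::ScheduleCollection => schedule_collection(),
        PollOutcome::OutOfMemory => out_of_memory(),
    }
    outcome
}
```

In this model a non-collecting plan with a full heap reaches the OOM action and never the collection action.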
