Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java.lang.OutOfMemoryError i.e. "Thread-0" or "pool-1-thread-4" #30

Open
BlindDuck opened this issue Feb 1, 2024 · 11 comments
Open

java.lang.OutOfMemoryError i.e. "Thread-0" or "pool-1-thread-4" #30

BlindDuck opened this issue Feb 1, 2024 · 11 comments
Assignees
Labels
support request Support Request

Comments

@BlindDuck
Copy link

BlindDuck commented Feb 1, 2024

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-0"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-1-thread-4"

Steps to reproduce the behavior:

  1. Enter the command:
    $ /usr/lib64/jvm/jre-11-openjdk/bin/java -jar /usr/local/bin/jacksum-3.7.0.jar --threads-hashing max --threads-reading max -a md5 --encoding hex --output-file-overwrite /tmp/jacksum-3.7.0-hashes.txt --error-file /tmp/jacksum.9403.error.txt --file-list-format list --file-list /tmp/jacksum-3.7.0-filelist.txt --verbose summary,warnings --header --path-relative-to /mnt/Nas8BaySammlung/0_neu/ /mnt/Nas8BaySammlung/0_neu/

The directory contains
1309,52GB
7020091 files
281204 subdirs
the parent dir has
2480,08GB
14922770 files
672472 subdirs

an other dir contains
1289,23GB
1287811 files
115790 subdirs

2/4. See the output/error:
"
Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-0"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-1-thread-4"
"

Expected behavior:
a smooth run through as on the other processed directories.
i.e /usr/lib
jacksum.95988.report.txt
jacksum.95988.output.txt
jacksum.95988.error.txt
jacksum.95932.output.txt
jacksum.95932.error.txt

env.txt

@BlindDuck BlindDuck added the support request Support Request label Feb 1, 2024
@jonelo jonelo self-assigned this Feb 3, 2024
@jonelo
Copy link
Owner

jonelo commented Feb 3, 2024

Thank you very much for the report.
The error message indicates that the process is running out of memory, the Java Heap to be exact.
Could you please enter the following commands on your system and paste the output here?

$ grep Mem /proc/meminfo
$ /usr/lib64/jvm/jre-11-openjdk/bin/java -XX:+PrintFlagsFinal -version | grep HeapSize

@BlindDuck
Copy link
Author

I have reproduced the error message on three different systems and installed also the resent openJDK.

system1:
grep Mem /proc/meminfo
MemTotal: 30740012 kB
MemFree: 4689416 kB
MemAvailable: 17962028 kB

/usr/lib64/jvm/jre-11-openjdk/bin/java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 492830720 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 7870611456 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
size_t ShenandoahSoftMaxHeapSize = 0 {manageable} {default}
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+0-suse-150000.3.86.2-x8664)
OpenJDK 64-Bit Server VM (build 11.0.17+0-suse-150000.3.86.2-x8664, mixed mode)

/opt/jdk-21.0.2/bin/java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 494927872 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 7872708608 {product} {ergonomic}
size_t MinHeapSize = 8388608 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5839372 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122909434 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122909434 {pd product} {ergonomic}
size_t SoftMaxHeapSize = 7872708608 {manageable} {ergonomic}
openjdk version "21.0.2" 2024-01-16
OpenJDK Runtime Environment (build 21.0.2+13-58)
OpenJDK 64-Bit Server VM (build 21.0.2+13-58, mixed mode, sharing)

system2:
grep Mem /proc/meminfo
MemTotal: 65562564 kB
MemFree: 4183688 kB
MemAvailable: 42483060 kB

/usr/lib64/jvm/jre-11-openjdk/bin/java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 1052770304 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 16785604608 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5836300 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122910970 {pd product} {ergonomic}
size_t ShenandoahSoftMaxHeapSize = 0 {manageable} {default}
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+0-suse-150000.3.86.2-x8664)
OpenJDK 64-Bit Server VM (build 11.0.17+0-suse-150000.3.86.2-x8664, mixed mode)

/opt/jdk-21.0.2/bin/java -XX:+PrintFlagsFinal -version | grep HeapSize size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 43620760 {product} {default}
size_t InitialHeapSize = 1056964608 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 16785604608 {product} {ergonomic}
size_t MinHeapSize = 8388608 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5839372 {pd product} {ergonomic}
uintx NonProfiledCodeHeapSize = 122909434 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 122909434 {pd product} {ergonomic}
size_t SoftMaxHeapSize = 16785604608 {manageable} {ergonomic}
openjdk version "21.0.2" 2024-01-16
OpenJDK Runtime Environment (build 21.0.2+13-58)
OpenJDK 64-Bit Server VM (build 21.0.2+13-58, mixed mode, sharing)

system3 rasberry Pi 4b:
sudo grep Mem /proc/meminfo
MemTotal: 3930868 kB
MemFree: 1572248 kB
MemAvailable: 2408712 kB

/usr/bin/java -XX:+PrintFlagsFinal -version | grep HeapSize
size_t ErgoHeapSizeLimit = 0 {product} {default}
size_t HeapSizePerGCThread = 33554432 {product} {default}
size_t InitialHeapSize = 62914560 {product} {ergonomic}
size_t LargePageHeapSizeThreshold = 134217728 {product} {default}
size_t MaxHeapSize = 1006632960 {product} {ergonomic}
uintx NonNMethodCodeHeapSize = 5242880 {pd product} {default}
uintx NonProfiledCodeHeapSize = 0 {pd product} {ergonomic}
uintx ProfiledCodeHeapSize = 0 {pd product} {ergonomic}
openjdk version "11.0.21" 2023-10-17
OpenJDK Runtime Environment (build 11.0.21+9-post-Raspbian-1deb10u1)
OpenJDK Server VM (build 11.0.21+9-post-Raspbian-1deb10u1, mixed mode)

@jonelo
Copy link
Owner

jonelo commented Feb 3, 2024

All JVMs on the three systems behave as expected. The JVMs (OpenJDK 11 or 21.02.2) set the maximum heap size of 1/4 of physical memory on each machine. In case of the weakest computer in the list, the 4 GiB Raspi, the JVM from the OpenJDK selects 1 GiB for the MaxHeapSize value for example. Well, 1 GiB of allowed Java Heap should be plenty enough memory for Jacksum to do the job!

I did some quick tests and limit the max Java Heap to 1 GB on my box, but I cannot reproduce the issue for now.
If you gather a Heap Dump from a faulty run, I could have a look at the dump.

You can allowing the JVM to write a heap dump (a .hprof file) by setting an appropriate JVM option. Please put that option after the java command and just before the -jar option. Example:

$ java -XX:+HeapDumpOnOutOfMemoryError -jar jacksum-3.7.0.jar ...

After how many seconds/minutes/hours is that reproducible?

@BlindDuck
Copy link
Author

BlindDuck commented Feb 4, 2024

for system1 after round about 1h the erros occours:
$ /usr/lib64/jvm/jre-11-openjdk/bin/java -XX:+HeapDumpOnOutOfMemoryError -jar /usr/local/bin/jacksum-3.7.0.jar --threads-hashing max --threads-reading max -a md5 --encoding hex --output-file-overwrite /tmp/jacksum-3.7.0-hashes.txt --error-file /tmp/jacksum.130322.error.txt --file-list-format list --file-list /tmp/jacksum-3.7.0-filelist.txt --verbose summary,warnings --header --path-relative-to /home/OldHome/ /home/OldHome/

Sun Feb 4 01:27:28 CET 2024
File PID: 130322
/tmp/jacksum.130322.output.txt
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid130326.hprof ...
Heap dump file created [8929970313 bytes in 59.886 secs]

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-0"

8929970313 4. Feb 02:38 java_pid130326.hprof
"File size too big: 25 MB are allowed, 561 MB were attempted to upload."
342787332 Feb 4 02:38 java_pid130326.hprof.xz
with https://wetransfer.com:
https://we.tl/t-LRxp41yuxk

$ /opt/jdk-21.0.2/bin/java -XX:+HeapDumpOnOutOfMemoryError -jar /usr/local/bin/jacksum-3.7.0.jar --threads-hashing max --threads-reading max -a md5 --encoding hex --output-file-overwrite /tmp/jacksum-3.7.0-hashes.txt --error-file /tmp/jacksum.134719.error.txt --file-list-format list --file-list /tmp/jacksum-3.7.0-filelist.txt --verbose summary,warnings --header --path-relative-to /home/OldHome/ /home/OldHome/

Sun Feb 4 09:55:01 CET 2024
File PID: 134719
/tmp/jacksum.134719.output.txt
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid134723.hprof ...
Heap dump file created [8931432106 bytes in 49.558 secs]

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-0"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-1-thread-4"
^C^C^COpenJDK 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGINT to handler- the VM may need to be forcibly terminated
OpenJDK 64-Bit Server VM warning: Exception java.lang.OutOfMemoryError occurred dispatching signal SIGINT to handler- the VM may need to be forcibly terminate

8931432106 Feb 4 11:10 java_pid134723.hprof
334900120 Feb 4 11:10 java_pid134723.hprof.xz
with https://wetransfer.com:
https://we.tl/t-9IzPHxIc1D

system2 with jdk21 did not throw an error till now, routhly the same start time.

@jonelo
Copy link
Owner

jonelo commented Feb 6, 2024

Thank you for gathering the Java Heap Dumps. Much appreciated.

I have loaded and analyzed the Java Heap dumps using MAT. 99,97 % (1st dump) resp. 99,96 % (2nd dump) of the Java heap is consumed by an instance of a LinkedBlockingQueue called the workQueue which is a member of a java.util.concurrent.ThreadPoolExececutor instance. That queue is used for holding tasks and handing off to worker threads.

In Jacksum the instance is being created by calling Executors.newFixedThreadPool(cores); see also

ExecutorService executorService = Executors.newFixedThreadPool(cores);

In your cases the producer seems to be much faster than the consumers and the queue have been flooded with 8.3 millions tasks.
Each task also contain the path to each file that needs to be hashed, and there are many paths with more than 2 k String length.
So it gets continuously flooded until the Out Of Memory Error occured and Jacksum didn't even get a chance to work on the remaining tasks.

By default the LinkedBlockingQueue is unbounded which is the reason for the issue I think.

I found a description of that problem (and some potential mitigations) very well described at
https://medium.com/@amardeepbhowmick92/task-queuing-in-executors-newfixedthreadpool-31bc8c24b4d2

So it seems that fortunately it is not a memory leak by definition, because all data is still needed to do the job,
but unfortunately, due to unforseen, but likely conditions (clients are still busy, and producer is unstoppable hardworking to produce work for others) it could lead to unexpected high memory consumption which leads to an OOM in worst case.

I still cannot reproduce the issue with any of my hardware, but I have created a potential fix for that, so that the producer will need to help with the work before it can delegate work to others if they are all busy.

The fix is incluced in the binary attached to this note. Would be great if you could test it and see whether it can avoid the OOM.
In theory it should work. Please let's see whether it works in real life as well.

It is not allowed to upload .jar to notes, so I have packed it using zip as a workaround.
jacksum-3.8.0-SNAPSHOT-20240206.jar.zip

@BlindDuck
Copy link
Author

Great!
Starting test runs...

@BlindDuck
Copy link
Author

Unfortunately it does not work out, sorry.

java -version
openjdk version "11.0.17" 2022-10-18
OpenJDK Runtime Environment (build 11.0.17+0-suse-150000.3.86.2-x8664)
OpenJDK 64-Bit Server VM (build 11.0.17+0-suse-150000.3.86.2-x8664, mixed mode)

system1:
Wed Feb 7 11:05:37 CET 2024
File PID: 11550
/tmp/jacksum.11550.output.txt
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid11554.hprof ...
Heap dump file created [8914258100 bytes in 61.682 secs]

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-0"
^Ccat: /tmp/jacksum.11550.txt: No such file or directory
Wed Feb 7 12:57:30 CET 2024

8914258100 Feb 7 12:16 java_pid11554.hprof
330988948 Feb 7 12:16 java_pid11554.hprof.xz
https://we.tl/t-m7wdxtSmng

@jonelo
Copy link
Owner

jonelo commented Feb 7, 2024

yes, it looks like my change had no effect at all. It still looks the same as before the change.
I am going to have a deeper look into it at the weekend. Thanks for your patience!

@BlindDuck
Copy link
Author

Sorry for causing that much efford and disturbing your weekend.

In the meanwhile system2 was also finished

system2:
/usr/lib64/jvm/jre-11-openjdk/bin/java -XX:+HeapDumpOnOutOfMemoryError -jar /usr/local/bin/jacksum-3.8.0-SNAPSHOT-20240206.jar --threads-hashing max --threads-reading max -a md5 --encoding hex --output-file-overwrite /tmp/jacksum-3.8.0-SNAPSHOT-20240206-hashes.txt --error-file /tmp/jacksum.9780.error.txt --file-list-format list --file-list /tmp/jacksum-3.8.0-SNAPSHOT-20240206-filelist.txt --verbose summary,warnings --header --path-relative-to XY XY

Wed Feb 7 16:49:56 CET 2024
File PID: 9780
/tmp/jacksum.9780.output.txt
java.lang.OutOfMemoryError: Java heap space
Dumping heap to java_pid9784.hprof ...
Heap dump file created [19016693628 bytes in 28.497 secs]

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Thread-0"

Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "pool-1-thread-6"

19016693628 Feb 7 18:15 java_pid9784.hprof
740732288 Feb 7 18:15 java_pid9784.hprof.xz
https://we.tl/t-N55uWYGNkm

@BlindDuck
Copy link
Author

Hi hi!
Might I rais my hand in request for a bug fix?

@jonelo
Copy link
Owner

jonelo commented Mar 10, 2024

My apologies @BlindDuck, I haven't got round to it yet. I will update this ticket soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
support request Support Request
Projects
None yet
Development

No branches or pull requests

2 participants