
Memory leak after large number of upload jobs #3

Closed
kandybaby opened this issue Nov 17, 2023 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@kandybaby
Owner

I was doing some testing at scale when I found that the Docker container was not releasing a large amount of memory after running over 100 large upload jobs overnight.

The current recommendation is to periodically restart the container after running uploads. I will push a fix as soon as I have found the source of the leak.

@kandybaby kandybaby self-assigned this Nov 17, 2023
@kandybaby kandybaby added the bug Something isn't working label Nov 17, 2023
@kandybaby
Owner Author

kandybaby commented Nov 17, 2023

When GC runs, it does free up most of the memory. This looks like it might be more of an issue with the AWS CRT Transfer Manager, so I am going to open an issue on the SDK for that. It turns out that even a single transfer of a single file uses a ton of memory that is not released for a while.

In the meantime, I will push a new image with a default max heap size of 1 GB. This will cause the GC to clean up during uploads, and the app won't idle at more than 1 GB of memory. I will also introduce an env variable to manually set the max heap size. That will come tomorrow.
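For reference, a heap cap can also be applied from outside the application with the standard JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup. The snippet below is only a minimal sketch of that idea; the image name is a placeholder, and the dedicated env variable mentioned above may end up with a different name.

```sh
# Minimal sketch: cap the JVM heap at 1 GB using the standard JAVA_TOOL_OPTIONS
# environment variable. The image name is a placeholder, not the real image.
docker run -d -e JAVA_TOOL_OPTIONS="-Xmx1g" example/upload-app:latest
```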

@kandybaby kandybaby pinned this issue Nov 17, 2023
@kandybaby
Owner Author

The fix is still incoming. Unfortunately it's becoming quite complicated: adding a JVM memory limit fixes the leak when running the JVM directly, but not in the Docker container. Adding a container memory limit triggers GC frequently during an upload, but the container still accumulates memory well beyond the set limit and does not release it; this behavior does not occur outside of the Docker container. There was a similar unsolved issue raised on the AWS SDK for the AWS CRT, which this application uses. I might need to refactor the entire upload logic to use the older S3 async client and manage the multipart uploads directly in order to avoid this problem.
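For context, managing a multipart upload directly with the plain S3AsyncClient would look roughly like the sketch below. This is not the project's actual code: the bucket, key, part size, and class name are placeholders, and retries, error handling, and abort-on-failure are omitted.

```java
import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.*;

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Rough sketch of a manual multipart upload with the standard S3AsyncClient,
// avoiding the CRT-based transfer manager. Placeholder names throughout.
public class ManualMultipartUpload {

    public static void upload(S3AsyncClient s3, String bucket, String key, Path file) throws Exception {
        final long partSize = 64L * 1024 * 1024; // 64 MiB per part (placeholder value)

        // 1. Start the multipart upload and remember its id.
        String uploadId = s3.createMultipartUpload(CreateMultipartUploadRequest.builder()
                .bucket(bucket).key(key).build()).join().uploadId();

        List<CompletedPart> completedParts = new ArrayList<>();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            int partNumber = 1;

            // 2. Upload one bounded part at a time, so only a single part's worth
            //    of data is ever held in memory.
            for (long offset = 0; offset < fileSize; offset += partSize, partNumber++) {
                int length = (int) Math.min(partSize, fileSize - offset);
                ByteBuffer buffer = ByteBuffer.allocate(length);
                while (buffer.hasRemaining()) {
                    channel.read(buffer, offset + buffer.position());
                }
                buffer.flip();

                UploadPartResponse response = s3.uploadPart(
                        UploadPartRequest.builder()
                                .bucket(bucket).key(key)
                                .uploadId(uploadId).partNumber(partNumber).build(),
                        AsyncRequestBody.fromByteBuffer(buffer)).join();

                completedParts.add(CompletedPart.builder()
                        .partNumber(partNumber).eTag(response.eTag()).build());
            }
        }

        // 3. Ask S3 to assemble the uploaded parts into the final object.
        s3.completeMultipartUpload(CompleteMultipartUploadRequest.builder()
                .bucket(bucket).key(key).uploadId(uploadId)
                .multipartUpload(CompletedMultipartUpload.builder().parts(completedParts).build())
                .build()).join();
    }
}
```

Because the parts are uploaded sequentially and each buffer goes out of scope after its join(), peak heap use stays around a single part size rather than the whole file.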

@kandybaby
Owner Author

I think this is the relevant issue: aws/aws-sdk-java-v2#4034

@kandybaby
Owner Author

I know it's been a while, but it looks like AWS may have addressed the memory bug. I'll try to update and release before Christmas.

@kandybaby
Owner Author

Seems to be resolved as of 0.1.2
