
Memory leak after large number of upload jobs #3

Closed
kandybaby opened this issue Nov 17, 2023 · 5 comments
Assignees
Labels
bug Something isn't working

Comments

@kandybaby
Owner

I was doing some testing at scale when I found that the Docker container was not releasing a large amount of memory after running over 100 large upload jobs overnight.

The current recommendation is to periodically restart the container after running uploads. I will push a fix as soon as I have found the source of the leak.

@kandybaby kandybaby self-assigned this Nov 17, 2023
@kandybaby kandybaby added the bug Something isn't working label Nov 17, 2023
@kandybaby
Owner Author

kandybaby commented Nov 17, 2023

When GC runs, it does free up most of the memory. This looks like it might be more of an issue with the AWS CRT Transfer Manager, so I am going to open an issue on the SDK for that. It turns out that even a single transfer of a single file uses a ton of memory that is not released for a while.

In the meantime, I will push a new image with a default max heap size of 1 GB. This will cause the GC to clean up during uploads, and the app won't idle at more than 1 GB of memory. I will also introduce an env variable to manually set the max heap size. That will come tomorrow.
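For reference, a heap cap can also be applied from outside the application with the standard JAVA_TOOL_OPTIONS environment variable, which the JVM reads at startup. The snippet below is only a minimal sketch of that idea; the image name is a placeholder, and the dedicated env variable mentioned above may end up with a different name.

```sh
# Minimal sketch: cap the JVM heap at 1 GB using the standard JAVA_TOOL_OPTIONS
# environment variable. The image name is a placeholder, not the real image.
docker run -d -e JAVA_TOOL_OPTIONS="-Xmx1g" example/upload-app:latest
```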

@kandybaby kandybaby pinned this issue Nov 17, 2023
@kandybaby
Owner Author

The fix is still incoming. Unfortunately it's becoming quite complicated: adding a JVM memory limit fixes the leak when running the JVM directly, but not in the Docker container. Adding a container memory limit triggers GC frequently during an upload, but the container still accumulates memory well beyond the set limit and does not release it; this behavior does not occur outside of the Docker container. There was a similar unsolved issue raised on the AWS SDK for the AWS CRT, which this application uses. I might need to refactor the entire upload logic to use the older S3 async client and manage the multipart uploads directly in order to avoid this problem.
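For context, managing a multipart upload directly with the plain S3AsyncClient would look roughly like the sketch below. This is not the project's actual code: the bucket, key, part size, and class name are placeholders, and retries, error handling, and abort-on-failure are omitted.

```java
import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.*;

import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Rough sketch of a manual multipart upload with the standard S3AsyncClient,
// avoiding the CRT-based transfer manager. Placeholder names throughout.
public class ManualMultipartUpload {

    public static void upload(S3AsyncClient s3, String bucket, String key, Path file) throws Exception {
        final long partSize = 64L * 1024 * 1024; // 64 MiB per part (placeholder value)

        // 1. Start the multipart upload and remember its id.
        String uploadId = s3.createMultipartUpload(CreateMultipartUploadRequest.builder()
                .bucket(bucket).key(key).build()).join().uploadId();

        List<CompletedPart> completedParts = new ArrayList<>();
        try (FileChannel channel = FileChannel.open(file, StandardOpenOption.READ)) {
            long fileSize = channel.size();
            int partNumber = 1;

            // 2. Upload one bounded part at a time, so only a single part's worth
            //    of data is ever held in memory.
            for (long offset = 0; offset < fileSize; offset += partSize, partNumber++) {
                int length = (int) Math.min(partSize, fileSize - offset);
                ByteBuffer buffer = ByteBuffer.allocate(length);
                while (buffer.hasRemaining()) {
                    channel.read(buffer, offset + buffer.position());
                }
                buffer.flip();

                UploadPartResponse response = s3.uploadPart(
                        UploadPartRequest.builder()
                                .bucket(bucket).key(key)
                                .uploadId(uploadId).partNumber(partNumber).build(),
                        AsyncRequestBody.fromByteBuffer(buffer)).join();

                completedParts.add(CompletedPart.builder()
                        .partNumber(partNumber).eTag(response.eTag()).build());
            }
        }

        // 3. Ask S3 to assemble the uploaded parts into the final object.
        s3.completeMultipartUpload(CompleteMultipartUploadRequest.builder()
                .bucket(bucket).key(key).uploadId(uploadId)
                .multipartUpload(CompletedMultipartUpload.builder().parts(completedParts).build())
                .build()).join();
    }
}
```

Because the parts are uploaded sequentially and each buffer goes out of scope after its join(), peak heap use stays around a single part size rather than the whole file.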

@kandybaby
Owner Author

I think this is the relevant issue: aws/aws-sdk-java-v2#4034

@kandybaby
Owner Author

I know it's been a while, but it looks like AWS may have addressed the memory bug. I'll try to update and release before Christmas.

@kandybaby
Owner Author

Seems to be resolved as of 0.1.2
