Memory leak after large number of upload jobs #3
Comments
When GC runs it does free up most of the memory. This looks like it may be more of an issue with the AWS CRT-based Transfer Manager, so I am going to open an issue on the SDK for that. It turns out that even a single transfer of one file uses a large amount of memory that is not released for a while. In the meantime, I will push a new image with a default max heap size of 1 GB. This will force the GC to clean up during uploads, so the app won't idle at more than 1 GB of memory. I will also introduce an env variable to manually set this max heap size. That will come tomorrow.
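(The dedicated env variable mentioned above isn't named in this thread. As a stopgap, the standard `JAVA_TOOL_OPTIONS` variable, which the JVM reads at startup, can cap the heap of a containerised JVM without rebuilding the image; the image name below is just a placeholder.)

```sh
# Placeholder image name; JAVA_TOOL_OPTIONS is picked up by the JVM at startup,
# so -Xmx1g caps the heap and forces GC to run during large uploads.
docker run -e JAVA_TOOL_OPTIONS="-Xmx1g" example/uploader:latest
```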
A fix is still incoming. Unfortunately it's becoming quite complicated: adding a JVM memory limit fixes the leak when running the JVM directly, but not the Docker container. Adding a container memory limit triggers GC frequently during an upload, but overall the container still accumulates memory well beyond the set limit and does not release it. That approach is not applicable outside of the Docker container anyway. There was a similar unsolved issue raised on the AWS SDK for the AWS CRT, which this application uses. I may need to refactor the entire upload logic to use the older S3 async client and manage the multipart uploads directly, in order to avoid this problem.
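For anyone curious, a minimal sketch of that direction: driving the multipart upload manually with the plain `S3AsyncClient` instead of the CRT-based Transfer Manager, so only one part-sized buffer is resident at a time. This is not the app's actual implementation; bucket, key, class name, and part size are illustrative, and retries/error handling are omitted.

```java
import software.amazon.awssdk.core.async.AsyncRequestBody;
import software.amazon.awssdk.services.s3.S3AsyncClient;
import software.amazon.awssdk.services.s3.model.CompletedMultipartUpload;
import software.amazon.awssdk.services.s3.model.CompletedPart;

import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ManualMultipartUpload {

    private static final int PART_SIZE = 8 * 1024 * 1024; // 8 MiB per part (S3 minimum is 5 MiB)

    public static void upload(S3AsyncClient s3, String bucket, String key, Path file) throws Exception {
        // Start the multipart upload and keep its id for the part uploads.
        String uploadId = s3.createMultipartUpload(b -> b.bucket(bucket).key(key))
                .join()
                .uploadId();

        List<CompletedPart> parts = new ArrayList<>();
        byte[] buffer = new byte[PART_SIZE];

        try (InputStream in = Files.newInputStream(file)) {
            int partNumber = 1;
            int read;
            while ((read = in.readNBytes(buffer, 0, PART_SIZE)) > 0) {
                byte[] chunk = Arrays.copyOf(buffer, read);
                int pn = partNumber;
                // Upload one part at a time; only this chunk is held in memory.
                String eTag = s3.uploadPart(
                                b -> b.bucket(bucket).key(key).uploadId(uploadId).partNumber(pn),
                                AsyncRequestBody.fromBytes(chunk))
                        .join()
                        .eTag();
                parts.add(CompletedPart.builder().partNumber(pn).eTag(eTag).build());
                partNumber++;
            }
        }

        // Stitch the uploaded parts together on the S3 side.
        s3.completeMultipartUpload(b -> b.bucket(bucket).key(key)
                        .uploadId(uploadId)
                        .multipartUpload(CompletedMultipartUpload.builder().parts(parts).build()))
                .join();
    }
}
```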
I think this is the relevant issue.
I know it's been a while, but it looks like AWS may have addressed the memory bug. I'll try to update and release before Christmas.
Seems to be resolved as of 0.1.2 |
I was doing some testing at scale when I found that the Docker container was not releasing a large amount of memory after running over 100 large upload jobs overnight.
The current recommendation is to periodically restart the container after running uploads. I will push a fix as soon as I have found the source of the leak.
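As a sketch of that stopgap (the container name below is hypothetical), a host-side cron entry can restart the container between upload runs:

```sh
# Hypothetical container name; restart nightly at 04:00 to release accumulated memory.
0 4 * * * docker restart uploader
```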