Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable actor heap large chunk recycling #4534

Merged
merged 1 commit into from
Oct 26, 2024

Conversation

dipinhora
Copy link
Contributor

The implementation of actor heap large chunk recycling from #4531 too naive and results in actors wasting huge amounts of time related to large chunk recycling.

This commit effectively disables the large chunk recycling but doesn't undo the code changes at the moment because the expectation is that large chunk recycling will be re-enabled in the near future with an improved implementation.

The implementation of actor heap large chunk recycling from ponylang#4531
too naive and results in actors wasting huge amounts of time related
to large chunk recycling.

This commit effectively disables the large chunk recycling but
doesn't undo the code changes at the moment because the expectation
is that large chunk recycling will be re-enabled in the near future
with an improved implementation.
@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Oct 26, 2024
@dipinhora
Copy link
Contributor Author

@SeanTAllen this PR should resolve the stress test issue created by #4531. Let me know if you prefer to rollback all the large chunk recycling related changes instead of temporarily disabling them like this PR currently does.

@SeanTAllen SeanTAllen merged commit a8c6f92 into ponylang:main Oct 26, 2024
23 checks passed
@ponylang-main ponylang-main removed the discuss during sync Should be discussed during an upcoming sync label Oct 26, 2024
@SeanTAllen
Copy link
Member

Let's see what happens with the stress tests tonight

@ponylang-main ponylang-main added the discuss during sync Should be discussed during an upcoming sync label Oct 26, 2024
@SeanTAllen
Copy link
Member

FYI @dipinhora, you can see a number of failures that appear related in the ponyup actions history from the last couple of days.

@SeanTAllen
Copy link
Member

@dipinhora ok, so after the large change, the ponyup macos nightly still hangs for apple silicon. That was using nightlies of tools to build (rather than release), so it is possible that the corral version that appears to be hanging hand't been updated to have the large chunk recycling turned off.

That ponyup build was supposed to be using release tools so i switched it. if need be we can switch it back to using nightly tools in order to diagnose further.

building the latest corral with the last ponyc on an apple silicon mac and trying to build ponyup while matching the nightly build options is also an option.

there were problems with some of last nights stress tests but some also passed. i'm not convinced that the failures last night are related to this (might have been some weird environmental stuff). i'm rerunning the stress test failures from last night at the moment. will update here again once those are done.

@SeanTAllen
Copy link
Member

@dipinhora windows stress tests OOMed:

I think that perhaps this recycle attempt is problematic in practice for some reasonably standard things like the stress test and we should consider rolling it back.

@SeanTAllen
Copy link
Member

@dipinhora Additional stress test failures. I think we should roll back. Thoughts?

@dipinhora
Copy link
Contributor Author

@SeanTAllen i finally figured out the real root causes.. two bugs in the original PR.. one which impacts only large chunk recycling and one memory leak for both small and large chunk recycling (hence the OOMs)..

i'll open a PR with the fixes shortly.. up to you whether you want to give it another go after this latest fix or not but i'm fairly configure it'll be fixed after it..

dipinhora added a commit to dipinhora/ponyc that referenced this pull request Oct 27, 2024
The implementation of actor heap chunk recycling from ponylang#4531 had two
bugs. First, the large heap re-use logic (which was temporarily
disabled in ponylang#4534) had a bug related to how it updated the large
chunk recyclable list pointer in the heap. Second, the memory
clearing logic in the `ponyint_heap_endgc` function was clearing
more of the heap than it should have been resulting in a memory
leak for both small and large chunk recyclable chunks.

This commit re-enabled large chunk recycling (undoing ponylang#4534) and
fixes both bugs so that both large chunk and small chunk recycling
work as expected without memory leaks.
@dipinhora
Copy link
Contributor Author

@SeanTAllen PR #4535 has been opened to fix both of the bugs mentioned.

@SeanTAllen
Copy link
Member

@dipinhora ill merge it once it passes CI

SeanTAllen pushed a commit that referenced this pull request Oct 27, 2024
The implementation of actor heap chunk recycling from #4531 had two
bugs. First, the large heap re-use logic (which was temporarily
disabled in #4534) had a bug related to how it updated the large
chunk recyclable list pointer in the heap. Second, the memory
clearing logic in the `ponyint_heap_endgc` function was clearing
more of the heap than it should have been resulting in a memory
leak for both small and large chunk recyclable chunks.

This commit re-enabled large chunk recycling (undoing #4534) and
fixes both bugs so that both large chunk and small chunk recycling
work as expected without memory leaks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss during sync Should be discussed during an upcoming sync
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants