[draft] erts: kill spawned child processes on VM exit #9453
base: master
CT Test Results: 3 files, 141 suites, 49m 55s ⏱️
Results for commit dba896f. ♻️ This comment has been updated with latest results.
To speed up review, make sure that you have read Contributing to Erlang/OTP and that all checks pass. See the TESTING and DEVELOPMENT HowTo guides for details about how to run tests locally.
// Erlang/OTP Github Action Bot
 }

 static Eterm get_port_id(pid_t os_pid)
 {
     ErtsSysExitStatus est, *es;
     Eterm port_id;
     est.os_pid = os_pid;
-    es = hash_remove(forker_hash, &est);
+    es = hash_get(forker_hash, &est);
     if (!es) return THE_NON_VALUE;
     port_id = es->port_id;
This seems to preserve the original behavior of only sending exit_status back to callers which have requested it, since port_id is set conditionally by the caller and is still used in the guard around sending—not simply whether the os_pid exists in the hash table.
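To make the point above concrete, here is a minimal sketch of a lookup that leaves the entry in place and still gates the exit_status message on the stored port id. It is not the erts code: the fixed-size table, `ChildEntry`, `track_child`, `find_child`, `port_for_exit_status` and the `NO_PORT` sentinel (standing in for `THE_NON_VALUE`) are all hypothetical names chosen for illustration.

```c
#include <stddef.h>
#include <sys/types.h>

#define NO_PORT 0u        /* hypothetical stand-in for THE_NON_VALUE */
#define MAX_CHILDREN 16   /* hypothetical fixed capacity, for brevity */

typedef struct {
    pid_t os_pid;
    unsigned port_id;     /* NO_PORT when the caller did not ask for exit_status */
    int in_use;
} ChildEntry;

static ChildEntry children[MAX_CHILDREN];

/* Record every spawned child, whether or not a port wants its exit status. */
static int track_child(pid_t os_pid, unsigned port_id)
{
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (!children[i].in_use) {
            children[i].os_pid = os_pid;
            children[i].port_id = port_id;
            children[i].in_use = 1;
            return 0;
        }
    }
    return -1; /* table full */
}

/* Look up a child without removing it, so it stays available for cleanup. */
static ChildEntry *find_child(pid_t os_pid)
{
    for (size_t i = 0; i < MAX_CHILDREN; i++) {
        if (children[i].in_use && children[i].os_pid == os_pid)
            return &children[i];
    }
    return NULL;
}

/* The guard: only a stored, valid port id leads to an exit_status message. */
static unsigned port_for_exit_status(pid_t os_pid)
{
    ChildEntry *e = find_child(os_pid);
    return e ? e->port_id : NO_PORT;
}
```

In this sketch a child tracked with `NO_PORT` is still present in the table, yet `port_for_exit_status` returns the sentinel for it, so presence in the table alone never triggers a message.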
9f87bc1 to dba896f
Hello! I think that we can move forward with this. There is no need to have an option to disable it for now (unless our existing tests show that it is needed...), but there need to be test cases to test that it works as expected on both Unix and Windows. I do wonder however if we should send some other signal than […]
Good point, TIL that SIGKILL is untrappable. Looking at […] I experimented a bit locally to see if […]
Keep the mapping of all living child processes so that we'll be able to iterate over them to clean up, rather than only storing the children which have an associated port.
If the uds_fd connection to the parent BEAM is broken or closed, react by killing all children and any descendants in the same process group.

A concise demonstration of the problem being solved is to run this command with and without the patch, then kill the BEAM. Without the patch, the "sleep" process will continue:

    erl -noshell -eval 'os:cmd("sleep 60")'

To intentionally start a child process which can outlive BEAM termination, give it a new process group, for example by using `setsid`:

    erl -noshell -eval 'os:cmd("setsid sleep 60")'
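The mechanism described in this commit message can be sketched in isolation as follows. This is not the patch itself, only an illustration under assumptions: `peer_closed`, `spawn_in_own_group` and `kill_group` are hypothetical helpers, the child is given its own process group here purely to keep the demo contained, and SIGKILL is used only because it makes the outcome deterministic.

```c
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 once the other end of fd is closed or broken, 0 if data arrived.
 * On a Unix domain socket, read() returning 0 means the peer has gone away. */
static int peer_closed(int fd)
{
    char byte;
    ssize_t n;

    do {
        n = read(fd, &byte, 1);
    } while (n < 0 && errno == EINTR);

    return n <= 0;
}

/* Fork a child that leads its own process group and itself forks a
 * grandchild; both just wait for signals. Returns the child's pid,
 * which is also the process group id. */
static pid_t spawn_in_own_group(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        setpgid(0, 0);              /* child: become group leader */
        if (fork() == 0) {
            for (;;) pause();       /* grandchild: inherits the group */
        }
        for (;;) pause();
    }
    if (pid > 0)
        setpgid(pid, pid);          /* parent: set it too, avoiding a race */
    return pid;
}

/* A negative pid makes kill() signal every process in that group. */
static int kill_group(pid_t pgid)
{
    return kill(-pgid, SIGKILL);
}
```

A process that has called `setsid` has left the group, so `kill_group` does not reach it, which is consistent with the `setsid sleep 60` escape hatch described above.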
FIXME: Not working yet: the grandchild must never get TERM?
FIXME: Don't sleep for fixed intervals, use messages for timing.
215fc42 to ce88ab3
This is a very rough proof-of-concept for discussion, which ensures all children spawned with open_port are terminated along with the BEAM.
Will be discussed in https://erlangforums.com/t/open-port-and-zombie-processes