[mfserv] Conflict between process termination by circus and exit handler functions coded in our plugin #636
Comments
There is a way to reproduce:
in the plugin's config.ini, set:
for circus to restart the plugin every 30 to 60 seconds. Edit the
Check the log:
=> we notice that the handle_sigterm function cannot finish because a SIGKILL is received... Do you agree with this way to reproduce, @corentin-bl?
If we look at the above log and the code of https://github.com/metwork-framework/mfserv/blob/master/adm/signal_wrapper.py
moreover we notice the 3 seconds of delay in the log, so the timeout_after_signal delay (3s) is responsible for the SIGKILL during the SIGTERM handling. If we look in the So I set
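The SIGTERM-then-SIGKILL sequence discussed here can be sketched in a few lines. This is not the code of signal_wrapper.py; it is a minimal illustration, assuming a POSIX system, in which stop_with_timeout, SLOW_CHILD and run_demo are hypothetical names. The timeout_after_signal parameter only mirrors the name of the setting mentioned above.

```python
import signal
import subprocess
import sys


def stop_with_timeout(proc: subprocess.Popen, timeout_after_signal: float) -> str:
    """Send SIGTERM, wait for the timeout, then fall back to SIGKILL."""
    proc.terminate()  # SIGTERM on POSIX
    try:
        proc.wait(timeout=timeout_after_signal)
        return "terminated"
    except subprocess.TimeoutExpired:
        proc.kill()  # SIGKILL on POSIX: it cannot be caught by the child
        proc.wait()
        return "killed"


# Hypothetical child whose SIGTERM handler is slower than the timeout.
SLOW_CHILD = """
import signal, sys, time

def handle_sigterm(signum, frame):
    time.sleep(30)  # simulated slow cleanup
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_sigterm)
print("ready", flush=True)
time.sleep(60)
"""


def run_demo(timeout_after_signal: float = 0.5) -> tuple:
    proc = subprocess.Popen(
        [sys.executable, "-c", SLOW_CHILD],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()  # wait until the child has installed its handler
    outcome = stop_with_timeout(proc, timeout_after_signal)
    proc.stdout.close()
    return outcome, proc.returncode
```

With a handler that outlasts the timeout, run_demo() should report that the child was killed rather than terminated, which is the situation a too-short timeout would produce. It does not explain a handler that stops after a fraction of a millisecond.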
Hi Matthieu, thanks for looking at this so quickly! I will try changing the configuration parameter to see if it does anything. However, as mentioned in the initial post, it seems that the part of our function that was executed lasted only 0.4ms, far less than 3s, so I don't know if increasing the timeout will help... I'll test it today and let you know!
Unfortunately, and as expected, increasing
Regarding the way to reproduce, we did not use Flask to build our plugin. We have a script that defines the
Hello @matthieumarrast, I have made some progress on the debugging and I have found two things:
And a final question: now that I am able to catch the
Hi Matthieu, here are the two examples: mfdata:
mfserv:
Let me know if this works as is or if you need further info (e.g. from the config file). While preparing the examples, I noticed that in the case of mfserv, errors happening in the
Hi @matthieumarrast, a quick follow-up regarding my message from Oct. 18th. On our side, I found the cause of the problem: I made a stupid typo which prevented the correct variable from being retrieved, hence the
https://docs.python.org/3/library/signal.html "If the handler raises an exception, it will be raised “out of thin air” in the main thread. See the note below for a discussion." Interesting article to read here:
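The behaviour quoted from the Python documentation means that a bug inside an exit handler can cut the handler short without any obvious trace. As a minimal sketch (not the plugin's actual code), a handler can be wrapped so that any exception is recorded before the process goes away. The names logged_handler, errors and connections, and the misspelled key, are all hypothetical and only stand in for the kind of typo described above.

```python
import functools
import logging
import signal
import traceback

logger = logging.getLogger("exit_handler")
errors = []  # hypothetical in-memory record, handy for inspection


def logged_handler(func):
    """Wrap a signal handler so that exceptions are logged, not lost."""
    @functools.wraps(func)
    def wrapper(signum, frame):
        try:
            func(signum, frame)
        except Exception:
            name = signal.Signals(signum).name
            errors.append(traceback.format_exc())
            logger.exception("exit handler failed on %s", name)
    return wrapper


# Hypothetical registry of resources to release on exit.
connections = {"db": object()}


@logged_handler
def handle_sigterm(signum, frame):
    conn = connections["bd"]  # deliberate typo: raises KeyError
    conn.close()              # never reached


# Registration would look like this (main thread only):
# signal.signal(signal.SIGTERM, handle_sigterm)
```

Without the wrapper, the KeyError would propagate into whatever the main thread was doing when the signal arrived, so the handler would appear to stop partway through.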
We have an mfserv plugin that responds to HTTP requests using data fetched from a database, and we want to control the closing of database connections.
Normally, this plugin is configured so that circus spawns 8 processes, lasting 1 hour each. The process hierarchy is as follows: circus -> signal_wrapper.py -> bjoern_wrapper.py. When max_age is reached, circus sends a signal to signal_wrapper.py, which forwards it to bjoern_wrapper.py, and eventually to our plugin. We understand that SIGTERM is sent first, and then, after some timeout (3s according to the arguments given to signal_wrapper.py), SIGKILL is sent.
We have implemented a function in our plugin that is triggered upon signal reception. It simply retrieves the database connection and closes it. In practice, we see that SIGTERM is correctly caught and that the function starts. However, we can tell from the log that the function is never fully executed, and SIGKILL is eventually sent after 3s.
From the log messages, it is clear that the function did not exceed the timeout, as the part of the function that was executed lasted ~0.4ms, so it is not clear why it does not run until the end.
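For readers who want a concrete picture of such an exit handler, here is a minimal sketch using only the standard library. It is not our plugin's code: FakeConnection, make_exit_handler and install are hypothetical names, and the connection object is a stand-in for a real database connection. Logging a message at the start and at the end of the handler is what makes it possible to tell from the log whether the handler ran to completion.

```python
import signal
import time


class FakeConnection:
    """Hypothetical stand-in for a real database connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


def make_exit_handler(connection, log):
    """Build a handler that closes the connection and records timing."""
    def handle_sigterm(signum, frame):
        start = time.monotonic()
        log.append("handler started")
        connection.close()
        elapsed_ms = (time.monotonic() - start) * 1000
        log.append(f"handler finished in {elapsed_ms:.3f} ms")
    return handle_sigterm


def install(connection, log):
    """Bind the handler to SIGTERM; must be called from the main thread."""
    handler = make_exit_handler(connection, log)
    signal.signal(signal.SIGTERM, handler)
    return handler
```

If only the first message shows up in the log, the handler was interrupted or raised an exception before reaching the end.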
@mrechte and I tried to investigate the propagation of signals through processes, but it seems that everything works normally. @matthieumarrast and I tried to dive deeper into the code of the socket_down function in signal_wrapper; an idea would be to keep the socket busy during the execution of our exit handler, but this remains to be tested, and this would still look like a patch-up job. If anybody has a better understanding of what happens when circus terminates processes, it would be much appreciated.
Steps to reproduce (mfserv 2.2):
signal from the standard library to bind it to signal catch; make the function long enough (e.g. using time.sleep()) and print messages in the log file,