My application hangs / gets stuck randomly and I can't figure out why. Is there a way to trace each tokio worker and figure out why they're stuck? I'm not sure of the source of the potential deadlock; I put logging in my own logic and didn't find anything.
I have the following symptoms:
After restarting the web server, anywhere from a few seconds to 24 hours later, some workers hang: I can still reach my service from some devices but not from others, or I can't reach it from any device at all.
This happens very often at certain times and rarely at others. If I'm lucky, it keeps running without issues for more than 24 hours; if not, it gets stuck immediately after restarting.
I suspect some visitors, probably a search engine or a malicious bot, are getting the tokio workers stuck one by one (I only have 3-4 workers), and those visitors are only active at certain times of day. This is my current theory; please let me know if you've seen similar things or have any suggestions. This issue existed before we upgraded to hyper 1.
I'm closing this discussion because I don't think it's a deadlock. It's likely some kind of slow HTTP attack. It seems hyper is not going to implement a timeout on HTTP body reads (they think it's more appropriate for a framework to do it: hyperium/hyper#2457), and I don't know whether salvo should implement this either, because in a normal setup nginx is used to defend against these attacks. So I'm going to set up nginx. Too bad I can't simply run my own standalone app built with salvo.