My application hangs / gets stuck randomly and I can't figure out why. Is there a way to trace each tokio worker and figure out why they're stuck? I'm not sure of the source of the potential deadlock; I put logging in my own logic and didn't find anything.
I have the following symptoms:
After restarting the web server, anywhere from a few seconds to 24 hours later, some workers hang: I can still reach my service from some devices but not from others, or I can't reach it from any device at all.
This happens very often at certain times and rarely at others. If I'm lucky, it keeps running without issues for more than 24 hours; if not, it gets stuck immediately after restarting.
I suspect some visitors, probably a search engine or a malicious bot, are getting the tokio workers stuck one by one (I only have 3-4 workers), and those visitors are only active at certain times of day. This is my current theory; please let me know if you've seen similar things or have any suggestions. This issue existed before we upgraded to hyper 1.
I'm closing this discussion because I don't think it's a deadlock. It's likely some kind of slow HTTP attack. It seems hyper is not going to implement a timeout on HTTP body reads (they think it's more appropriate for a framework to do it: hyperium/hyper#2457), and I don't know whether salvo should implement this either, because in a normal setup nginx is used to defend against these attacks. So I'm going to set up nginx. Too bad I can't simply run my own standalone app built with salvo.