Replies: 1 comment

Moving to discussions.
Description:
I'm running a heavy queue system that processes about 300k jobs and inserts/updates about 50 million rows per day. It runs under docker-compose, and I scale the workers with the docker-compose scale command.
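For example, to run more crawler workers (the service name here is a placeholder):

```sh
# Scale the crawler service to 6 worker containers
docker-compose scale crawler=6
```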
Sometimes a job gets stuck and is not released even after the timeout has been exceeded. I'm running 6 different queues: the first is a crawler and the others are updaters (they write the crawled data into the database). When I run multiple crawler processes, the updaters can get stuck randomly, at any time and on any job.
This is an example updater:
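(Stripped down; the class, property, and method names below are placeholders, not the real ones.)

```php
<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;
use Illuminate\Contracts\Queue\ShouldQueue;
use Illuminate\Foundation\Bus\Dispatchable;
use Illuminate\Queue\InteractsWithQueue;
use Illuminate\Queue\SerializesModels;

class UpdateCrawledItems implements ShouldQueue
{
    use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    // Per-job timeout, matching the --timeout=60 the worker is started with.
    public $timeout = 60;

    /** @var array Rows produced by the crawler job. */
    protected $rows;

    public function __construct(array $rows)
    {
        $this->rows = $rows;
    }

    public function handle()
    {
        // Each step just fires raw queries at MySQL (see the snippet below).
        $this->upsertItems($this->rows);
        $this->refreshStatistics();
    }

    protected function upsertItems(array $rows)
    {
        // Raw INSERT ... ON DUPLICATE KEY UPDATE per row.
    }

    protected function refreshStatistics()
    {
        // Raw UPDATE against an aggregate table.
    }
}
```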
All of the functions in this job just send raw queries to the MySQL server, such as:
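(Table and column names below are placeholders, not the real ones.)

```php
use Illuminate\Support\Facades\DB;

foreach ($rows as $row) {
    // Raw upsert sent straight to MySQL; no Eloquent involved.
    DB::statement(
        'INSERT INTO items (external_id, title, price, updated_at)
         VALUES (?, ?, ?, NOW())
         ON DUPLICATE KEY UPDATE
             title = VALUES(title), price = VALUES(price), updated_at = NOW()',
        [$row['external_id'], $row['title'], $row['price']]
    );
}
```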
The workers are started up by docker-compose:
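(The image and service names are placeholders; the relevant part is the queue:work command with --timeout=60.)

```yaml
version: "3"

services:
  crawler:
    image: my-app:latest   # placeholder image
    command: php artisan queue:work --queue=crawler --timeout=60 --tries=3

  updater-items:
    image: my-app:latest
    command: php artisan queue:work --queue=update-items --timeout=60 --tries=3

  # ...one service per updater queue, all started with --timeout=60...
```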
As you can see, the timeout is set to 60s, but take a look at this picture: the stuck job took 7949s to complete, far beyond the timeout. This job normally takes only about 10~20s to complete; I have no idea what is causing this.
Currently, I'm running 6 workers for the crawler and one worker per updater. If I scale the crawler to more than 6 workers, the updaters start to get stuck, RANDOMLY and on ANY JOB.
Steps To Reproduce:
Because the jobs get stuck randomly, I have no idea how to reproduce this bug.
Note: the updaters can get stuck at any time and on any job. A stuck job stays stuck for about 2 hours, no more, no less. While one job is stuck, the rest keep running normally without any errors.
Thank you for reading my long post. Please help!