WorkerPool errors being suppressed #884

Open
JackUrb opened this issue Aug 4, 2022 · 2 comments

Comments

@JackUrb
Contributor

JackUrb commented Aug 4, 2022

Problem description

Some WorkerPool methods (and potentially code in other locations) can fail without printing anything to the terminal when they are called by the ClientIOHandler. For instance, adding assert False to WorkerPool.register_agent causes tasks to stop being assigned, yet nothing shows up in the logs or terminal. This makes debugging very difficult, and it means we may have been silencing other errors.

Current theory

Best guess is that the LoopWrapper.execute_coro method does not retrieve the exception details when the coroutines it schedules finish.
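To illustrate the theory, here is a minimal sketch of how asyncio can hide a failure in a scheduled coroutine. It does not use Mephisto code: `register_agent` is a hypothetical stand-in for a failing WorkerPool method, and scheduling with `create_task` is an assumption about what `execute_coro` does internally. The exception is stored on the task and stays there until something asks for it; asyncio only logs "Task exception was never retrieved" once the task object is garbage-collected.

```python
import asyncio


async def register_agent():
    # Hypothetical stand-in for a WorkerPool method that fails.
    raise AssertionError("simulated failure")


async def main():
    # Fire-and-forget scheduling: nothing awaits the task.
    task = asyncio.get_running_loop().create_task(register_agent())
    # Give the task time to run and fail.
    await asyncio.sleep(0.01)
    # The task is now finished with an exception, but nothing was printed.
    return task


task = asyncio.run(main())
# The failure only becomes visible when the exception is explicitly retrieved.
error = task.exception()
```

Here `error` holds the `AssertionError`, but only because the caller kept a reference to the task and inspected it.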

@lidiyam

lidiyam commented Jan 11, 2023

Likely related to the issue above. I'm seeing the following when running a task on mturk:

[2023-01-11 13:53:37,345][asyncio][ERROR] - Task exception was never retrieved
future: <Task finished name='Task-1965' coro=<WorkerPool.register_worker() done, defined at /export/home/crs-salesbot/env/lib/python3.9/site-packages/mephisto/operations/worker_pool.py:143> exception=KeyError('f847f371-bc8c-4e23-b392-74d27c77705e')>
Traceback (most recent call last):
  File "/export/home/crs-salesbot/env/lib/python3.9/site-packages/mephisto/operations/worker_pool.py", line 185, in register_worker
    await self.register_agent(crowd_data, worker, request_id)
  File "/export/home/crs-salesbot/env/lib/python3.9/site-packages/mephisto/operations/worker_pool.py", line 476, in register_agent
    live_run.client_io.enqueue_agent_details(
  File "/export/home/crs-salesbot/env/lib/python3.9/site-packages/mephisto/operations/client_io_handler.py", line 368, in enqueue_agent_details
    subject_id=self.request_id_to_channel_id[request_id],
KeyError: 'f847f371-bc8c-4e23-b392-74d27c77705e'

Nothing else shows up in the terminal, so I'm wondering if you have any pointers on how to debug this.

@JackUrb
Contributor Author

JackUrb commented Jan 11, 2023

Hi @lidiyam - yes, this is exactly the issue. Often these errors are no-ops, but sometimes they point to real problems that we should be surfacing. Because the errors are suppressed rather than logged with details, we need to update LoopWrapper's execute_coro method to check each coroutine's exception status once it finishes.

Your particular issue is a no-op so long as it's not happening for every task (preventing you from getting any data at all).
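One possible way to check the exception status is a done callback on each scheduled task. The sketch below is not the actual LoopWrapper code: the standalone `execute_coro` function, the `log_task_exception` helper, and the logger name are all hypothetical, and it assumes coroutines are scheduled with `loop.create_task`.

```python
import asyncio
import logging

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("loop_wrapper_sketch")


def log_task_exception(task: asyncio.Task) -> None:
    """Log any exception raised by a finished task."""
    if task.cancelled():
        return  # Cancellation is not treated as an error here.
    exc = task.exception()  # Retrieving it also marks it as handled.
    if exc is not None:
        logger.error("Coroutine %s failed", task.get_name(), exc_info=exc)


def execute_coro(loop: asyncio.AbstractEventLoop, coro) -> asyncio.Task:
    """Schedule coro on loop and surface its failure as soon as it finishes."""
    task = loop.create_task(coro)
    task.add_done_callback(log_task_exception)
    return task


async def failing():
    # Hypothetical failure, similar in shape to the KeyError reported above.
    raise KeyError("missing-request-id")


async def demo():
    task = execute_coro(asyncio.get_running_loop(), failing())
    await asyncio.sleep(0.01)  # Let the task and its callback run.
    return task


if __name__ == "__main__":
    asyncio.run(demo())
```

With this approach the traceback is logged when the coroutine finishes, rather than whenever the task happens to be garbage-collected.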
