@quvide commented Jul 3, 2025

Works around a potential deadlock:

T1 = Thread 1 (main thread)
T2 = Thread 2 (background thread)
M = MessageManager mutex
L = Lua mutex

  1. MessageManager::Broadcast is iterating over its subscribers on T1 while holding M. A subscriber might try to lock L.
  2. A network request completes on T2, which locks L before M is released. The Lua code then calls MessageManager::Broadcast, which waits for M to be released.

We are now in a deadlock: T1 is waiting for L while holding M, and T2 is waiting for M while holding L.
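
For illustration, the lock-order inversion described above can be reproduced in isolation. This is only a sketch: `m` and `l` are plain `std::mutex` objects standing in for M and L, and `InversionResult` and `DemonstrateLockInversion` are hypothetical names, not project code. It uses `try_lock` so that the program reports the conflict instead of actually hanging.

```cpp
#include <cassert>
#include <future>
#include <mutex>
#include <thread>

// Hypothetical result type: whether each thread managed to take its second lock.
struct InversionResult {
    bool t1_got_l;
    bool t2_got_m;
};

// Hypothetical helper: T1 takes M then wants L, T2 takes L then wants M.
InversionResult DemonstrateLockInversion() {
    std::mutex m;  // stands in for the MessageManager mutex (M)
    std::mutex l;  // stands in for the Lua mutex (L)

    std::promise<void> m_held, l_held, t1_tried, t2_tried;
    std::future<void> m_held_f = m_held.get_future();
    std::future<void> l_held_f = l_held.get_future();
    std::future<void> t1_tried_f = t1_tried.get_future();
    std::future<void> t2_tried_f = t2_tried.get_future();

    InversionResult result{true, true};

    std::thread t1([&] {
        std::lock_guard<std::mutex> hold_m(m);  // Broadcast holds M
        m_held.set_value();
        l_held_f.wait();                        // T2 now holds L
        result.t1_got_l = l.try_lock();         // a subscriber wants L
        if (result.t1_got_l) l.unlock();
        t1_tried.set_value();
        t2_tried_f.wait();                      // keep M until T2 has tried
    });

    std::thread t2([&] {
        std::lock_guard<std::mutex> hold_l(l);  // the Lua callback holds L
        l_held.set_value();
        m_held_f.wait();                        // T1 now holds M
        result.t2_got_m = m.try_lock();         // Lua wants to Broadcast
        if (result.t2_got_m) m.unlock();
        t2_tried.set_value();
        t1_tried_f.wait();                      // keep L until T1 has tried
    });

    t1.join();
    t2.join();

    // Each thread's second mutex was held by the other thread for the whole
    // attempt, so neither try_lock can have succeeded.
    assert(!result.t1_got_l && !result.t2_got_m);
    return result;
}
```

With blocking `lock()` calls in place of `try_lock()`, both threads would wait forever, which is the situation described above.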

This commit changes the Lua API to never run MessageManager::Broadcast from a non-main thread. Instead, a Message is allocated and stored in a queue, which the main thread checks in the main GameLoop.

Access to the queue is guarded with a separate mutex. No other mutexes are obtained while this new mutex is locked.
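
A minimal sketch of such a queue follows, to make the locking rule concrete. It is not the code from this commit: `Message` here is a simplified stand-in, `PendingBroadcastQueue`, `Push`, and `Drain` are hypothetical names, and the sketch assumes the queue is constructed on the main thread.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for the project's Message type.
struct Message {
    std::string name;
};

// Hypothetical queue; the class and method names are not taken from the commit.
class PendingBroadcastQueue {
public:
    // Assumption: the queue is constructed on the main thread.
    PendingBroadcastQueue() : main_thread_(std::this_thread::get_id()) {}

    // May be called from any thread. Only the queue mutex is held,
    // and only for the duration of the push.
    void Push(Message msg) {
        std::lock_guard<std::mutex> lock(queue_mutex_);
        pending_.push_back(std::move(msg));
    }

    // Meant to be called once per game loop iteration on the main thread.
    // Pending messages are swapped out under the queue mutex, and the
    // broadcast callback runs only after that mutex has been released,
    // so no other mutex is acquired while the queue mutex is locked.
    template <typename BroadcastFn>
    std::size_t Drain(BroadcastFn&& broadcast) {
        assert(std::this_thread::get_id() == main_thread_);
        std::vector<Message> batch;
        {
            std::lock_guard<std::mutex> lock(queue_mutex_);
            batch.swap(pending_);
        }
        for (const Message& msg : batch) {
            broadcast(msg);
        }
        return batch.size();
    }

private:
    const std::thread::id main_thread_;
    std::mutex queue_mutex_;
    std::vector<Message> pending_;
};
```

In this sketch, the game loop would call `Drain` with a callback that forwards each message to MessageManager::Broadcast. Because the callback runs outside the queue mutex, a subscriber that queues another message during the drain does not block; that message is picked up on the next iteration.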

With this change, a non-main thread that holds L should no longer end up waiting on M when it attempts to broadcast.

(Potentially, MessageManager::Broadcast could lock M and L on the main thread and run Lua code that attempts to Broadcast again. That nested call would try to obtain M, which is already locked. I did not investigate whether there are safeguards preventing this.)

@quvide force-pushed the messageman-main-broadcast branch from 3f2c8c2 to f2a51bb on July 3, 2025 15:29