diff --git a/ChangeLog.md b/ChangeLog.md index 2e1d037f5970f..0e16803fd6a68 100644 --- a/ChangeLog.md +++ b/ChangeLog.md @@ -20,6 +20,8 @@ See docs/process.md for more on how version tagging works. 5.0.6 (in development) ---------------------- +- The emscripten_futux_wait API is not document to explicitly allow spurious + wakeups. (#26659) - The minimum version of node supported by the generated code was bumped from v12.22.0 to v18.3.0. (#26604) - The DETERMINISIC settings was marked as deprecated (#26653) diff --git a/docs/design/01-precise-futex-wakeups.md b/docs/design/01-precise-futex-wakeups.md index 70fbf22d6a2a4..f379f25789388 100644 --- a/docs/design/01-precise-futex-wakeups.md +++ b/docs/design/01-precise-futex-wakeups.md @@ -1,18 +1,18 @@ # Design Doc: Precise Futex Wakeups -- **Status**: Draft +- **Status**: Completed - **Bug**: https://github.com/emscripten-core/emscripten/issues/26633 ## Context -Currently, `emscripten_futex_wait` (in -`system/lib/pthread/emscripten_futex_wait.c`) relies on a periodic wakeup loop -for pthreads and the main runtime thread. This is done for two primary reasons: +Historically, `emscripten_futex_wait` (in +`system/lib/pthread/emscripten_futex_wait.c`) relied on a periodic wakeup loop +for pthreads and the main runtime thread. This was done for two primary reasons: -1. **Thread Cancellation**: To check if the calling thread has been cancelled while it is blocked. +1. **Thread Cancellation**: To check if the calling thread had been cancelled while it was blocked. 2. **Main Runtime Thread Events**: To allow the main runtime thread (even when not the main browser thread) to process its mailbox/event queue. -The current implementation uses a 1ms wakeup interval for the main runtime -thread and a 100ms interval for cancellable pthreads. This leads to unnecessary +The old implementation used a 1ms wakeup interval for the main runtime +thread and a 100ms interval for cancellable pthreads. This led to unnecessary CPU wakeups and increased latency for events. ## Goals @@ -23,22 +23,21 @@ CPU wakeups and increased latency for events. ## Non-Goals - **Main Browser Thread**: Changes to the busy-wait loop in `futex_wait_main_browser_thread` are out of scope. -- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) will remain un-interruptible. +- **Direct Atomics Usage**: Threads that call `atomic.wait` directly (bypassing `emscripten_futex_wait`) remain un-interruptible. - **Wasm Workers**: Wasm Workers do not have a `pthread` structure, so they are not covered by this design. -## Proposed Design +## Design The core idea is to allow "side-channel" wakeups (cancellation, mailbox events) to interrupt the `atomic.wait` call by having the waker call `atomic.wake` on the same address the waiter is currently blocked on. -As part of this design we will need to explicitly state that -`emscripten_futex_wait` now supports spurious wakeups. i.e. it may return `0` -(success) even if the underlying futex was not explicitly woken by the -application. +As part of this design, `emscripten_futex_wait` now explicitly supports spurious +wakeups. i.e. it may return `0` (success) even if the underlying futex was not +explicitly woken by the application. ### 1. `struct pthread` Extensions -We will add a single atomic `wait_addr` field to `struct pthread` (in +A single atomic `wait_addr` field was added to `struct pthread` (in `system/lib/libc/musl/src/internal/pthread_impl.h`). ```c @@ -57,82 +56,80 @@ _Atomic uintptr_t wait_addr; ``` ### 2. Waiter Logic (`emscripten_futex_wait`) -The waiter will follow this logic: +The waiter follows this logic: -1. **Notification Loop**: +1. **Publish Wait Address**: ```c uintptr_t expected_null = 0; - while (!atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) { + if (!atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) { // If the CAS failed, it means NOTIFY_BIT was set by another thread. - assert(expected_null == NOTIFY_BIT); - // Let the notifier know that we received the wakeup notification by - // resetting wait_addr. - self->wait_addr = 0; - handle_wakeup(); // Process mailbox or handle cancellation - // Reset expected_null because CAS updates it to the observed value on failure. - expected_null = 0; + assert(expected_null & NOTIFY_BIT); + // We don't wait at all; instead behave as if we spuriously woke up. + ret = ATOMICS_WAIT_OK; + goto done; } ``` 2. **Wait**: Call `ret = __builtin_wasm_memory_atomic_wait32(addr, val, timeout)`. -3. **Unpublish & Check**: +3. **Unpublish**: ```c - // Clear wait_addr and check if a notification arrived while we were sleeping. - if ((atomic_exchange(&self->wait_addr, 0) & NOTIFY_BIT) != 0) { - handle_wakeup(); - } + done: + self->wait_addr = 0; ``` -4. **Return**: Return the result of the wait. +4. **Handle side effects**: If the wake was due to cancellation or mailbox + events, these are handled after `emscripten_futex_wait` returns (or + internally via `pthread_testcancel` if cancellable). Note: We do **not** loop internally if `ret == ATOMICS_WAIT_OK`. Even if we suspect the wake was caused by a side-channel event, we must return to the user to avoid "swallowing" a simultaneous real application wake. -### 3. Waker Logic -When a thread needs to wake another thread for a side-channel event: +### 3. Waker Logic (`_emscripten_thread_notify`) +When a thread needs to wake another thread for a side-channel event (e.g. +enqueuing work or cancellation), it calls `_emscripten_thread_notify`: -1. **Enqueue Work**: Add the task to the target's mailbox or set the cancellation flag. -2. **Signal**: - ```c - uintptr_t addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT); - if (addr == 0 || (addr & NOTIFY_BIT) != 0) { - // Either the thread wasn't waiting (it will see NOTIFY_BIT later), - // or someone else is already in the process of notifying it. - return; - } - // We set the bit and are responsible for waking the target. - // The target is currently waiting on `addr`. - while (target->wait_addr == (addr | NOTIFY_BIT)) { - emscripten_futex_wake((void*)addr, INT_MAX); - sched_yield(); - } - ``` +```c +void _emscripten_thread_notify(pthread_t target) { + uintptr_t addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT); + if (addr == 0 || (addr & NOTIFY_BIT) != 0) { + // Either the thread wasn't waiting (it will see NOTIFY_BIT later), + // or someone else is already in the process of notifying it. + return; + } + // We set the bit and are responsible for waking the target. + // The target is currently waiting on `addr`. + while (target->wait_addr == (addr | NOTIFY_BIT)) { + emscripten_futex_wake((void*)addr, INT_MAX); + sched_yield(); + } +} +``` ### 4. Handling the Race Condition The protocol handles the "Lost Wakeup" race by having the waker loop until the waiter clears its `wait_addr`. If the waker sets the `NOTIFY_BIT` just before the waiter enters `atomic.wait`, the `atomic_wake` will be delivered once the waiter is asleep. If the waiter wakes up for any reason (timeout, real wake, or -side-channel wake), its `atomic_exchange` will satisfy the waker's loop -condition. +side-channel wake), its reset of `wait_addr` to `0` will satisfy the waker's +loop condition. ## Benefits - **Lower Power Consumption**: Threads can sleep indefinitely (or for the full duration of a user-requested timeout) without periodic wakeups. -- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next 1ms or 100ms tick. -- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` is removed. +- **Lower Latency**: Mailbox events and cancellation requests are processed immediately rather than waiting for the next tick. +- **Simpler Loop**: The complex logic for calculating remaining timeout slices in `emscripten_futex_wait` was removed. ## Alternatives Considered - **Signal-based wakeups**: Not currently feasible in Wasm as signals are not implemented in a way that can interrupt `atomic.wait`. - **A single global "wake-up" address per thread**: This would require the waiter to wait on *two* addresses simultaneously (the user's futex and its - own wakeup address), which `atomic.wait` does not support. The proposed + own wakeup address), which `atomic.wait` does not support. The implemented design works around this by having the waker use the *user's* futex address. ## Security/Safety Considerations -- **The `wait_addr` must be managed carefully** to ensure wakers don't +- **The `wait_addr` is managed carefully** to ensure wakers don't call `atomic.wake` on stale addresses. Clearing the address upon wake mitigates this. -- **The waker loop should have a reasonable fallback** (like a yield) to prevent a - busy-wait deadlock if the waiter is somehow prevented from waking up (though - `atomic.wait` is generally guaranteed to wake if `atomic.wake` is called). +- **The waker loop has a yield** to prevent a busy-wait deadlock if the waiter + is somehow prevented from waking up (though `atomic.wait` is generally + guaranteed to wake if `atomic.wake` is called). diff --git a/src/struct_info_generated.json b/src/struct_info_generated.json index 163a1cf69779c..c5c9d7e1ee7a2 100644 --- a/src/struct_info_generated.json +++ b/src/struct_info_generated.json @@ -1036,7 +1036,7 @@ "p_proto": 8 }, "pthread": { - "__size__": 124, + "__size__": 128, "profilerBlock": 104, "stack": 48, "stack_size": 52, diff --git a/src/struct_info_generated_wasm64.json b/src/struct_info_generated_wasm64.json index 8b857fef5dbcb..18a3290fe7728 100644 --- a/src/struct_info_generated_wasm64.json +++ b/src/struct_info_generated_wasm64.json @@ -1036,7 +1036,7 @@ "p_proto": 16 }, "pthread": { - "__size__": 216, + "__size__": 224, "profilerBlock": 184, "stack": 80, "stack_size": 88, diff --git a/system/include/emscripten/threading_primitives.h b/system/include/emscripten/threading_primitives.h index dac148f0c143e..aa6761a791a07 100644 --- a/system/include/emscripten/threading_primitives.h +++ b/system/include/emscripten/threading_primitives.h @@ -188,6 +188,10 @@ void emscripten_condvar_signal(emscripten_condvar_t * _Nonnull condvar, uint32_t // If the given memory address contains value val, puts the calling thread to // sleep waiting for that address to be notified. +// Note: Like the Linux futex syscall, this APi *does* allow spurious wakeups. +// This differs from the WebAssembly `atomic.wait` instruction itself which +// does *not* allow supurious wakeups and it means that most callers will want +// to wraps this some kind loop. // Returns -EINVAL if addr is null. int emscripten_futex_wait(volatile void/*uint32_t*/ * _Nonnull addr, uint32_t val, double maxWaitMilliseconds); diff --git a/system/lib/libc/musl/src/internal/pthread_impl.h b/system/lib/libc/musl/src/internal/pthread_impl.h index b2588344eb1ab..d335c4cd9119b 100644 --- a/system/lib/libc/musl/src/internal/pthread_impl.h +++ b/system/lib/libc/musl/src/internal/pthread_impl.h @@ -111,6 +111,16 @@ struct pthread { // postMessage path. Once this becomes true, it remains true so we never // fall back to postMessage unnecessarily. _Atomic int waiting_async; + // The address the thread is currently waiting on in emscripten_futex_wait. + // + // This field encodes the state using the following bitmask: + // - NULL: Not waiting, no pending notification. + // - NOTIFY_BIT (0x1): Not waiting, but a notification was sent. + // - addr: Waiting on `addr`, no pending notification. + // - addr | NOTIFY_BIT: Waiting on `addr`, notification sent. + // + // Since futex addresses must be 4-byte aligned, the low bit is safe to use. + _Atomic uintptr_t wait_addr; #endif #ifdef EMSCRIPTEN_DYNAMIC_LINKING // When dynamic linking is enabled, threads use this to facilitate the @@ -120,6 +130,10 @@ struct pthread { #endif }; +#ifdef __EMSCRIPTEN__ +#define NOTIFY_BIT (1 << 0) +#endif + enum { DT_EXITED, DT_EXITING, diff --git a/system/lib/libc/musl/src/signal/setitimer.c b/system/lib/libc/musl/src/signal/setitimer.c index 5e9b4683ee067..90f3b4a2166d0 100644 --- a/system/lib/libc/musl/src/signal/setitimer.c +++ b/system/lib/libc/musl/src/signal/setitimer.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -79,6 +80,18 @@ void _emscripten_check_timers(double now) } } } + +double _emscripten_next_timer() +{ + assert(emscripten_is_main_runtime_thread()); + double next_timer = INFINITY; + for (int which = 0; which < 3; which++) { + if (current_timeout_ms[which]) { + next_timer = fmin(current_timeout_ms[which], next_timer); + } + } + return next_timer - emscripten_get_now(); +} #endif int setitimer(int which, const struct itimerval *restrict new, struct itimerval *restrict old) diff --git a/system/lib/libc/musl/src/thread/pthread_cancel.c b/system/lib/libc/musl/src/thread/pthread_cancel.c index 3428cd5fab474..7bdba001ec6cc 100644 --- a/system/lib/libc/musl/src/thread/pthread_cancel.c +++ b/system/lib/libc/musl/src/thread/pthread_cancel.c @@ -108,5 +108,13 @@ int pthread_cancel(pthread_t t) pthread_exit(PTHREAD_CANCELED); return 0; } +#ifdef __EMSCRIPTEN__ + // Wake the target thread in case it in emscripten_futex_wait. Normally + // this is only done when the target is the main runtime thread and there + // is an event added to its system queue. + // However, all threads needs to be inturruped like this in the case they are + // cancelled. + _emscripten_thread_notify(t); +#endif return pthread_kill(t, SIGCANCEL); } diff --git a/system/lib/pthread/em_task_queue.c b/system/lib/pthread/em_task_queue.c index 0eeb8f0bf8579..d9c72f5384162 100644 --- a/system/lib/pthread/em_task_queue.c +++ b/system/lib/pthread/em_task_queue.c @@ -14,6 +14,7 @@ #include "em_task_queue.h" #include "proxying_notification_state.h" #include "thread_mailbox.h" +#include "threading_internal.h" #define EM_TASK_QUEUE_INITIAL_CAPACITY 128 @@ -166,6 +167,7 @@ static bool em_task_queue_grow(em_task_queue* queue) { } void em_task_queue_execute(em_task_queue* queue) { + DBG("em_task_queue_execute"); queue->processing = 1; pthread_mutex_lock(&queue->mutex); while (!em_task_queue_is_empty(queue)) { @@ -178,6 +180,7 @@ void em_task_queue_execute(em_task_queue* queue) { } pthread_mutex_unlock(&queue->mutex); queue->processing = 0; + DBG("done em_task_queue_execute"); } void em_task_queue_cancel(em_task_queue* queue) { @@ -219,6 +222,7 @@ static void receive_notification(void* arg) { notification_state expected = NOTIFICATION_RECEIVED; atomic_compare_exchange_strong( &tasks->notification, &expected, NOTIFICATION_NONE); + DBG("receive_notification done"); } static void cancel_notification(void* arg) { @@ -246,6 +250,7 @@ bool em_task_queue_send(em_task_queue* queue, task t) { notification_state previous = atomic_exchange(&queue->notification, NOTIFICATION_PENDING); if (previous == NOTIFICATION_PENDING) { + DBG("em_task_queue_send NOTIFICATION_PENDING already set"); emscripten_thread_mailbox_unref(queue->thread); return true; } diff --git a/system/lib/pthread/emscripten_futex_wait.c b/system/lib/pthread/emscripten_futex_wait.c index 0584edccbd57e..806eea7950379 100644 --- a/system/lib/pthread/emscripten_futex_wait.c +++ b/system/lib/pthread/emscripten_futex_wait.c @@ -9,8 +9,11 @@ #include "pthread_impl.h" #include "threading_internal.h" -#include #include +#include + +#include +#include #include #include #include @@ -123,6 +126,12 @@ static int futex_wait_main_browser_thread(volatile void* addr, return 0; } +static double dummy() { + return INFINITY; +} + +weak_alias(dummy, _emscripten_next_timer); + int emscripten_futex_wait(volatile void *addr, uint32_t val, double max_wait_ms) { if ((((intptr_t)addr)&3) != 0) { return -EINVAL; @@ -146,91 +155,63 @@ int emscripten_futex_wait(volatile void *addr, uint32_t val, double max_wait_ms) return ret; } - // -1 (or any negative number) means wait indefinitely. - int64_t max_wait_ns = ATOMICS_WAIT_DURATION_INFINITE; - if (max_wait_ms != INFINITY) { - max_wait_ns = (int64_t)(max_wait_ms * 1e6); - } + // Pass 0 here, which means we don't have access to the current time in this + // function. This tells _emscripten_yield to call emscripten_get_now if (and + // only if) it needs to know the time. + _emscripten_yield(0); + + DBG("emscripten_futex_wait ms=%f", max_wait_ms); -#ifdef __EMSCRIPTEN_PTHREADS__ - // When building with pthread support there are two conditions under which we - // need to limit the amount of time we spend in atomic.wait. - // 1. We are the main runtime thread. In this case we need to be able to - // process proxied events from workers. Note that this is not always - // the same as being the main browser thread. For example, when running - // under node or when launching an emscripten-built program in a Web - // Worker. In this case we limit our wait slices to 1ms intervals. - // 2. When the current thread has async cancellation enabled. In this case - // we limit the wait duration to 100ms intervals. - int64_t wakeup_interval = 0; bool is_runtime_thread = emscripten_is_main_runtime_thread(); if (is_runtime_thread) { - // If the current thread is the main runtime thread then only wait in 1ms slices. - wakeup_interval = 1 * 1000000; - } - else if (cancelable) { - // If the current thread is async cancellable then only wait in 100ms slices. - wakeup_interval = 100 * 1000000; + max_wait_ms = fmin(max_wait_ms, fmax(0, _emscripten_next_timer())); } - // When wakeup_interval is set, we use end_time to track the absolute - // time when the wait should end. - double end_time = 0; - if (wakeup_interval) { - if (max_wait_ms == INFINITY) { - max_wait_ns = wakeup_interval; - } else { - end_time = emscripten_get_now() + max_wait_ms; - max_wait_ns = MIN(max_wait_ns, wakeup_interval); - } + // -1 (or any negative number) means wait indefinitely. + int64_t max_wait_ns = ATOMICS_WAIT_DURATION_INFINITE; + if (max_wait_ms != INFINITY) { + max_wait_ns = (int64_t)(max_wait_ms * 1e6); } - - do { -#endif - // Pass 0 here, which means we don't have access to the current time in this - // function. This tells _emscripten_yield to call emscripten_get_now if (and - // only if) it needs to know the time. - _emscripten_yield(0); - +#ifdef __EMSCRIPTEN_PTHREADS__ #ifdef EMSCRIPTEN_DYNAMIC_LINKING - // After the main thread queues dlopen events, it checks if the target threads - // are sleeping. - // If `sleeping` is set then the main thread knows that event will be - // processed after the sleep (before any other user code). In this case the - // main thread does not wait for any kind of response form the thread. - // If `sleeping` is not set then we know we should wait for the thread process - // the queue, either from the call here directly after setting `sleeping` to - // 1, or from another callsite (e.g. the one in `emscripten_yield`). - if (!is_runtime_thread) { - self->sleeping = 1; - _emscripten_process_dlopen_queue(); - } + if (!is_runtime_thread) { + self->sleeping = 1; + _emscripten_process_dlopen_queue(); + } #endif + uintptr_t expected_null = 0; + if (atomic_compare_exchange_strong(&self->wait_addr, &expected_null, (uintptr_t)addr)) { + DBG("emscripten_futex_wait atomic.wait ns=%lld", max_wait_ns); ret = __builtin_wasm_memory_atomic_wait32((int*)addr, val, max_wait_ns); + } else { + DBG("emscripten_futex_wait skipping atomic.wait due to NOTIFY_BIT"); + // CAS failed, NOTIFY_BIT must have been set. In this case we don't + // actually wait at all. Instead we behaviour as if we spuriously woke up + // right away. + assert(expected_null & NOTIFY_BIT); + ret = ATOMICS_WAIT_OK; + } + + // Clear the wait_addr + DBG("emscripten_futex_wait done notify=%d cancelable=%d cancel=%d", !!(self->wait_addr & NOTIFY_BIT), cancelable, self->cancel); + self->wait_addr = 0; #ifdef EMSCRIPTEN_DYNAMIC_LINKING - if (!is_runtime_thread) { - self->sleeping = 0; - _emscripten_process_dlopen_queue(); - } + if (!is_runtime_thread) { + self->sleeping = 0; + _emscripten_process_dlopen_queue(); + } #endif -#ifdef __EMSCRIPTEN_PTHREADS__ - if (cancelable && ret == ATOMICS_WAIT_TIMED_OUT && self->cancel) { - __pthread_testcancel(); - // If __pthread_testcancel does return here it means that canceldisable - // must be set to PTHREAD_CANCEL_MASKED. In this case we emulate the - // behaviour of the futex syscall and return ECANCELLED here. - // See pthread_cond_timedwait.c for the only use of this flag. - emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_WAITFUTEX, EM_THREAD_STATUS_RUNNING); - return -ECANCELED; - } - if (wakeup_interval && max_wait_ms != INFINITY) { - double remainder_ms = end_time - emscripten_get_now(); - if (remainder_ms <= 0) { - break; - } - max_wait_ns = MIN((int64_t)(remainder_ms * 1e6), wakeup_interval); - } - } while (wakeup_interval && ret == ATOMICS_WAIT_TIMED_OUT); + if (cancelable && self->cancel) { + __pthread_testcancel(); + // If __pthread_testcancel does return here it means that canceldisable + // must be set to PTHREAD_CANCEL_MASKED. In this case we emulate the + // behaviour of the futex syscall and return ECANCELLED here. + // See pthread_cond_timedwait.c for the only use of this flag. + emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_WAITFUTEX, EM_THREAD_STATUS_RUNNING); + return -ECANCELED; + } +#else + ret = __builtin_wasm_memory_atomic_wait32((int*)addr, val, max_wait_ns); #endif emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_WAITFUTEX, EM_THREAD_STATUS_RUNNING); diff --git a/system/lib/pthread/emscripten_futex_wake.c b/system/lib/pthread/emscripten_futex_wake.c index 1e4e57474ffa7..2bff9699fa658 100644 --- a/system/lib/pthread/emscripten_futex_wake.c +++ b/system/lib/pthread/emscripten_futex_wake.c @@ -5,10 +5,13 @@ * found in the LICENSE file. */ +#include "atomic.h" +#include "pthread_impl.h" #include #include #include -#include +#include + #include // Stores the memory address that the main thread is waiting on, if any. If @@ -47,3 +50,23 @@ int emscripten_futex_wake(volatile void *addr, int count) { assert(ret >= 0); return ret + main_thread_woken; } + +void _emscripten_thread_notify(pthread_t target) { + DBG("_emscripten_thread_notify %p", target); + uintptr_t wait_addr = atomic_fetch_or(&target->wait_addr, NOTIFY_BIT); + if (wait_addr == 0 || (wait_addr & NOTIFY_BIT)) { + // Either the thread wasn't waiting (In this case it will see NOTIFY_BIT and + // return early once it enters its next `emscripten_futex_wait`), or someone + // else is already in the process of notifying it. + return; + } + + // We set the NOTIFY_BIT bit and are responsible for waking the target. + // The target is currently waiting on `wait_addr`. + while (target->wait_addr == (wait_addr | NOTIFY_BIT)) { + emscripten_futex_wake((void*)wait_addr, INT_MAX); + // TODO: Can we put some kind of yield instruction here? For example, + // it we ever support an `atomics.pause` Wasm instrction this would be a + // good place for it. + } +} diff --git a/system/lib/pthread/emscripten_yield.c b/system/lib/pthread/emscripten_yield.c index fa3e0398b70a2..578faa355ef50 100644 --- a/system/lib/pthread/emscripten_yield.c +++ b/system/lib/pthread/emscripten_yield.c @@ -11,6 +11,7 @@ static _Atomic pthread_t crashed_thread_id = NULL; void _emscripten_thread_crashed() { crashed_thread_id = pthread_self(); + _emscripten_thread_notify(emscripten_main_runtime_thread_id()); } static void dummy(double now) diff --git a/system/lib/pthread/library_pthread.c b/system/lib/pthread/library_pthread.c index 4f104d6549750..a89425874b113 100644 --- a/system/lib/pthread/library_pthread.c +++ b/system/lib/pthread/library_pthread.c @@ -79,18 +79,25 @@ int pthread_mutexattr_setprioceiling(pthread_mutexattr_t *attr, int prioceiling) void emscripten_thread_sleep(double msecs) { // We include emscripten_current_thread_process_queued_calls before and // after sleeping since that is how we recieve "async" signals. - // We include __pthread_testcancel here becuase clock_nanosleep is + // We include __pthread_testcancel here because clock_nanosleep is // a pthread cancelation point. emscripten_current_thread_process_queued_calls(); __pthread_testcancel(); - emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_RUNNING, - EM_THREAD_STATUS_SLEEPING); - uint32_t dummyZeroAddress = 0; - emscripten_futex_wait(&dummyZeroAddress, 0, msecs); - emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_SLEEPING, - EM_THREAD_STATUS_RUNNING); - emscripten_current_thread_process_queued_calls(); - __pthread_testcancel(); + if (msecs > 0) { + uint32_t dummyZeroAddress = 0; + double start = emscripten_get_now(); + double elapsed = 0; + while (elapsed < msecs) { + emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_RUNNING, + EM_THREAD_STATUS_SLEEPING); + emscripten_futex_wait(&dummyZeroAddress, 0, msecs - elapsed); + emscripten_conditional_set_current_thread_status(EM_THREAD_STATUS_SLEEPING, + EM_THREAD_STATUS_RUNNING); + emscripten_current_thread_process_queued_calls(); + __pthread_testcancel(); + elapsed = emscripten_get_now() - start; + } + } } static struct pthread __main_pthread; diff --git a/system/lib/pthread/proxying.c b/system/lib/pthread/proxying.c index 7625eba8ab3a6..6a2ee7f0c46e8 100644 --- a/system/lib/pthread/proxying.c +++ b/system/lib/pthread/proxying.c @@ -162,7 +162,15 @@ static bool do_proxy(em_proxying_queue* q, pthread_t target_thread, task t) { return false; } - return em_task_queue_send(tasks, t); + bool rtn = em_task_queue_send(tasks, t); + // Then proxying work to the main thread using the system queue we have + // a very special case in that we need the target thread to wake up from + // `emscripten_futex_wait` to process the queue. + if (rtn && is_system_queue && pthread_equal(target_thread, emscripten_main_runtime_thread_id())) { + DBG("waking main runtime thread using _emscripten_thread_notify"); + _emscripten_thread_notify(target_thread); + } + return rtn; } bool emscripten_proxy_async(em_proxying_queue* q, diff --git a/system/lib/pthread/thread_mailbox.c b/system/lib/pthread/thread_mailbox.c index 701218e3a137b..ba0eac547e041 100644 --- a/system/lib/pthread/thread_mailbox.c +++ b/system/lib/pthread/thread_mailbox.c @@ -77,6 +77,7 @@ void _emscripten_check_mailbox() { // For example, in PROXY_TO_PTHREAD the atexit functions are called via // a proxied call, and without this call to synchronize we would crash if // any atexit functions were registered from a side module. + DBG("_emscripten_check_mailbox"); assert(pthread_self()); em_task_queue* mailbox = pthread_self()->mailbox; mailbox->notification = NOTIFICATION_RECEIVED; diff --git a/system/lib/pthread/threading_internal.h b/system/lib/pthread/threading_internal.h index abc6cbff7f22c..edfd779ca3444 100644 --- a/system/lib/pthread/threading_internal.h +++ b/system/lib/pthread/threading_internal.h @@ -9,6 +9,16 @@ #include +// Defined THREADING_DEBUG here (or in the build system) to enabled verbose +// logging using emscripten_dbgf. +// #define THREADING_DEBUG + +#ifdef THREADING_DEBUG +#define DBG(format, ...) emscripten_dbgf(format, ##__VA_ARGS__) +#else +#define DBG(format, ...) +#endif + #define EM_THREAD_NAME_MAX 32 #define EM_THREAD_STATUS int @@ -104,3 +114,8 @@ void _emscripten_run_js_on_main_thread_done(void* ctx, void* arg, double result) // if called from the main browser thread, this function will return zero // since blocking is not allowed there). int _emscripten_thread_supports_atomics_wait(void); + +// Notify a thread that it has new work to do (cancellation or mailbox). +// This will interrupt the thread if it is currently blocked in +// emscripten_futex_wait. +void _emscripten_thread_notify(pthread_t thread); diff --git a/test/codesize/test_codesize_cxx_ctors1.json b/test/codesize/test_codesize_cxx_ctors1.json index 0b8bea69b50e7..91ad85c53860c 100644 --- a/test/codesize/test_codesize_cxx_ctors1.json +++ b/test/codesize/test_codesize_cxx_ctors1.json @@ -2,9 +2,9 @@ "a.out.js": 19194, "a.out.js.gz": 7969, "a.out.nodebug.wasm": 132635, - "a.out.nodebug.wasm.gz": 49922, + "a.out.nodebug.wasm.gz": 49924, "total": 151829, - "total_gz": 57891, + "total_gz": 57893, "sent": [ "__cxa_throw", "_abort_js", diff --git a/test/codesize/test_codesize_cxx_ctors2.json b/test/codesize/test_codesize_cxx_ctors2.json index 4603f919e7c55..ffa319647e766 100644 --- a/test/codesize/test_codesize_cxx_ctors2.json +++ b/test/codesize/test_codesize_cxx_ctors2.json @@ -1,10 +1,10 @@ { "a.out.js": 19171, "a.out.js.gz": 7957, - "a.out.nodebug.wasm": 132061, - "a.out.nodebug.wasm.gz": 49579, - "total": 151232, - "total_gz": 57536, + "a.out.nodebug.wasm": 132063, + "a.out.nodebug.wasm.gz": 49580, + "total": 151234, + "total_gz": 57537, "sent": [ "__cxa_throw", "_abort_js", diff --git a/test/codesize/test_codesize_cxx_except_wasm.json b/test/codesize/test_codesize_cxx_except_wasm.json index 66e3ff01b1367..4bc0805d82c50 100644 --- a/test/codesize/test_codesize_cxx_except_wasm.json +++ b/test/codesize/test_codesize_cxx_except_wasm.json @@ -2,9 +2,9 @@ "a.out.js": 19026, "a.out.js.gz": 7904, "a.out.nodebug.wasm": 147926, - "a.out.nodebug.wasm.gz": 55308, + "a.out.nodebug.wasm.gz": 55309, "total": 166952, - "total_gz": 63212, + "total_gz": 63213, "sent": [ "_abort_js", "_tzset_js", diff --git a/test/codesize/test_codesize_cxx_lto.json b/test/codesize/test_codesize_cxx_lto.json index 6c07b8cbdbf3f..4c2da355a7ebc 100644 --- a/test/codesize/test_codesize_cxx_lto.json +++ b/test/codesize/test_codesize_cxx_lto.json @@ -2,9 +2,9 @@ "a.out.js": 18563, "a.out.js.gz": 7666, "a.out.nodebug.wasm": 102164, - "a.out.nodebug.wasm.gz": 39553, + "a.out.nodebug.wasm.gz": 39554, "total": 120727, - "total_gz": 47219, + "total_gz": 47220, "sent": [ "a (emscripten_resize_heap)", "b (_setitimer_js)", diff --git a/test/codesize/test_codesize_cxx_noexcept.json b/test/codesize/test_codesize_cxx_noexcept.json index 59ac98658db69..2d84e07e951e9 100644 --- a/test/codesize/test_codesize_cxx_noexcept.json +++ b/test/codesize/test_codesize_cxx_noexcept.json @@ -2,9 +2,9 @@ "a.out.js": 19194, "a.out.js.gz": 7969, "a.out.nodebug.wasm": 134657, - "a.out.nodebug.wasm.gz": 50774, + "a.out.nodebug.wasm.gz": 50769, "total": 153851, - "total_gz": 58743, + "total_gz": 58738, "sent": [ "__cxa_throw", "_abort_js", diff --git a/test/codesize/test_codesize_cxx_wasmfs.json b/test/codesize/test_codesize_cxx_wasmfs.json index 577ddcf1fd9b6..60713bafd1b45 100644 --- a/test/codesize/test_codesize_cxx_wasmfs.json +++ b/test/codesize/test_codesize_cxx_wasmfs.json @@ -2,9 +2,9 @@ "a.out.js": 7023, "a.out.js.gz": 3310, "a.out.nodebug.wasm": 172710, - "a.out.nodebug.wasm.gz": 63317, + "a.out.nodebug.wasm.gz": 63307, "total": 179733, - "total_gz": 66627, + "total_gz": 66617, "sent": [ "__cxa_throw", "_abort_js", diff --git a/test/codesize/test_codesize_hello_dylink_all.json b/test/codesize/test_codesize_hello_dylink_all.json index 105e4113624d7..fd8921124b9cd 100644 --- a/test/codesize/test_codesize_hello_dylink_all.json +++ b/test/codesize/test_codesize_hello_dylink_all.json @@ -1,7 +1,7 @@ { "a.out.js": 244343, - "a.out.nodebug.wasm": 577473, - "total": 821816, + "a.out.nodebug.wasm": 577634, + "total": 821977, "sent": [ "IMG_Init", "IMG_Load", @@ -1958,6 +1958,7 @@ "_emscripten_find_dylib", "_emscripten_memcpy_bulkmem", "_emscripten_memset_bulkmem", + "_emscripten_next_timer", "_emscripten_run_callback_on_thread", "_emscripten_set_offscreencanvas_size_on_thread", "_emscripten_stack_alloc", @@ -3819,6 +3820,7 @@ "$_emscripten_find_dylib", "$_emscripten_memcpy_bulkmem", "$_emscripten_memset_bulkmem", + "$_emscripten_next_timer", "$_emscripten_run_callback_on_thread", "$_emscripten_set_offscreencanvas_size_on_thread", "$_emscripten_stack_alloc", diff --git a/test/codesize/test_codesize_minimal_pthreads.json b/test/codesize/test_codesize_minimal_pthreads.json index 050cb6cd65f24..2b9278e555869 100644 --- a/test/codesize/test_codesize_minimal_pthreads.json +++ b/test/codesize/test_codesize_minimal_pthreads.json @@ -1,10 +1,10 @@ { "a.out.js": 7367, "a.out.js.gz": 3603, - "a.out.nodebug.wasm": 19003, - "a.out.nodebug.wasm.gz": 8786, - "total": 26370, - "total_gz": 12389, + "a.out.nodebug.wasm": 19074, + "a.out.nodebug.wasm.gz": 8817, + "total": 26441, + "total_gz": 12420, "sent": [ "a (memory)", "b (exit)", @@ -96,6 +96,7 @@ "$_emscripten_thread_free_data", "$_emscripten_thread_init", "$_emscripten_thread_mailbox_init", + "$_emscripten_thread_notify", "$_emscripten_tls_init", "$_emscripten_yield", "$_main_thread", diff --git a/test/codesize/test_codesize_minimal_pthreads_memgrowth.json b/test/codesize/test_codesize_minimal_pthreads_memgrowth.json index 3076a1322f4a5..e2fdc1b4f1fe2 100644 --- a/test/codesize/test_codesize_minimal_pthreads_memgrowth.json +++ b/test/codesize/test_codesize_minimal_pthreads_memgrowth.json @@ -1,10 +1,10 @@ { "a.out.js": 7769, "a.out.js.gz": 3809, - "a.out.nodebug.wasm": 19004, - "a.out.nodebug.wasm.gz": 8787, - "total": 26773, - "total_gz": 12596, + "a.out.nodebug.wasm": 19075, + "a.out.nodebug.wasm.gz": 8818, + "total": 26844, + "total_gz": 12627, "sent": [ "a (memory)", "b (exit)", @@ -96,6 +96,7 @@ "$_emscripten_thread_free_data", "$_emscripten_thread_init", "$_emscripten_thread_mailbox_init", + "$_emscripten_thread_notify", "$_emscripten_tls_init", "$_emscripten_yield", "$_main_thread", diff --git a/test/codesize/test_minimal_runtime_code_size_hello_wasm_worker.json b/test/codesize/test_minimal_runtime_code_size_hello_wasm_worker.json index 678af8ef670c3..80a918a51986f 100644 --- a/test/codesize/test_minimal_runtime_code_size_hello_wasm_worker.json +++ b/test/codesize/test_minimal_runtime_code_size_hello_wasm_worker.json @@ -3,8 +3,8 @@ "a.html.gz": 355, "a.js": 956, "a.js.gz": 605, - "a.wasm": 2584, - "a.wasm.gz": 1438, - "total": 4055, - "total_gz": 2398 + "a.wasm": 2592, + "a.wasm.gz": 1447, + "total": 4063, + "total_gz": 2407 } diff --git a/test/pthread/test_pthread_cleanup.c b/test/pthread/test_pthread_cleanup.c index 9d07817f641d0..400e7d05d771f 100644 --- a/test/pthread/test_pthread_cleanup.c +++ b/test/pthread/test_pthread_cleanup.c @@ -29,10 +29,12 @@ static void cleanup_handler2(void *arg) { } static void *thread_start1(void *arg) { + printf("thread_start1\n"); pthread_cleanup_push(cleanup_handler1, (void*)(42 + (long)arg*100)); pthread_cleanup_push(cleanup_handler2, (void*)(69 + (long)arg*100)); pthread_cleanup_pop((int)(intptr_t)arg); pthread_cleanup_pop((int)(intptr_t)arg); + printf("thread_start1 done\n"); pthread_exit(0); } @@ -60,20 +62,27 @@ static void *thread_start3(void *arg) { pthread_t thr[4]; int main() { + int s; int result = 0; pthread_cleanup_push(cleanup_handler1, (void*)9998); pthread_cleanup_push(cleanup_handler1, (void*)9999); - int s = pthread_create(&thr[0], NULL, thread_start1, (void*)0); + s = pthread_create(&thr[0], NULL, thread_start1, (void*)0); assert(s == 0); + printf("joining thread 0"); pthread_join(thr[0], 0); + printf("done join"); s = pthread_create(&thr[1], NULL, thread_start1, (void*)1); assert(s == 0); + printf("joining thread 1"); pthread_join(thr[1], 0); + printf("done join\n"); s = pthread_create(&thr[2], NULL, thread_start2, (void*)1); assert(s == 0); + printf("joining thread 2"); pthread_join(thr[2], 0); + printf("done join"); // TODO // s = pthread_create(&thr[3], NULL, thread_start3, (void*)1); // assert(s == 0); diff --git a/test/pthread/test_pthread_kill.out b/test/pthread/test_pthread_kill.out index 9c0d30aee8994..f7a70be40ea36 100644 --- a/test/pthread/test_pthread_kill.out +++ b/test/pthread/test_pthread_kill.out @@ -4,6 +4,6 @@ thread has started, sending SIGTERM SIGTERM sent signal: 15 onthread=1 got term signal, sending signal back to main thread -joined child_thread signal: 10 onthread=0 +joined child_thread got SIGUSR1. all done. diff --git a/test/pthread/test_pthread_mandelbrot.cpp b/test/pthread/test_pthread_mandelbrot.cpp index d0edb4a3d682c..69f8726b4f088 100644 --- a/test/pthread/test_pthread_mandelbrot.cpp +++ b/test/pthread/test_pthread_mandelbrot.cpp @@ -307,7 +307,9 @@ void *mandelbrot_thread(void *arg) for(;;) { - emscripten_futex_wait(&tasksPending[idx], 0, INFINITY); + while (tasksPending[idx] == 0) { + emscripten_futex_wait(&tasksPending[idx], 0, INFINITY); + } tasksPending[idx] = 0; double t0 = emscripten_get_now(); int ni; diff --git a/test/pthread/test_pthread_proxying_in_futex_wait.cpp b/test/pthread/test_pthread_proxying_in_futex_wait.cpp index 605b75bc2be51..9190a65feba6f 100644 --- a/test/pthread/test_pthread_proxying_in_futex_wait.cpp +++ b/test/pthread/test_pthread_proxying_in_futex_wait.cpp @@ -35,14 +35,16 @@ int main() pthread_t thread; int rc = pthread_create(&thread, NULL, ThreadMain, 0); assert(rc == 0); - rc = emscripten_futex_wait(&main_thread_wait_val, 1, 15 * 1000); - // An rc of 0 means no error, and of EWOULDBLOCK means that the value is - // not the expected one, which can happen if the pthread manages to set it - // before we reach the futex_wait. - if (rc != 0 && rc != -EWOULDBLOCK) - { - printf("ERROR! futex wait errored %d!\n", rc); - return 2; + while (main_thread_wait_val != 0) { + rc = emscripten_futex_wait(&main_thread_wait_val, 1, 15 * 1000); + // An rc of 0 means no error, and of EWOULDBLOCK means that the value is + // not the expected one, which can happen if the pthread manages to set it + // before we reach the futex_wait. + if (rc != 0 && rc != -EWOULDBLOCK) + { + printf("ERROR! futex wait errored %d!\n", rc); + return 2; + } } pthread_join(thread, 0);