fix(typescript-client): quarantine aborted responses#4115
Open
KyleAMathews wants to merge 7 commits intomainfrom
Open
fix(typescript-client): quarantine aborted responses#4115KyleAMathews wants to merge 7 commits intomainfrom
KyleAMathews wants to merge 7 commits intomainfrom
Conversation
commit: |
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #4115 +/- ##
==========================================
- Coverage 89.20% 88.92% -0.29%
==========================================
Files 25 25
Lines 2520 2754 +234
Branches 640 726 +86
==========================================
+ Hits 2248 2449 +201
- Misses 270 301 +31
- Partials 2 4 +2
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fixes one aborted-request race in
ShapeStreamand adds much denser generation-aware diagnostics for the remaining field failures we are still chasing.This matters for desktop runtimes too, including Tauri/WebView, where abort semantics can be less strict than a normal browser tab. When a late or parallel request generation collides with a newer one, the stream can already be in
ErrorState, producing repeated:[Electric] Response was ignored by state "error"and silently dropping updates until a hard refresh.
This PR also adds an opt-in client diagnostics mode that can be enabled from
localStorage, refreshed, and used to capture denser request / response / state logs in the field without changing app code.It now also reduces
onError -> {}retry churn substantially:3consecutive retries instead of50console.logwith the full triggering error object attachedRoot Cause
ShapeStreamassumed that once a request signal was aborted, that request could no longer deliver a successful response into the state machine.That assumption was not fully enforced in the fetch wrapper chain.
So the client could hit this sequence:
ErrorState200ErrorStateSeparately, once an app-specific
onErrorhandler returned{}to request a restart, the client would restart immediately. Because4xxresponses other than429bypass transport backoff, a persistent400/ malformed200path could churn quickly.Further investigation showed that we also need better evidence for other possible failure shapes, especially:
response/messages/sseClose) landing after the stream already enteredErrorStateThis PR now instruments those paths directly.
Fix
signal.abortedafter fetch resolution increateFetchWithBackoffsignal.abortedbefore and after body consumption increateFetchWithConsumedMessagesSPEC.mdto document the transport invariant: aborted requests are quarantined before state-machine deliveryErrorStatelocalStorage.setItem('electric.debug', 'true')orlocalStorage.setItem('debug', 'electric*')console.infoline and rate-limiting verboseconsole.debugoutputonError -> {}retry loop from50to3console.logon everyonError-driven restart and attach the full error object to that log entryrequest:dispatch,response:headers,messages:batch,sse:closed, ignored responses, andErrorStateentrylastErrorGeneration,lastErrorRequestId,lastErrorName) so later warnings can be tied back to the original failureconsole.debugEnable Diagnostics
No app code change is required. In the affected client runtime:
Then refresh / reload the app.
When diagnostics are enabled, the client now prints one visible
console.infoline confirming that diagnostics are active. Detailed per-request diagnostics are emitted atconsole.debug/Verboselevel in DevTools, and are rate-limited to avoid overwhelming a tight-looping runtime.Logs To Expect
With this PR, the app team should now see logs like these when a stream goes bad:
1. Diagnostics mode turned on successfully
If that line does not appear after refresh, diagnostics were not enabled early enough for stream construction.
2. First transition into
ErrorStateThis is the most important warning. It tells you:
previousState)errorGeneration,errorRequestId,errorTransport)isActiveGenerationError)activeGeneration)lastErrorGeneration,lastErrorRequestId,lastErrorName)currentUrl,shapeKey)handle,offset,cursor)errorName,errorMessage)3. When the app's
onErrorrequests an immediate restartThis is emitted with
console.log, and the fullError/FetchErrorobject is attached as the second console argument so DevTools should show the full stack, status, headers, and any parsed body.4. If a stale request generation surfaces an error after a newer generation is already active
If this appears, the most important fields to capture are the active generation, the stale generation, and the request ids around the first
Entered error statewarning.5. If a response is ignored while already in
ErrorStateThis warning is now rate-limited much more aggressively so a pathological loop does not bury the useful logs. The per-event generation details are still available in
console.debug.6. Generation-aware debug lines around the failure
These are the most useful lines for distinguishing:
7. If the app's
onErrorkeeps retrying and the client finally gives upThat is especially relevant for apps that return
{}fromonErroron unknown sync failures.8. If the runtime is in a pathological loop and verbose diagnostics are being throttled
What To Capture From The Tester
If the user can keep the runtime alive long enough, the most helpful things to screenshot or copy are:
Entered error statewarning, not just the later ignored-response warningsA stale request generation surfaced an error...warningrequest:dispatch,response:headers,messages:batch, andsse:closeddebug lines immediately before the first failuresnapshot:pause-acquired,snapshot:fetch:start,snapshot:fetch:409,snapshot:error, orsnapshot:pause-releaseddebug lines if the page uses live queries / subsetsSnapshot "snapshot-N" has held the pause lock for 30swarning, especially with the attachedsubset=...summaryconsole.infoline appeared after refresh3retries or keeps recreating entirely new streamsVerification
Both pass (
375unit tests).