fix: cdp retry on disconnect and crash#357
Conversation
Greptile SummaryThis PR adds robust retry logic for CDP connection failures and unexpected disconnections. The implementation properly handles connection state with flags to prevent concurrent operations, rejects pending requests when connection is lost, and implements a crash-recovery pattern that exits the process for external restart after retries are exhausted. Key improvements:
Review notes:
Confidence Score: 5/5
Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A[CDP Connection] -->|Close Event| B{Check Flags}
B -->|disconnecting=true| C[Ignore]
B -->|reconnecting=true| C
B -->|Both false| D[Set reconnecting=true]
D --> E[Reject Pending Requests]
E --> F[Start Reconnection Loop]
F --> G{Attempt < Max Retries?}
G -->|Yes| H[Wait Retry Delay]
H --> I[Attempt Connect]
I -->|Success| J[Log Success]
J --> K[Clear reconnecting flag]
I -->|Fail| G
G -->|No| L[Log Error]
L --> M[Exit Process]
N[Initial Connection] --> O{Attempt < Max Retries?}
O -->|Yes| P[Attempt Connect]
P -->|Success| Q[Return]
P -->|Fail| R[Wait Retry Delay]
R --> O
O -->|No| S[Throw Error]
Last reviewed commit: 5861e0a |
| logger.error( | ||
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | ||
| ) | ||
| this.reconnectOrCrash() |
There was a problem hiding this comment.
reconnectOrCrash() is not awaited, allowing multiple concurrent reconnection attempts if close events fire rapidly
| this.reconnectOrCrash() | |
| await this.reconnectOrCrash() |
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 104
Comment:
`reconnectOrCrash()` is not awaited, allowing multiple concurrent reconnection attempts if close events fire rapidly
```suggestion
await this.reconnectOrCrash()
```
How can I resolve this? If you propose a fix, please make it concise.| private async reconnectOrCrash(): Promise<void> { | ||
| const maxRetries = TIMEOUTS.CDP_CONNECT_MAX_RETRIES | ||
| const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY | ||
|
|
||
| for (let attempt = 1; attempt <= maxRetries; attempt++) { | ||
| try { | ||
| logger.info(`CDP reconnection attempt ${attempt}/${maxRetries}...`) | ||
| await sleep(retryDelay) | ||
| await this.attemptConnect() | ||
| logger.info('CDP reconnected successfully') | ||
| return | ||
| } catch (error) { | ||
| const msg = error instanceof Error ? error.message : String(error) | ||
| logger.warn( | ||
| `CDP reconnection attempt ${attempt}/${maxRetries} failed: ${msg}`, | ||
| ) | ||
| } | ||
| } | ||
|
|
||
| this.ws.onmessage = (event) => { | ||
| this.handleMessage(event.data as string) | ||
| } | ||
| }) | ||
| logger.error( | ||
| `CDP reconnection failed after ${maxRetries} attempts, exiting for restart`, | ||
| ) | ||
| process.exit(EXIT_CODES.GENERAL_ERROR) | ||
| } |
There was a problem hiding this comment.
pending requests are never rejected when reconnection starts, causing them to hang indefinitely if the connection drops mid-request
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 107-130
Comment:
pending requests are never rejected when reconnection starts, causing them to hang indefinitely if the connection drops mid-request
How can I resolve this? If you propose a fix, please make it concise.| private handleUnexpectedClose(): void { | ||
| if (this.disconnecting) return | ||
|
|
||
| this.ws.onerror = (event) => { | ||
| reject(new Error(`CDP WebSocket error: ${event}`)) | ||
| } | ||
| logger.error( | ||
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | ||
| ) | ||
| this.reconnectOrCrash() | ||
| } |
There was a problem hiding this comment.
add guard to prevent multiple concurrent reconnection attempts
| private handleUnexpectedClose(): void { | |
| if (this.disconnecting) return | |
| this.ws.onerror = (event) => { | |
| reject(new Error(`CDP WebSocket error: ${event}`)) | |
| } | |
| logger.error( | |
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | |
| ) | |
| this.reconnectOrCrash() | |
| } | |
| private reconnecting = false | |
| private handleUnexpectedClose(): void { | |
| if (this.disconnecting || this.reconnecting) return | |
| logger.error( | |
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | |
| ) | |
| this.reconnecting = true | |
| this.reconnectOrCrash().finally(() => { | |
| this.reconnecting = false | |
| }) | |
| } |
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 98-105
Comment:
add guard to prevent multiple concurrent reconnection attempts
```suggestion
private reconnecting = false
private handleUnexpectedClose(): void {
if (this.disconnecting || this.reconnecting) return
logger.error(
'CDP WebSocket closed unexpectedly, attempting reconnection...',
)
this.reconnecting = true
this.reconnectOrCrash().finally(() => {
this.reconnecting = false
})
}
```
How can I resolve this? If you propose a fix, please make it concise.|
@greptileai review |
| logger.error( | ||
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | ||
| ) | ||
| this.reconnectOrCrash() |
There was a problem hiding this comment.
reconnectOrCrash() not awaited - if multiple close events fire, multiple concurrent reconnection attempts will run
| this.reconnectOrCrash() | |
| await this.reconnectOrCrash() |
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 105
Comment:
`reconnectOrCrash()` not awaited - if multiple close events fire, multiple concurrent reconnection attempts will run
```suggestion
await this.reconnectOrCrash()
```
How can I resolve this? If you propose a fix, please make it concise.| private async reconnectOrCrash(): Promise<void> { | ||
| const maxRetries = CDP_LIMITS.CONNECT_MAX_RETRIES | ||
| const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY | ||
|
|
||
| for (let attempt = 1; attempt <= maxRetries; attempt++) { | ||
| try { | ||
| logger.info(`CDP reconnection attempt ${attempt}/${maxRetries}...`) | ||
| await Bun.sleep(retryDelay) | ||
| await this.attemptConnect() | ||
| logger.info('CDP reconnected successfully') | ||
| return | ||
| } catch (error) { | ||
| const msg = error instanceof Error ? error.message : String(error) | ||
| logger.warn( | ||
| `CDP reconnection attempt ${attempt}/${maxRetries} failed: ${msg}`, | ||
| ) | ||
| } | ||
| } | ||
|
|
||
| this.ws.onmessage = (event) => { | ||
| this.handleMessage(event.data as string) | ||
| } | ||
| }) | ||
| logger.error( | ||
| `CDP reconnection failed after ${maxRetries} attempts, exiting for restart`, | ||
| ) | ||
| process.exit(EXIT_CODES.GENERAL_ERROR) | ||
| } |
There was a problem hiding this comment.
No guard prevents concurrent reconnection attempts. Add a reconnecting flag similar to disconnecting:
private reconnecting = false
private async reconnectOrCrash(): Promise<void> {
if (this.reconnecting) return
this.reconnecting = true
try {
const maxRetries = CDP_LIMITS.CONNECT_MAX_RETRIES
const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY
for (let attempt = 1; attempt <= maxRetries; attempt++) {
// ... existing retry logic
}
logger.error(
`CDP reconnection failed after ${maxRetries} attempts, exiting for restart`,
)
process.exit(EXIT_CODES.GENERAL_ERROR)
} finally {
this.reconnecting = false
}
}Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 108-131
Comment:
No guard prevents concurrent reconnection attempts. Add a `reconnecting` flag similar to `disconnecting`:
```typescript
private reconnecting = false
private async reconnectOrCrash(): Promise<void> {
if (this.reconnecting) return
this.reconnecting = true
try {
const maxRetries = CDP_LIMITS.CONNECT_MAX_RETRIES
const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY
for (let attempt = 1; attempt <= maxRetries; attempt++) {
// ... existing retry logic
}
logger.error(
`CDP reconnection failed after ${maxRetries} attempts, exiting for restart`,
)
process.exit(EXIT_CODES.GENERAL_ERROR)
} finally {
this.reconnecting = false
}
}
```
How can I resolve this? If you propose a fix, please make it concise.| private handleUnexpectedClose(): void { | ||
| if (this.disconnecting) return | ||
|
|
||
| this.ws.onerror = (event) => { | ||
| reject(new Error(`CDP WebSocket error: ${event}`)) | ||
| } | ||
| logger.error( | ||
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | ||
| ) | ||
| this.reconnectOrCrash() | ||
| } |
There was a problem hiding this comment.
Pending requests never rejected when connection drops - they hang indefinitely. Reject all pending requests before reconnecting:
private handleUnexpectedClose(): void {
if (this.disconnecting) return
// Reject all pending requests
for (const [id, request] of this.pending.entries()) {
request.reject(new Error('CDP connection lost'))
}
this.pending.clear()
logger.error(
'CDP WebSocket closed unexpectedly, attempting reconnection...',
)
await this.reconnectOrCrash()
}Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 99-106
Comment:
Pending requests never rejected when connection drops - they hang indefinitely. Reject all pending requests before reconnecting:
```typescript
private handleUnexpectedClose(): void {
if (this.disconnecting) return
// Reject all pending requests
for (const [id, request] of this.pending.entries()) {
request.reject(new Error('CDP connection lost'))
}
this.pending.clear()
logger.error(
'CDP WebSocket closed unexpectedly, attempting reconnection...',
)
await this.reconnectOrCrash()
}
```
How can I resolve this? If you propose a fix, please make it concise.| this.reconnecting = true | ||
| this.reconnectOrCrash().finally(() => { | ||
| this.reconnecting = false | ||
| }) |
There was a problem hiding this comment.
reconnecting flag should be set inside the promise chain to prevent race condition. If reconnectOrCrash() Promise is created but .finally() runs synchronously before async work starts, the flag could be cleared prematurely.
| this.reconnecting = true | |
| this.reconnectOrCrash().finally(() => { | |
| this.reconnecting = false | |
| }) | |
| logger.error( | |
| 'CDP WebSocket closed unexpectedly, attempting reconnection...', | |
| ) | |
| void (async () => { | |
| this.reconnecting = true | |
| try { | |
| await this.reconnectOrCrash() | |
| } finally { | |
| this.reconnecting = false | |
| } | |
| })() |
Prompt To Fix With AI
This is a comment left during a code review.
Path: apps/server/src/browser/backends/cdp.ts
Line: 108-111
Comment:
`reconnecting` flag should be set inside the promise chain to prevent race condition. If `reconnectOrCrash()` Promise is created but `.finally()` runs synchronously before async work starts, the flag could be cleared prematurely.
```suggestion
logger.error(
'CDP WebSocket closed unexpectedly, attempting reconnection...',
)
void (async () => {
this.reconnecting = true
try {
await this.reconnectOrCrash()
} finally {
this.reconnecting = false
}
})()
```
How can I resolve this? If you propose a fix, please make it concise.
No description provided.