fix: cdp retry on disconnect and crash #357

Merged
shadowfax92 merged 3 commits into main from feat/fix-cdp-reconnect
Feb 23, 2026

Conversation

@shadowfax92
Contributor

No description provided.

@github-actions github-actions bot added the fix label Feb 23, 2026
@greptile-apps
Contributor

greptile-apps bot commented Feb 23, 2026

Greptile Summary

This PR adds robust retry logic for CDP connection failures and unexpected disconnections. The implementation properly handles connection state with flags to prevent concurrent operations, rejects pending requests when connection is lost, and implements a crash-recovery pattern that exits the process for external restart after retries are exhausted.

Key improvements:

  • Added retry logic to initial connection with configurable max retries (3) and delay (1s)
  • Implemented automatic reconnection on unexpected websocket close with same retry parameters
  • Added reconnecting and disconnecting flags to prevent concurrent connection attempts
  • Pending requests are now properly rejected with error when connection is lost, preventing indefinite hangs
  • Process exits with EXIT_CODES.GENERAL_ERROR after max retries to trigger external restart
  • Configuration constants moved to shared package following project conventions
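
The initial-connection retry described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code from cdp.ts: `connectWithRetry`, `sleep`, and both constants are hypothetical names, and only the retry count (3) and delay (1s) come from the summary. As a simplification, the sketch skips the wait after the final failed attempt.

```typescript
// Hypothetical constants; the summary only states "max retries (3) and delay (1s)".
const CONNECT_MAX_RETRIES = 3
const CONNECT_RETRY_DELAY_MS = 1000

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Calls `attemptConnect` up to `maxRetries` times, waiting `delayMs` between
// failed attempts. Resolves with the number of attempts used, or throws once
// every attempt has failed.
async function connectWithRetry(
  attemptConnect: () => Promise<void>,
  maxRetries: number = CONNECT_MAX_RETRIES,
  delayMs: number = CONNECT_RETRY_DELAY_MS,
): Promise<number> {
  let lastError: unknown
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    try {
      await attemptConnect()
      return attempt
    } catch (error) {
      lastError = error
      // No point in sleeping after the last attempt
      if (attempt < maxRetries) await sleep(delayMs)
    }
  }
  const msg = lastError instanceof Error ? lastError.message : String(lastError)
  throw new Error(`CDP connection failed after ${maxRetries} attempts: ${msg}`)
}
```

Passing the connect function in as a parameter keeps the retry policy testable without a real WebSocket.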

Review notes:

  • Previous review concerns about concurrent reconnection attempts and hanging requests have been addressed
  • The synchronous flag setting before async operations correctly prevents race conditions in JavaScript's single-threaded event loop
  • The implementation follows CLAUDE.md guidelines for centralized constants
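
The point about synchronous flag setting can be illustrated with a small stand-alone sketch. `ReconnectGuard` and its members are hypothetical names, and the injected `reconnect` function stands in for the real reconnection routine; the sketch only shows why a second close event arriving right after the first does not start a second reconnection.

```typescript
// Hypothetical stand-in for the CDP backend's close handling.
class ReconnectGuard {
  private disconnecting = false
  private reconnecting = false
  private readonly reconnect: () => Promise<void>
  reconnectRuns = 0

  constructor(reconnect: () => Promise<void>) {
    this.reconnect = reconnect
  }

  handleUnexpectedClose(): void {
    if (this.disconnecting || this.reconnecting) return
    // Set synchronously, before any async work starts, so a second close
    // event handled later in the same tick already sees the flag.
    this.reconnecting = true
    this.reconnectRuns++
    void this.reconnect()
      .catch(() => {
        // Sketch only: swallow failures so the flag is still cleared below
      })
      .finally(() => {
        this.reconnecting = false
      })
  }

  isReconnecting(): boolean {
    return this.reconnecting
  }
}
```

Calling `handleUnexpectedClose()` twice in a row runs the injected `reconnect` once; a later close event, after the first reconnection has settled, starts a new one.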

Confidence Score: 5/5

  • This PR is safe to merge - all previous review concerns have been properly addressed
  • The implementation correctly addresses all previous review feedback. The retry/reconnection logic properly manages state with flags, rejects pending requests, and follows project conventions. The flag management pattern is correct due to JavaScript's single-threaded execution model - flags are set synchronously before any async work begins.
  • No files require special attention

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/server/src/browser/backends/cdp.ts | Added retry logic for CDP connection/reconnection with proper state management and pending request cleanup. Implementation addresses previous review feedback but has one minor state management concern. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[CDP Connection] -->|Close Event| B{Check Flags}
    B -->|disconnecting=true| C[Ignore]
    B -->|reconnecting=true| C
    B -->|Both false| D[Set reconnecting=true]
    D --> E[Reject Pending Requests]
    E --> F[Start Reconnection Loop]
    F --> G{Attempt < Max Retries?}
    G -->|Yes| H[Wait Retry Delay]
    H --> I[Attempt Connect]
    I -->|Success| J[Log Success]
    J --> K[Clear reconnecting flag]
    I -->|Fail| G
    G -->|No| L[Log Error]
    L --> M[Exit Process]
    
    N[Initial Connection] --> O{Attempt < Max Retries?}
    O -->|Yes| P[Attempt Connect]
    P -->|Success| Q[Return]
    P -->|Fail| R[Wait Retry Delay]
    R --> O
    O -->|No| S[Throw Error]
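
The close-event branch of the flowchart (reject pending requests, then wait, attempt, and exit when retries run out) might look like the sketch below. All names here are assumptions for illustration: `PendingRequest`, `rejectAllPending`, and `reconnectOrExit` are hypothetical, the exit function is injected rather than calling `process.exit`, and the exit code `1` is a placeholder because the value of `EXIT_CODES.GENERAL_ERROR` is not shown in this PR.

```typescript
type PendingRequest = {
  resolve: (value: unknown) => void
  reject: (error: Error) => void
}

const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Rejects every in-flight request so callers fail fast instead of hanging.
// Returns how many requests were rejected.
function rejectAllPending(
  pending: Map<number, PendingRequest>,
  reason: string,
): number {
  const count = pending.size
  pending.forEach((request) => request.reject(new Error(reason)))
  pending.clear()
  return count
}

// Waits, then attempts to connect, up to `maxRetries` times. Resolves to true
// on success; otherwise calls `exit` (placeholder code 1) and resolves to false.
async function reconnectOrExit(
  attemptConnect: () => Promise<void>,
  exit: (code: number) => void,
  maxRetries: number = 3,
  retryDelayMs: number = 1000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxRetries; attempt++) {
    await sleep(retryDelayMs) // the flowchart waits before each attempt
    try {
      await attemptConnect()
      return true
    } catch (error) {
      // Fall through to the next attempt
    }
  }
  exit(1)
  return false
}
```

Unlike the initial-connection path, which throws to its caller, this path hands control to an external supervisor by exiting.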

Last reviewed commit: 5861e0a

@greptile-apps greptile-apps bot left a comment

2 files reviewed, 3 comments

logger.error(
  'CDP WebSocket closed unexpectedly, attempting reconnection...',
)
this.reconnectOrCrash()

reconnectOrCrash() is not awaited, allowing multiple concurrent reconnection attempts if close events fire rapidly

Suggested change
- this.reconnectOrCrash()
+ await this.reconnectOrCrash()
Path: apps/server/src/browser/backends/cdp.ts
Line: 104

Comment on lines 107 to 130
  private async reconnectOrCrash(): Promise<void> {
    const maxRetries = TIMEOUTS.CDP_CONNECT_MAX_RETRIES
    const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY

    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        logger.info(`CDP reconnection attempt ${attempt}/${maxRetries}...`)
        await sleep(retryDelay)
        await this.attemptConnect()
        logger.info('CDP reconnected successfully')
        return
      } catch (error) {
        const msg = error instanceof Error ? error.message : String(error)
        logger.warn(
          `CDP reconnection attempt ${attempt}/${maxRetries} failed: ${msg}`,
        )
      }
    }

- this.ws.onmessage = (event) => {
-   this.handleMessage(event.data as string)
- }
- })
    logger.error(
      `CDP reconnection failed after ${maxRetries} attempts, exiting for restart`,
    )
    process.exit(EXIT_CODES.GENERAL_ERROR)
  }

pending requests are never rejected when reconnection starts, causing them to hang indefinitely if the connection drops mid-request

Comment on lines 98 to 105
  private handleUnexpectedClose(): void {
    if (this.disconnecting) return

- this.ws.onerror = (event) => {
-   reject(new Error(`CDP WebSocket error: ${event}`))
- }
    logger.error(
      'CDP WebSocket closed unexpectedly, attempting reconnection...',
    )
    this.reconnectOrCrash()
  }

add guard to prevent multiple concurrent reconnection attempts

Suggested change
- private handleUnexpectedClose(): void {
-   if (this.disconnecting) return
-   this.ws.onerror = (event) => {
-     reject(new Error(`CDP WebSocket error: ${event}`))
-   }
-   logger.error(
-     'CDP WebSocket closed unexpectedly, attempting reconnection...',
-   )
-   this.reconnectOrCrash()
- }
+ private reconnecting = false
+ private handleUnexpectedClose(): void {
+   if (this.disconnecting || this.reconnecting) return
+   logger.error(
+     'CDP WebSocket closed unexpectedly, attempting reconnection...',
+   )
+   this.reconnecting = true
+   this.reconnectOrCrash().finally(() => {
+     this.reconnecting = false
+   })
+ }

@shadowfax92
Contributor Author

@greptileai review

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 3 comments

logger.error(
  'CDP WebSocket closed unexpectedly, attempting reconnection...',
)
this.reconnectOrCrash()

reconnectOrCrash() not awaited - if multiple close events fire, multiple concurrent reconnection attempts will run

Suggested change
- this.reconnectOrCrash()
+ await this.reconnectOrCrash()
Path: apps/server/src/browser/backends/cdp.ts
Line: 105

Comment on lines 108 to 131
  private async reconnectOrCrash(): Promise<void> {
    const maxRetries = CDP_LIMITS.CONNECT_MAX_RETRIES
    const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY

    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      try {
        logger.info(`CDP reconnection attempt ${attempt}/${maxRetries}...`)
        await Bun.sleep(retryDelay)
        await this.attemptConnect()
        logger.info('CDP reconnected successfully')
        return
      } catch (error) {
        const msg = error instanceof Error ? error.message : String(error)
        logger.warn(
          `CDP reconnection attempt ${attempt}/${maxRetries} failed: ${msg}`,
        )
      }
    }

- this.ws.onmessage = (event) => {
-   this.handleMessage(event.data as string)
- }
- })
    logger.error(
      `CDP reconnection failed after ${maxRetries} attempts, exiting for restart`,
    )
    process.exit(EXIT_CODES.GENERAL_ERROR)
  }

No guard prevents concurrent reconnection attempts. Add a reconnecting flag similar to disconnecting:

private reconnecting = false

private async reconnectOrCrash(): Promise<void> {
  if (this.reconnecting) return
  this.reconnecting = true
  
  try {
    const maxRetries = CDP_LIMITS.CONNECT_MAX_RETRIES
    const retryDelay = TIMEOUTS.CDP_CONNECT_RETRY_DELAY

    for (let attempt = 1; attempt <= maxRetries; attempt++) {
      // ... existing retry logic
    }
    
    logger.error(
      `CDP reconnection failed after ${maxRetries} attempts, exiting for restart`,
    )
    process.exit(EXIT_CODES.GENERAL_ERROR)
  } finally {
    this.reconnecting = false
  }
}

Comment on lines 99 to 106
  private handleUnexpectedClose(): void {
    if (this.disconnecting) return

- this.ws.onerror = (event) => {
-   reject(new Error(`CDP WebSocket error: ${event}`))
- }
    logger.error(
      'CDP WebSocket closed unexpectedly, attempting reconnection...',
    )
    this.reconnectOrCrash()
  }

Pending requests never rejected when connection drops - they hang indefinitely. Reject all pending requests before reconnecting:

private handleUnexpectedClose(): void {
  if (this.disconnecting) return

  // Reject all pending requests
  for (const request of this.pending.values()) {
    request.reject(new Error('CDP connection lost'))
  }
  this.pending.clear()

  logger.error(
    'CDP WebSocket closed unexpectedly, attempting reconnection...',
  )
  // The handler is synchronous, so the promise cannot be awaited here
  void this.reconnectOrCrash()
}

@shadowfax92 shadowfax92 reopened this Feb 23, 2026
@shadowfax92 shadowfax92 merged commit f27858f into main Feb 23, 2026
10 of 12 checks passed

@greptile-apps greptile-apps bot left a comment

3 files reviewed, 1 comment

Comment on lines +108 to 111
this.reconnecting = true
this.reconnectOrCrash().finally(() => {
  this.reconnecting = false
})

reconnecting flag should be set inside the promise chain to prevent race condition. If reconnectOrCrash() Promise is created but .finally() runs synchronously before async work starts, the flag could be cleared prematurely.

Suggested change
- this.reconnecting = true
- this.reconnectOrCrash().finally(() => {
-   this.reconnecting = false
- })
+ logger.error(
+   'CDP WebSocket closed unexpectedly, attempting reconnection...',
+ )
+ void (async () => {
+   this.reconnecting = true
+   try {
+     await this.reconnectOrCrash()
+   } finally {
+     this.reconnecting = false
+   }
+ })()
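
The ordering question raised in this comment can be checked with a small stand-alone sketch. It relies only on standard Promise semantics, under which a `.finally()` callback runs after the promise settles and never synchronously at registration time. `demoFinallyOrdering` and the local `reconnect` function are hypothetical stand-ins, not code from cdp.ts.

```typescript
// Records the order in which the flag is set, observed, and cleared.
async function demoFinallyOrdering(): Promise<string[]> {
  const events: string[] = []
  let reconnecting = false

  const reconnect = async (): Promise<void> => {
    events.push(`reconnect started, flag=${reconnecting}`)
    await Promise.resolve() // stand-in for the real async connection work
    events.push(`reconnect finished, flag=${reconnecting}`)
  }

  reconnecting = true
  const done = reconnect().finally(() => {
    reconnecting = false
    events.push('finally ran')
  })
  events.push(`handler returned, flag=${reconnecting}`)

  await done
  return events
}

demoFinallyOrdering().then((events) => console.log(events.join('\n')))
// Prints:
// reconnect started, flag=true
// handler returned, flag=true
// reconnect finished, flag=true
// finally ran
```

In this sketch the flag stays `true` for the whole reconnection and is cleared only once the returned promise settles, which matches the behaviour the original `.finally()` pattern depends on.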

1 participant