[copilot-finds] Bug: ExportHistoryJobClient.delete() bare catch swallows network and auth errors #212

@github-actions

Problem

ExportHistoryJobClient.delete() in packages/durabletask-js-export-history/src/client/export-history-client.ts (lines 229-235) has a bare catch {} that silently swallows all errors from three sequential gRPC operations:

try {
  await this.client.terminateOrchestration(orchestrationInstanceId, "Export job deleted");
  await this.client.waitForOrchestrationCompletion(orchestrationInstanceId, false, 30);
  await this.client.purgeOrchestration(orchestrationInstanceId);
} catch {
  // Orchestration instance doesn't exist or already purged - this is expected
}

The comment explains the intent: the linked export orchestration may not exist, so a "not found" error is expected. However, the bare catch {} also swallows:

  • Network failures (gRPC UNAVAILABLE) — connection lost during cleanup
  • Authentication errors (gRPC UNAUTHENTICATED) — token expired during cleanup
  • Timeout errors: waitForOrchestrationCompletion exceeded its 30s deadline
  • Server errors (gRPC INTERNAL) — unexpected backend failures

Root Cause

The catch block does not discriminate between expected "not found" errors (gRPC status code 5) and unexpected errors. All errors are treated as benign.

Proposed Fix

Replace the bare catch {} with a catch that inspects the error:

  1. If the error has a gRPC code property equal to 5 (NOT_FOUND), swallow it — this is the expected "orchestration doesn't exist" case.
  2. For all other errors, re-throw so the caller can handle or log them.

Add a local isNotFoundError() helper function to perform the check without introducing a direct @grpc/grpc-js dependency in the export-history package.
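
The proposed fix could look roughly like the sketch below. This is illustrative only, not the actual patch: the constant name GRPC_STATUS_NOT_FOUND, the OrchestrationCleanupClient interface, and the standalone cleanupOrchestration() function are assumptions made so the example is self-contained. It also assumes the thrown error exposes the gRPC status as a numeric code property. In the real class the try/catch would stay inside delete() and use this.client.

```typescript
// gRPC status code for NOT_FOUND, inlined so the package does not need to
// import @grpc/grpc-js directly. (Constant name is an assumption.)
const GRPC_STATUS_NOT_FOUND = 5;

// True only when `err` is an object whose `code` property equals NOT_FOUND.
function isNotFoundError(err: unknown): boolean {
  return (
    typeof err === "object" &&
    err !== null &&
    "code" in err &&
    (err as { code?: unknown }).code === GRPC_STATUS_NOT_FOUND
  );
}

// Hypothetical stand-in for the three client methods that delete() calls.
// The parameter names and return types here are assumptions.
interface OrchestrationCleanupClient {
  terminateOrchestration(instanceId: string, output?: unknown): Promise<unknown>;
  waitForOrchestrationCompletion(
    instanceId: string,
    fetchPayloads?: boolean,
    timeout?: number,
  ): Promise<unknown>;
  purgeOrchestration(instanceId: string): Promise<unknown>;
}

// Same terminate/wait/purge sequence as the original snippet, but only the
// expected NOT_FOUND case is swallowed; everything else is re-thrown.
async function cleanupOrchestration(
  client: OrchestrationCleanupClient,
  orchestrationInstanceId: string,
): Promise<void> {
  try {
    await client.terminateOrchestration(orchestrationInstanceId, "Export job deleted");
    await client.waitForOrchestrationCompletion(orchestrationInstanceId, false, 30);
    await client.purgeOrchestration(orchestrationInstanceId);
  } catch (err) {
    if (!isNotFoundError(err)) {
      throw err; // network, auth, timeout, and server errors reach the caller
    }
    // NOT_FOUND: orchestration doesn't exist or was already purged (expected)
  }
}
```

Checking a plain code property keeps the helper free of any gRPC import. The trade-off is that an unrelated error which happens to carry code === 5 would also be swallowed.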

Impact

Severity: Medium. When network or auth errors occur during job deletion:

  • The caller receives no indication that cleanup failed
  • The export orchestration may continue running after the entity is marked deleted (state inconsistency)
  • Stale orchestration instances are never purged (resource leak)
  • Operational visibility is lost — errors that should trigger retries or alerts are silently dropped

Affected scenarios: Any ExportHistoryJobClient.delete() call where the gRPC connection fails, tokens expire, or the backend errors during the terminate/wait/purge cleanup phase.

Metadata

Labels: copilot-finds (Findings from daily automated code review agent)
