Skip to content

Long-running jobs lack heartbeat/timeout #48

@codihuston

Description

@codihuston

Cause: No task-level heartbeat or execution timeout policy.
Analysis: Long tasks can stall pools and mask failures.
Acceptance Criteria:

  • Task execution times out or emits heartbeat signals.
  • Stuck tasks are detected and retried or failed safely.
    Solution Approach:
  • Add per-task timeouts and heartbeat updates.
  • Monitor for stale "running" tasks.

Metadata

Metadata

Assignees

No one assigned

    Labels

    Projects

    No projects

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions