Conversation
Reviewer's GuideImplements robust handling and surface visibility for TES task submission failures by logging failed submissions as SUBMISSION_FAILED tasks, exposing all error states in the UI, and preventing background status polling for tasks that never reached TES; also fixes command handling for the demo Python script task submission path. Sequence diagram for TES task submission with SUBMISSION_FAILED handlingsequenceDiagram
actor User
participant FrontendTasksPage
participant BackendTasksRoute
participant TaskService
participant TESInstance
User->>FrontendTasksPage: Submit new task (command, tes_url, tes_name)
FrontendTasksPage->>BackendTasksRoute: POST /tasks with task payload
Note over BackendTasksRoute: Build executor command
BackendTasksRoute->>BackendTasksRoute: Normalize command using shlex
Note over BackendTasksRoute: Check TES service-info endpoints
BackendTasksRoute->>TESInstance: GET /service-info (one or more endpoints)
TESInstance-->>BackendTasksRoute: Connectivity failure for all endpoints
alt No TES endpoint reachable
BackendTasksRoute->>BackendTasksRoute: Create failed_task with state SUBMISSION_FAILED
BackendTasksRoute->>TaskService: add_task(failed_task)
BackendTasksRoute-->>FrontendTasksPage: 503 with error details and task_id
FrontendTasksPage-->>User: Show SUBMISSION_FAILED task in list
else At least one endpoint reachable
BackendTasksRoute->>TESInstance: POST /tasks
TESInstance-->>BackendTasksRoute: HTTP error 4xx or 5xx
alt TES returns HTTP error
BackendTasksRoute->>BackendTasksRoute: Build error_msg and error_info
BackendTasksRoute->>BackendTasksRoute: Create failed_task with state SUBMISSION_FAILED and http_status_code
BackendTasksRoute->>TaskService: add_task(failed_task)
BackendTasksRoute-->>FrontendTasksPage: 400 with error details and task_id
FrontendTasksPage-->>User: Show SUBMISSION_FAILED task in list
else TES accepts task
BackendTasksRoute-->>FrontendTasksPage: 200 with task_id and tes_task_id
loop Background status updates
TaskService->>TaskService: update_task_statuses
TaskService->>TaskService: Skip polling when state in terminal_states including SUBMISSION_FAILED
end
end
end
Flow diagram for backend task submission and SUBMISSION_FAILED handlingflowchart TD
A_start[Start task submission] --> B_build_command[Build executor command from input using shlex]
B_build_command --> C_check_service_info[Check TES service-info endpoints]
C_check_service_info --> D_any_endpoint_ok{Any TES endpoint reachable?}
D_any_endpoint_ok -- No --> E_create_failed_connectivity[Create failed_task with state SUBMISSION_FAILED and connectivity error details]
E_create_failed_connectivity --> F_add_task_connectivity["add_task(failed_task)"]
F_add_task_connectivity --> G_respond_503[Return 503 with error details and task_id]
D_any_endpoint_ok -- Yes --> H_post_task[POST task to TES endpoint]
H_post_task --> I_http_success{HTTP success?}
I_http_success -- No --> J_create_failed_http[Create failed_task with state SUBMISSION_FAILED and HTTP error details]
J_create_failed_http --> K_add_task_http["add_task(failed_task)"]
K_add_task_http --> L_respond_400[Return 400 with error details and task_id]
I_http_success -- Yes --> M_success_path[Return success response with task_id and tes_task_id]
subgraph Background_status_updater
N_update_loop[Loop update_task_statuses] --> O_check_state{Is task.state in terminal_states?}
O_check_state -- Yes (COMPLETE, CANCELED, SYSTEM_ERROR, EXECUTOR_ERROR, PREEMPTED, SUBMISSION_FAILED) --> P_skip_polling[Skip TES status polling]
O_check_state -- No --> Q_poll_TES[Poll TES for status and update task]
end
M_success_path --> N_update_loop
F_add_task_connectivity --> N_update_loop
K_add_task_http --> N_update_loop
File-Level Changes
Possibly linked issues
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- The
failed_taskdict is constructed twice insubmit_taskwith mostly identical fields; consider extracting this into a small helper to avoid duplication and keep the error-handling paths in sync. - The new command parsing logic around
shlex.splitcurrently falls back to a naivesplit()on failure and only prints a warning; you might want to either return a 4xx error for invalid commands or at least log via the app logger instead ofprintso these issues are easier to trace. - For the connectivity-failure and HTTP-error branches in
submit_task, it could be useful to standardize the structure of the JSON error response (e.g., always includetask_id,tes_endpoint, and a clearly namederror_type) to make client-side handling simpler and more consistent.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `failed_task` dict is constructed twice in `submit_task` with mostly identical fields; consider extracting this into a small helper to avoid duplication and keep the error-handling paths in sync.
- The new command parsing logic around `shlex.split` currently falls back to a naive `split()` on failure and only prints a warning; you might want to either return a 4xx error for invalid commands or at least log via the app logger instead of `print` so these issues are easier to trace.
- For the connectivity-failure and HTTP-error branches in `submit_task`, it could be useful to standardize the structure of the JSON error response (e.g., always include `task_id`, `tes_endpoint`, and a clearly named `error_type`) to make client-side handling simpler and more consistent.
## Individual Comments
### Comment 1
<location> `backend/routes/tasks.py:195-204` </location>
<code_context>
+ failed_task = {
</code_context>
<issue_to_address>
**suggestion:** The `failed_task` construction is duplicated and could be centralized to reduce drift between error paths.
This dict is assembled almost identically here and in the later 4xx/5xx submission-error branch. Please extract a small helper (e.g. `build_failed_task(...)`) or a shared template so updates to the schema stay consistent between these two paths.
Suggested implementation:
```python
def build_failed_task(tes_task, task_id='N/A'):
now_iso = datetime.now(timezone.utc).isoformat()
return {
'id': str(uuid.uuid4()),
'task_id': task_id,
'name': tes_task['name'],
'task_name': tes_task['name'],
'description': tes_task['description'],
'state': 'SUBMISSION_FAILED',
'status': 'SUBMISSION_FAILED',
'creation_time': now_iso,
'submitted_at': now_iso,
'start_time': None,
}
if not service_is_reachable:
print(f"All service-info endpoints failed for {tes_name}")
# feat: add failed submission tasks to task management #11
failed_task = build_failed_task(tes_task=tes_task)
```
There is a second branch that handles 4xx/5xx submission errors and constructs a nearly identical `failed_task` dictionary. That block should be updated to call `build_failed_task(...)` instead of inlining the dict. If the 4xx/5xx path adds extra fields (e.g. `error`, `http_status`, etc.), extend `build_failed_task` to accept those as optional parameters and include them in the returned dict so both error paths share a single schema definition.
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Pull request overview
Addresses task submission failure visibility and stability issues by recording failed submissions as tasks and preventing background status updates from breaking task lists.
Changes:
- Frontend: updates task filtering to no longer exclude error states so failed/error tasks remain visible in Task Management.
- Backend: treats
SUBMISSION_FAILEDas terminal to prevent the status updater from querying TES for tasks that never existed on TES. - Backend: improves command parsing (supports quoted strings) and records failed submissions (connectivity + HTTP errors) into the in-memory task list.
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| frontend/src/pages/Tasks.js | Stops filtering out error states so failed/error tasks remain in the task list. |
| backend/services/task_service.py | Adds SUBMISSION_FAILED to terminal states to prevent updater corruption for failed submissions. |
| backend/routes/tasks.py | Parses command strings via shlex.split and appends SUBMISSION_FAILED task records on submission failures. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
backend/routes/tasks.py
Outdated
| 'error_code': connectivity_error_info['error_code'], | ||
| 'reason': connectivity_error_info['reason'], | ||
| 'tes_url': tes_url, | ||
| 'tes_name': tes_name | ||
| 'tes_name': tes_name, | ||
| 'task_id': failed_task['id'] |
There was a problem hiding this comment.
In this error response, the field name task_id is now used to return a locally-generated UUID (the dashboard record id), whereas on successful submissions task_id is the TES-assigned task id. This makes the API response semantics inconsistent for clients. Consider returning a separate field (e.g., dashboard_task_id/local_task_id) and reserving task_id for the TES id (or returning both explicitly).
backend/routes/tasks.py
Outdated
| 'status_code': response.status_code | ||
| 'status_code': response.status_code, | ||
| 'task_id': failed_task['id'] | ||
| }), 400 |
There was a problem hiding this comment.
This branch always returns HTTP 400 even when the TES instance responded with a different status (e.g., 401/403/500). That makes the dashboard API inaccurate and can break client-side handling that depends on status codes. Consider returning the original response.status_code (or mapping to an appropriate upstream code) instead of hardcoding 400.
| }), 400 | |
| }), response.status_code |
backend/routes/tasks.py
Outdated
| # feat: add failed submission tasks to task management #11 | ||
| failed_task = { | ||
| 'id': str(uuid.uuid4()), | ||
| 'task_id': 'N/A', | ||
| 'name': tes_task['name'], |
There was a problem hiding this comment.
The failed_task payload is constructed in multiple places with a largely duplicated dict literal (here and earlier around the service-info reachability failure). This duplication makes it easy for the two branches to drift (missing fields / different timestamps). Consider extracting a small helper/builder for failed-submission task records and reusing it in both branches.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| failed_task = { | ||
| 'id': str(uuid.uuid4()), | ||
| 'task_id': 'N/A', | ||
| 'name': tes_task['name'], | ||
| 'task_name': tes_task['name'], | ||
| 'description': tes_task['description'], | ||
| 'state': 'SUBMISSION_FAILED', |
There was a problem hiding this comment.
build_failed_task hard-codes task_id to 'N/A' while generating a unique UUID in id. The frontend (taskService.listTasks) prefers task.task_id || task.id, so all failed submissions end up with the same displayed id ('N/A'), causing React key collisions and making task details/logs resolve to the wrong failed task. Make the failed submission identifier unique (e.g., set task_id to the generated UUID, or drop task_id and update the frontend mapping), and keep the TES task id as a separate nullable field if needed.
| @@ -1,15 +1,108 @@ | |||
| from flask import Blueprint, jsonify, request | |||
| from flask import Blueprint, jsonify, request, current_app | |||
There was a problem hiding this comment.
Unused import: current_app is imported but never referenced in this module. Please remove it to avoid linting/maintenance noise.
| from flask import Blueprint, jsonify, request, current_app | |
| from flask import Blueprint, jsonify, request |
| failed_task['error_code'] = error_code | ||
| if error_reason: | ||
| failed_task['error_reason'] = error_reason | ||
| if http_status_code: |
There was a problem hiding this comment.
http_status_code is added to the failed task only when truthy (if http_status_code:). For consistency with build_error_response (which checks is not None) and to avoid accidentally dropping valid falsy values, check http_status_code is not None instead.
| if http_status_code: | |
| if http_status_code is not None: |
|
|
||
| def update_task_statuses(): | ||
| terminal_states = ['COMPLETE', 'CANCELED', 'SYSTEM_ERROR', 'EXECUTOR_ERROR', 'PREEMPTED'] | ||
| terminal_states = ['COMPLETE', 'CANCELED', 'SYSTEM_ERROR', 'EXECUTOR_ERROR', 'PREEMPTED', 'SUBMISSION_FAILED'] |
There was a problem hiding this comment.
terminal_states is now defined in update_task_statuses, but a separate hard-coded terminal-state list still exists in update_single_task_status (used for the "Verified task in terminal state" branch). Consider centralizing terminal states in a single constant so these lists don’t drift over time.
|
Copilot comments still to be addressed. |
Issue #11 (SOLVED)
Logged failed submissions to the task list with:
Error message
Error type (DNS, timeout, auth, etc.)
Error code
Error reason
TES instance details
Timestamp
Captured both failure scenarios:
-> Connectivity failures (can't reach the instance)
-> HTTP errors (instance reachable but rejects task: 400, 403, 500, etc.)
issue #12 (SOLVED)
Solution -
Added 'SUBMISSION_FAILED' to the terminal_states list (line 89). This prevents the background task updater from trying to fetch status updates for failed submissions from the TES instance.
Why this matters: Tasks with SUBMISSION_FAILED state were never actually submitted to the TES instance, so when the background updater tried to fetch their status, it would fail or corrupt the task data, causing them to disappear from the list.
Removed the filter that was excluding error states (ERROR, SYSTEM_ERROR, EXECUTOR_ERROR) and the error_prone_instance check (lines 230-238). Now the filter only checks for essential fields (id, tes_url, state).
Implementation Screenshot:
Summary by Sourcery
Handle failed TES task submissions as first-class tasks and ensure error states remain visible in the task list.
New Features:
Bug Fixes: