Context
Found during E2E testing of RolnickLab/antenna#1197 + #134.
Issues
1. psv2_integration_test.sh populate step fails intermittently
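The api_post_empty call to /captures/collections/{id}/populate/ sometimes fails under curl -sf (exit code 22), even though the endpoint returns 200 when called manually moments later. Because the script runs with set -euo pipefail, that single failure aborts the entire script.
Likely cause: a race condition between collection creation and populate, or a transient server-side error that curl -sf turns into a hard failure. With -f, exit code 22 means the server answered with an HTTP status of 400 or above, so the request reached the API and was rejected rather than the connection being dropped.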
Suggested fix: add a short retry loop around the populate call, or replace curl -sf with a function that retries on transient failures.
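A minimal sketch of the retry idea, assuming the existing api_post_empty helper takes the endpoint path as its only argument (its real signature is not shown here). The retry function name, the attempt count, and the delay are all hypothetical choices, not something the script already defines.

```shell
# Hypothetical helper: run a command up to <attempts> times, sleeping
# <delay> seconds between failed attempts. The command is the condition of
# `until`, so a failing attempt does not trigger `set -e`.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local n=1
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "retry: giving up after $n attempts: $*" >&2
      return 1
    fi
    echo "retry: attempt $n failed, retrying in ${delay}s" >&2
    n=$((n + 1))
    sleep "$delay"
  done
  return 0
}

# Assumed usage in the test script (COLLECTION_ID is a placeholder name):
#   retry 3 2 api_post_empty "/captures/collections/${COLLECTION_ID}/populate/"
```

If the root cause turns out to be the race between collection creation and populate, a fixed delay between attempts should be enough. If the final attempt still fails, retry returns non-zero and the script aborts as before.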
2. Worker "Done" summary lines missing from log files
When the worker spawns per-GPU subprocesses (Found 2 GPUs, spawning one AMI worker instance per GPU), the batch completion summary lines (Done, detections: N. Detecting time: ...) only appear in the subprocess that processed the batch. When redirecting to a log file, these lines sometimes don't appear because the parent process's log stream captures only its own output.
The psv2_integration_test.sh script greps for Done, detections: in the worker log to show timing, but this is unreliable with multi-GPU workers.
3. Worker log analysis section in test script sometimes skipped
The integration test script uses set -euo pipefail. If grep finds no matches (e.g., no errors in logs), it returns exit code 1, which causes the script to exit before printing the final PASS/FAIL verdict. This makes clean runs report as failures.
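One possible guard, sketched under the assumption that the log analysis only needs match counts. The count_matches name is hypothetical. It treats grep's exit code 1 (no matches) as a normal result and still propagates real errors such as a missing log file (exit code 2).

```shell
# Hypothetical helper: print the number of lines in <file> matching <pattern>.
# grep -c exits 1 when there are no matches; capture that status instead of
# letting `set -e` abort the script, and only fail on genuine grep errors.
count_matches() {
  local pattern="$1" file="$2" count status=0
  count=$(grep -c -- "$pattern" "$file") || status=$?
  if [ "$status" -gt 1 ]; then
    return "$status"
  fi
  printf '%s\n' "$count"
}

# Assumed usage (WORKER_LOG is a placeholder name):
#   errors=$(count_matches "ERROR" "$WORKER_LOG")
#   echo "Errors in worker log: $errors"
```

Where the matching lines themselves are printed rather than counted, appending || true to the grep command has a similar effect, at the cost of also hiding real grep errors.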
4. POST URLs require trailing slash — easy to miss
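With APPEND_SLASH enabled, Django silently redirects GET requests that lack a trailing slash, but a POST without one fails with a 500. This is consistent with the RuntimeError that Django's CommonMiddleware raises when DEBUG is on, because redirecting to the slash URL would drop the POST data. This caused the first E2E test failure. The error message is clear in the Django logs, but the worker only sees a generic 500.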
Suggestion: document this in the ADC's CLAUDE.md or add a note in the Antenna API docs. Alternatively, the ADC HTTP client could normalize URLs to always include a trailing slash.
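The normalization rule can be sketched as follows. This is written in shell, as the test script's own request helpers might apply it. The ADC HTTP client is Python and would need the equivalent logic there. The ensure_trailing_slash name is hypothetical, and the sketch assumes URLs carry at most a query string (no fragment).

```shell
# Hypothetical helper: make sure the path part of a URL ends in "/",
# leaving any query string untouched.
ensure_trailing_slash() {
  local url="$1" path query=""
  case "$url" in
    *\?*)
      path="${url%%\?*}"     # everything before the first "?"
      query="?${url#*\?}"    # the first "?" and everything after it
      ;;
    *)
      path="$url"
      ;;
  esac
  case "$path" in
    */) ;;                   # already ends in a slash
    *)  path="$path/" ;;
  esac
  printf '%s%s\n' "$path" "$query"
}
```

Applying this once inside the shared request helper would keep individual call sites from having to remember the slash.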
Affected files
- scripts/psv2_integration_test.sh (Antenna repo)
- trapdata/antenna/datasets.py (trailing slash)
- Worker subprocess logging infrastructure