fix: Rate-limited batches no longer trigger false exhaustion (#160) #162
deucebucket merged 2 commits into develop
Conversation
🔍 Vibe Check Review
Context
This PR introduces a `-1` sentinel return value from `process_queue` to distinguish "rate-limited, skip this batch" from "genuinely nothing to process", so the Layer 4 background worker no longer falsely triggers exhaustion.
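To make the contract concrete, here is a minimal sketch of the sentinel pattern; the inner checks are illustrative stand-ins, not the actual `layer_ai_queue.py` diff:

```python
from typing import Tuple

def process_queue(config: dict, limit: int) -> Tuple[int, int]:
    """Return (processed_count, fixed_count).

    Issue #160: processed == -1 means rate-limited, NOT "nothing to process".
    """
    if config.get("rate_limited"):      # stand-in for the real inner rate check
        return (-1, 0)                  # sentinel: skip this batch, do not strike
    queue = config.get("queue", [])
    if not queue:
        return (0, 0)                   # genuinely nothing to process
    batch = queue[:limit]
    return (len(batch), 0)              # batch processing details elided
```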
Codebase Patterns I Verified
- Dependency injection: `process_all_queue` receives `is_circuit_open` as a `Callable` parameter (line 126) — used correctly in the new code, not a stray import (see the sketch after this list)
- Rate limit handling: the outer loop in `process_all_queue` already handles rate limits at lines 375–382 with exponential backoff; the inner `process_queue` rate check is a second gate for race conditions (intentional by design)
- Sentinel pattern: the rest of the codebase uses `(0, 0)` as the "nothing happened" return — the `-1` sentinel is new and only handled in one of the two callers
- `time` module: already imported at worker.py:9; `time.sleep(30)` is consistent with the existing `time.sleep(10)` pattern at line 429
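A minimal sketch of the injection pattern the first bullet describes; the signature is assumed from this review, not copied from worker.py:

```python
from typing import Callable, Tuple

def process_all_queue(
    config: dict,
    is_circuit_open: Callable[[str], bool],   # injected by app.py, not imported here
) -> Tuple[int, int]:
    # Illustrative body: shows the injection point, not the real outer loop.
    provider = config.get("ai_provider", "gemini")
    if is_circuit_open(provider):
        return (0, 0)                         # breaker open: skip the whole pass
    return (0, 0)

# app.py wires in its own breaker check, keeping worker.py free of the import:
# process_all_queue(config, is_circuit_open=my_breaker_check)
```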
✅ Good
- The root cause analysis is correct — the outer rate-limit check can pass while the inner one fails, and the old `(0, 0)` return from the inner check was silently feeding the 3-strike exhaustion counter
- Comment at lines 22–23 is explicit and future-proof: `# Issue #160: processed == -1 means rate-limited, NOT "nothing to process"`
- Circuit-breaker check is well-constructed: checks the configured provider + bookdb, skips counting toward exhaustion
- `_processing_status` updates on both paths give the user real feedback instead of a silent spin (sketched below)
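A sketch of what the last bullet means in practice; the real `_processing_status` fields are not shown in this PR, so the dict shape below is an assumption:

```python
# Assumed shape of the worker's shared status dict; real field names may differ.
_processing_status = {"state": "idle", "detail": ""}

def report_skip(reason: str) -> None:
    # Update the UI-visible status on both skip paths instead of spinning silently.
    _processing_status["state"] = "waiting"
    _processing_status["detail"] = reason   # e.g. "rate limited" or "circuit open"
```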
🚨 Issues Found
| Severity | Location | Exact Code Quote | Issue | Fix |
|---|---|---|---|---|
| MEDIUM | app.py:7684–7685 | `l2_processed, l2_fixed = process_queue(config, limit)` / `total_processed += l2_processed` | The `-1` sentinel now leaks into `api_process()` (the manual "Process Batch" endpoint). Before this PR, a rate-limited call returned `(0, 0)` — harmless. After this PR it returns `(-1, 0)`, so `total_processed` goes negative, `verified = processed - fixed = -1`, and the JSON response shows `{processed: -1, verified: -1, message: "0 renamed, -1 already correct"}`. | Add a guard: `if l2_processed < 0: l2_processed = 0` before `total_processed += l2_processed`, or check `if l2_processed == -1` and skip adding it (sketched below) |
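A runnable reduction of the suggested guard; the handler body here is hypothetical and only the guard itself is the proposed change:

```python
from typing import Tuple

def process_queue(config: dict, limit: int) -> Tuple[int, int]:
    return (-1, 0)   # stub: pretend the daily rate limit was just hit

def api_process_sketch(config: dict, limit: int) -> dict:
    # Hypothetical reduction of api_process() around app.py:7684-7685.
    total_processed = 0
    l2_processed, l2_fixed = process_queue(config, limit)
    if l2_processed < 0:              # guard the -1 sentinel before summing
        l2_processed = 0
    total_processed += l2_processed
    verified = total_processed - l2_fixed
    return {"processed": total_processed, "verified": verified}

print(api_process_sketch({}, 10))     # {'processed': 0, 'verified': 0}, not -1
```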
Verification chain:
- Claim: `app.py:7684` receives `-1` from `process_queue` when rate-limited → `total_processed` goes negative
- `layer_ai_queue.py:103` now returns `(-1, 0)` ✅ (confirmed in diff)
- `app.py:5850–5879` wrapper passes the return value through unchanged ✅ (read at line 5855)
- `app.py:7684`: `l2_processed, l2_fixed = process_queue(config, limit)` ✅ (read directly)
- `app.py:7685`: `total_processed += l2_processed` — no guard ✅ (read directly)
- `app.py:7697`: `verified = processed - fixed` → returns -1 in the API response ✅ (read at 7697–7726)
- Confidence: 10/10
📋 Scope Verification
| Issue | Problem | Addressed? | Notes |
|---|---|---|---|
| #160 | Rate-limited batches falsely trigger exhaustion | ✅ | Layer 4 background worker loop correctly skips -1 returns and circuit-open states |
Scope Status: SCOPE_OK — the fix directly targets the described problem. No scope creep.
📝 Documentation Check
This is a `fix:` PR — CHANGELOG not required per project convention.
🎯 Verdict
REQUEST_CHANGES
One item to fix before merge:
Guard the `-1` sentinel in `app.py:7684` — the manual process batch endpoint (`POST /api/process` without `all=true`) now returns `{processed: -1, verified: -1}` to the UI when rate-limited. This is a regression from the current behavior (which returns 0). Add `if l2_processed < 0: l2_processed = 0` immediately after line 7684, before the `total_processed +=` line.
The core logic in worker.py is correct and the sentinel approach is clean. This is a one-liner to fix the gap.
`process_queue` now returns `(-1, 0)` when rate-limited instead of `(0, 0)`, so the Layer 4 loop can distinguish "rate limited" from "nothing to process". The worker also checks circuit breaker state before counting empty batches toward the 3-strike exhaustion rule (a sketch of this loop follows the checklist below). Books that are identifiable but temporarily blocked by rate limits will no longer be permanently marked as "all processing layers exhausted".
- Version badge in README matches APP_VERSION
- CHANGELOG entry for rate limit exhaustion fix
- README Recent Changes updated
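A sketch of the Layer 4 loop behavior described above; the structure is inferred from this review, not lifted from worker.py:

```python
import time
from typing import Callable, List, Tuple

def layer4_loop(
    process_queue: Callable[[], Tuple[int, int]],
    is_circuit_open: Callable[[str], bool],
    providers: List[str],
) -> None:
    # 3-strike exhaustion rule with the Issue #160 skip paths.
    empty_batch_count = 0
    while empty_batch_count < 3:
        processed, _fixed = process_queue()
        if processed == -1:                        # rate-limited: wait, no strike
            time.sleep(30)
            continue
        if processed == 0:
            if any(is_circuit_open(p) for p in providers):
                time.sleep(30)                     # breaker open: wait, no strike
                continue
            empty_batch_count += 1                 # genuinely empty batch
        else:
            empty_batch_count = 0                  # progress resets the strikes
    # Only after 3 genuinely empty batches: mark books exhausted (elided).
```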
1b5fb28 to f494cf9
🔍 Vibe Check Review
Context
This PR fixes Issue #160 by distinguishing rate-limited/circuit-broken batches from genuinely unprocessable ones in Layer 4 processing, preventing identifiable books from being permanently marked as "all processing layers exhausted."
Codebase Patterns I Verified
- `is_circuit_open` is a callable parameter injected into `process_all_queue` (worker.py:126) from `app.py` — the new code uses it correctly in scope.
- Circuit breaker keys (`audnexus`, `bookdb`, `openrouter`, `gemini`) confirmed in rate_limiter.py:30–35 — the `ai_provider` config values map cleanly to these keys. Unknown providers (e.g. `ollama`) return `False` safely (`is_circuit_open` does `API_CIRCUIT_BREAKER.get(api_name, {})` → no KeyError; see the sketch after this list).
- Existing double rate-limit check: `worker.py:375` already checks before calling `process_queue`. The inner check at `layer_ai_queue.py:100` is the one that was the false trigger source. The fix addresses the root cause.
- Logging: the `logger = logging.getLogger(__name__)` pattern is used throughout — the new log lines match.
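A sketch of the safe lookup the second bullet verifies. The registry's value shape is an assumption; only the `.get(api_name, {})` fallback mirrors the verified detail:

```python
import time

# Assumed registry shape; only the .get() fallback mirrors rate_limiter.py.
API_CIRCUIT_BREAKER = {
    "audnexus":   {"open_until": 0.0},
    "bookdb":     {"open_until": 0.0},
    "openrouter": {"open_until": 0.0},
    "gemini":     {"open_until": 0.0},
}

def is_circuit_open(api_name: str) -> bool:
    # Unknown providers (e.g. "ollama") fall through to {} -> False, no KeyError.
    state = API_CIRCUIT_BREAKER.get(api_name, {})
    return state.get("open_until", 0.0) > time.time()

assert is_circuit_open("ollama") is False   # unknown provider is safely "closed"
```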
✅ Good
- Clean sentinel design (`-1` vs `0`) — unambiguous and cheap to check.
- Circuit-breaker awareness in the `processed == 0` branch is the right place for it.
- `sleep(30)` on both rate-limit and circuit-break paths is reasonable (avoids thrashing without blocking the thread for full cooldown periods).
- `empty_batch_count` is correctly NOT incremented for either skip path, and NOT reset either — no ghost resets that could mask real exhaustion.
- CHANGELOG and README updated correctly for a `fix:` PR.
🚨 Issues Found
| Severity | Location | Exact Code Quote | Issue | Fix |
|---|---|---|---|---|
| MEDIUM | app.py:7684–7685 | `l2_processed, l2_fixed = process_queue(config, limit)` / `total_processed += l2_processed` | `process_queue` now returns `-1` when rate-limited, but this call site unconditionally adds `l2_processed` to `total_processed`. Result: `processed=-1, verified=-1` in the API response JSON when the daily limit is hit during a manual/limited batch trigger. The old behavior was harmless (`0+0`). This is a concrete regression. | `total_processed += max(0, l2_processed)` (and guard `l2_fixed` similarly), or add a `-1` guard block matching the worker pattern (sketched below) |
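The one-line variant from the Fix column, shown with its effect; the values are illustrative:

```python
l2_processed, l2_fixed = (-1, 0)           # pretend the daily limit was hit
total_processed = 0
total_processed += max(0, l2_processed)    # the -1 sentinel contributes nothing
total_fixed = max(0, l2_fixed)             # same clamp for symmetry
assert total_processed == 0 and total_fixed == 0
```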
📋 Scope Verification
| Issue | Problem | Addressed? | Notes |
|---|---|---|---|
| #160 | Rate-limited/circuit-broken batches falsely trigger 3-strike exhaustion | ✅ | Worker Layer 4 loop fully fixed. -1 sentinel correctly skips exhaustion count. Circuit-breaker check on processed==0 path catches the other false-empty scenario. |
Scope Status: SCOPE_OK — but the fix introduces a regression in the `api_process` non-`process_all` path.
📝 Documentation Check
- CHANGELOG.md: ✅ Updated with clear explanation of the sentinel and behavior change
- README.md: ✅ Version bumped, change noted in Recent Changes
One minor note: the `process_queue` docstring (lines 90–91) still says `Returns: Tuple of (processed_count, fixed_count)` without mentioning the `-1` sentinel. Not blocking, but worth updating; a possible wording is sketched below.
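The phrasing here is a suggestion, not the project's text:

```python
def process_queue(config: dict, limit: int):
    """Process up to `limit` queued books through the AI layer.

    Returns:
        Tuple of (processed_count, fixed_count).
        processed_count == -1 means the batch was skipped because of
        rate limiting (Issue #160), NOT that the queue was empty.
    """
```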
🎯 Verdict
REQUEST_CHANGES
Fix required before merge:
`app.py:7685` — Guard the `total_processed += l2_processed` against `-1`. Simplest fix: `total_processed += max(0, l2_processed)`. Without this, any user who hits their daily AI call limit while manually triggering a limited batch (not `process_all`) will get `{"processed": -1, "verified": -1}` in the API response and a message like `"-1 already correct"` in the UI. The core worker fix is solid; this is just the sentinel leaking to a second caller.
Closes #160
Problem
When AI providers (Gemini, OpenRouter) hit rate limits or circuit breakers during Layer 4 processing, `process_queue` returned `(0, 0)` — the same signal as "nothing to process". The Layer 4 loop counted these toward its 3-strike exhaustion rule, permanently marking identifiable books as "all processing layers exhausted" even though they just needed the rate limit to clear.
Fix
Distinct rate-limit signal: `process_queue` in `layer_ai_queue.py` now returns `(-1, 0)` when rate-limited instead of `(0, 0)`, so the caller can distinguish "rate limited" from "genuinely nothing to process".
Rate-limit handling in Layer 4 loop: when `processed == -1`, the worker waits 30 seconds and retries without incrementing `empty_batch_count`.
Circuit breaker awareness: before incrementing `empty_batch_count`, the worker checks whether any configured AI providers have tripped circuit breakers. If so, it waits for recovery instead of counting toward exhaustion.
Files Changed
- `library_manager/pipeline/layer_ai_queue.py` — return `(-1, 0)` on rate limit
- `library_manager/worker.py` — handle the rate-limit signal and circuit breaker state in the Layer 4 loop