fix: Rate-limited batches no longer trigger false exhaustion (#160) #162

Merged

deucebucket merged 2 commits into develop from fix/issue-160-rate-limit-exhaustion on Feb 19, 2026

Conversation

@deucebucket (Owner)

Closes #160

Problem

When AI providers (Gemini, OpenRouter) hit rate limits or circuit breakers during Layer 4 processing, process_queue returned (0, 0), the same signal as "nothing to process". The Layer 4 loop counted these toward its 3-strike exhaustion rule, permanently marking identifiable books as "all processing layers exhausted" even though they just needed the rate limit to clear.

Fix

  1. Distinct rate-limit signal: process_queue in layer_ai_queue.py now returns (-1, 0) when rate-limited instead of (0, 0), so the caller can distinguish "rate limited" from "genuinely nothing to process".

  2. Rate-limit handling in Layer 4 loop: When processed == -1, the worker waits 30 seconds and retries without incrementing empty_batch_count.

  3. Circuit breaker awareness: Before incrementing empty_batch_count, the worker checks whether any configured AI provider has a tripped circuit breaker. If so, it waits for recovery instead of counting toward exhaustion. (A condensed sketch of all three changes follows this list.)
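
Roughly, the three changes fit together like this. Only process_queue, is_circuit_open, and the (-1, 0) sentinel are real names from the diff; the loop structure, stub data, and names like configured_providers and EXHAUSTION_STRIKES are illustrative simplifications of worker.py, not the actual code:

```python
import time

# Stubs standing in for the real worker.py / app.py context.
_fake_batches = iter([(-1, 0), (0, 0), (0, 0), (0, 0)])

def process_queue(config, limit):
    # Real signature per the PR: returns (processed_count, fixed_count),
    # with processed_count == -1 meaning "rate-limited".
    return next(_fake_batches, (0, 0))

def is_circuit_open(provider):
    return False  # pretend no breaker is tripped

config, limit = {}, 10
configured_providers = ["gemini", "openrouter"]
EXHAUSTION_STRIKES = 3  # illustrative name for the 3-strike rule

empty_batch_count = 0
while empty_batch_count < EXHAUSTION_STRIKES:
    processed, _fixed = process_queue(config, limit)

    if processed == -1:
        # Issue #160: rate-limited, NOT "nothing to process";
        # wait out the limit without touching empty_batch_count.
        time.sleep(30)
        continue

    if processed == 0:
        if any(is_circuit_open(p) for p in configured_providers):
            # A breaker is open: wait for recovery instead of
            # counting this batch toward exhaustion.
            time.sleep(30)
            continue
        empty_batch_count += 1  # genuinely empty batch

print("exhausted after", empty_batch_count, "truly empty batches")
```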

Files Changed

  • library_manager/pipeline/layer_ai_queue.py - Return (-1, 0) on rate limit
  • library_manager/worker.py - Handle rate-limit signal and circuit breaker state in Layer 4 loop


@bucket-agent (bot) left a comment


🔍 Vibe Check Review

Context

This PR introduces a -1 sentinel return value from process_queue to distinguish "rate-limited, skip this batch" from "genuinely nothing to process", preventing the Layer 4 background worker from falsely triggering its exhaustion rule.

Codebase Patterns I Verified

  • Dependency injection: process_all_queue receives is_circuit_open as a Callable parameter (line 126) — used correctly in the new code, not a stray import
  • Rate limit handling: The outer loop in process_all_queue already handles rate limits at lines 375–382 with exponential backoff; the inner process_queue rate check is a second gate for race conditions (this is intentional)
  • Sentinel pattern: The rest of the codebase uses (0, 0) as the "nothing happened" return — the -1 sentinel is new and only handled in one of the two callers
  • time module: Already imported at worker.py:9; time.sleep(30) is consistent with the existing time.sleep(10) pattern at line 429

✅ Good

  • The root cause analysis is correct — the outer rate-limit check can pass while the inner one fails, and the old (0, 0) return from the inner check was silently feeding the 3-strike exhaustion counter
  • The comment at lines 22–23 is explicit and future-proof: # Issue #160: processed == -1 means rate-limited, NOT "nothing to process"
  • Circuit-breaker check is well-constructed: checks the configured provider + bookdb, skips counting toward exhaustion
  • _processing_status updates on both paths give the user real feedback instead of a silent spin

🚨 Issues Found

  • Severity: MEDIUM
  • Location: app.py:7684–7685
  • Exact code quote: l2_processed, l2_fixed = process_queue(config, limit) / total_processed += l2_processed
  • Issue: The -1 sentinel now leaks into api_process() (the manual "Process Batch" endpoint). Before this PR, rate-limited returned (0, 0), which was harmless. After this PR, it returns (-1, 0), so total_processed goes negative, verified = processed - fixed = -1, and the JSON response shows {processed: -1, verified: -1, message: "0 renamed, -1 already correct"}.
  • Fix: Add a guard, if l2_processed < 0: l2_processed = 0, before total_processed += l2_processed, or check for l2_processed == -1 and skip adding it (see the sketch after the verification chain below).

Verification chain:

  • Claim: app.py:7684 receives -1 from process_queue when rate-limited → total_processed goes negative
  • layer_ai_queue.py:103 now returns (-1, 0) ✅ (confirmed in diff)
  • app.py:5850–5879 wrapper passes return value through unchanged ✅ (read at line 5855)
  • app.py:7684: l2_processed, l2_fixed = process_queue(config, limit) ✅ (read directly)
  • app.py:7685: total_processed += l2_processed — no guard ✅ (read directly)
  • app.py:7697: verified = processed - fixed → returns -1 in API response ✅ (read at 7697–7726)
  • Confidence: 10/10
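
A minimal runnable version of the suggested guard, with process_queue stubbed to simulate a rate-limited batch. The variable names come from the quoted code at app.py:7684–7685; the stub and surrounding scaffolding are illustrative:

```python
def process_queue(config, limit):
    # Stub simulating a rate-limited batch; the real function lives in
    # layer_ai_queue.py and now returns (-1, 0) in this situation.
    return (-1, 0)

config, limit, total_processed = {}, 10, 0

l2_processed, l2_fixed = process_queue(config, limit)
if l2_processed < 0:
    # Rate-limit sentinel: clamp so the manual "Process Batch" endpoint
    # reports 0 instead of leaking -1 into the JSON response.
    l2_processed = 0
total_processed += l2_processed

# Equivalent one-liner instead of the if-block:
#   total_processed += max(0, l2_processed)

assert total_processed == 0  # -1 no longer reaches the API response
```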

📋 Scope Verification

  • Issue: #160 (rate-limited batches falsely trigger exhaustion)
  • Addressed? Yes: the Layer 4 background worker loop correctly skips -1 returns and circuit-open states.

Scope Status: SCOPE_OK — the fix directly targets the described problem. No scope creep.

📝 Documentation Check

This is a fix: PR — CHANGELOG not required per project convention.

🎯 Verdict

REQUEST_CHANGES

One item to fix before merge:

Guard the -1 sentinel in app.py:7684 — the manual process batch endpoint (POST /api/process without all=true) now returns {processed: -1, verified: -1} to the UI when rate-limited. This is a regression from the current behavior (which returns 0). Add if l2_processed < 0: l2_processed = 0 immediately after line 7684, before the total_processed += line.

The core logic in worker.py is correct and the sentinel approach is clean. This is a one-liner to fix the gap.

process_queue now returns (-1, 0) when rate-limited instead of (0, 0),
so the Layer 4 loop can distinguish "rate limited" from "nothing to
process". The worker also checks circuit breaker state before counting
empty batches toward the 3-strike exhaustion rule. Books that are
identifiable but temporarily blocked by rate limits will no longer be
permanently marked as "all processing layers exhausted".
- Version badge in README matches APP_VERSION
- CHANGELOG entry for rate limit exhaustion fix
- README Recent Changes updated
@deucebucket force-pushed the fix/issue-160-rate-limit-exhaustion branch from 1b5fb28 to f494cf9 on February 19, 2026 00:28

@bucket-agent (bot) left a comment


🔍 Vibe Check Review

Context

This PR fixes Issue #160 by distinguishing rate-limited/circuit-broken batches from genuinely unprocessable ones in Layer 4 processing, preventing identifiable books from being permanently marked as "all processing layers exhausted."

Codebase Patterns I Verified

  • is_circuit_open is a callable parameter injected into process_all_queue (worker.py:126) from app.py — the new code uses it correctly in scope.
  • Circuit breaker keys (audnexus, bookdb, openrouter, gemini) confirmed in rate_limiter.py:30-35 — the ai_provider config values map cleanly to these keys. Unknown providers (e.g. ollama) return False safely (is_circuit_open does API_CIRCUIT_BREAKER.get(api_name, {}) → no KeyError; see the sketch after this list).
  • Existing double rate-limit check: worker.py:375 already checks before calling process_queue. The inner check at layer_ai_queue.py:100 is the one that was the false trigger source. Fix addresses the root cause.
  • Logging: logger = logging.getLogger(__name__) pattern used throughout — new log lines match.
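
A tiny illustration of that fail-safe lookup. The dictionary shape here is assumed from the description above, not copied from rate_limiter.py:

```python
# Assumed shape of API_CIRCUIT_BREAKER, based on the review notes;
# the real rate_limiter.py structure may differ.
API_CIRCUIT_BREAKER = {
    "audnexus":   {"open": False},
    "bookdb":     {"open": False},
    "openrouter": {"open": False},
    "gemini":     {"open": True},   # pretend this breaker has tripped
}

def is_circuit_open(api_name):
    # .get(api_name, {}) turns an unknown provider into an empty dict,
    # and {}.get("open", False) is False: no KeyError, fails safe.
    return API_CIRCUIT_BREAKER.get(api_name, {}).get("open", False)

assert is_circuit_open("gemini") is True
assert is_circuit_open("ollama") is False  # unknown provider, safe default
```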

✅ Good

  • Clean sentinel design (-1 vs 0) — unambiguous and cheap to check.
  • Circuit-breaker awareness in the processed == 0 branch is the right place for it.
  • sleep(30) on both rate-limit and circuit-break paths is reasonable (avoids thrashing without blocking the thread for full cooldown periods).
  • empty_batch_count is correctly NOT incremented for both skip paths, and NOT reset either — no ghost resets that could mask real exhaustion.
  • CHANGELOG and README updated correctly for a fix: PR.

🚨 Issues Found

  • Severity: MEDIUM
  • Location: app.py:7684–7685
  • Exact code quote: l2_processed, l2_fixed = process_queue(config, limit) / total_processed += l2_processed
  • Issue: process_queue now returns -1 when rate-limited, but this call site unconditionally adds l2_processed to total_processed. Result: processed=-1, verified=-1 in the API response JSON when the daily limit is hit during a manual/limited batch trigger. The old behavior was harmless (0+0). This is a concrete regression.
  • Fix: total_processed += max(0, l2_processed) (and guard l2_fixed similarly), or add a -1 guard block matching the worker pattern.

📋 Scope Verification

  • Issue: #160 (rate-limited/circuit-broken batches falsely trigger 3-strike exhaustion)
  • Addressed? Yes: the worker Layer 4 loop is fully fixed. The -1 sentinel correctly skips the exhaustion count, and the circuit-breaker check on the processed == 0 path catches the other false-empty scenario.

Scope Status: SCOPE_OK — but the fix introduces a regression in the api_process non-process_all path.

📝 Documentation Check

  • CHANGELOG.md: ✅ Updated with clear explanation of the sentinel and behavior change
  • README.md: ✅ Version bumped, change noted in Recent Changes

One minor note: the process_queue docstring (lines 90–91) still says Returns: Tuple of (processed_count, fixed_count) without mentioning the -1 sentinel. Not blocking, but worth updating.
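
A docstring along these lines would close that gap (suggested wording only, not part of the actual diff):

```python
def process_queue(config, limit):
    """Process one batch of queued books through the AI layer.

    Returns:
        Tuple of (processed_count, fixed_count). processed_count is -1
        when the batch was skipped due to a rate limit or open circuit
        breaker (Issue #160); callers must not treat -1 as "nothing to
        process".
    """
    ...
```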

🎯 Verdict

REQUEST_CHANGES

Fix required before merge:

  1. app.py:7685 — Guard the total_processed += l2_processed against -1. Simplest fix: total_processed += max(0, l2_processed). Without this, any user who hits their daily AI call limit while manually triggering a limited batch (not process_all) will get {"processed": -1, "verified": -1} in the API response and a message like "-1 already correct" in the UI. The core worker fix is solid; this is just the sentinel leaking to a second caller.

@deucebucket merged commit 827398c into develop on Feb 19, 2026
3 checks passed