fix: Rate-limited batches no longer trigger false exhaustion (#160) #162
deucebucket merged 2 commits into develop
Conversation
🔍 Vibe Check Review
Context
This PR introduces a `-1` sentinel return value from `process_queue` to distinguish "rate-limited, skip this batch" from "genuinely nothing to process", so the Layer 4 background worker no longer falsely triggers exhaustion.
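To make the contract concrete, here is a minimal sketch of the sentinel pattern; the inner checks are illustrative stand-ins, not the actual `layer_ai_queue.py` diff:

```python
from typing import Tuple

def process_queue(config: dict, limit: int) -> Tuple[int, int]:
    """Return (processed_count, fixed_count).

    Issue #160: processed == -1 means rate-limited, NOT "nothing to process".
    """
    if config.get("rate_limited"):      # stand-in for the real inner rate check
        return (-1, 0)                  # sentinel: skip this batch, do not strike
    queue = config.get("queue", [])
    if not queue:
        return (0, 0)                   # genuinely nothing to process
    batch = queue[:limit]
    return (len(batch), 0)              # batch processing details elided
```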
Codebase Patterns I Verified
- Dependency injection: `process_all_queue` receives `is_circuit_open` as a `Callable` parameter (line 126) — used correctly in the new code, not a stray import (see the sketch after this list)
- Rate limit handling: the outer loop in `process_all_queue` already handles rate limits at lines 375–382 with exponential backoff; the inner `process_queue` rate check is a second gate for race conditions (intentional by design)
- Sentinel pattern: the rest of the codebase uses `(0, 0)` as the "nothing happened" return — the `-1` sentinel is new and only handled in one of the two callers
- `time` module: already imported at worker.py:9; `time.sleep(30)` is consistent with the existing `time.sleep(10)` pattern at line 429
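A minimal sketch of the injection pattern the first bullet describes; the signature is assumed from this review, not copied from worker.py:

```python
from typing import Callable, Tuple

def process_all_queue(
    config: dict,
    is_circuit_open: Callable[[str], bool],   # injected by app.py, not imported here
) -> Tuple[int, int]:
    # Illustrative body: shows the injection point, not the real outer loop.
    provider = config.get("ai_provider", "gemini")
    if is_circuit_open(provider):
        return (0, 0)                         # breaker open: skip the whole pass
    return (0, 0)

# app.py wires in its own breaker check, keeping worker.py free of the import:
# process_all_queue(config, is_circuit_open=my_breaker_check)
```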
✅ Good
- The root cause analysis is correct — the outer rate-limit check can pass while the inner one fails, and the old `(0, 0)` return from the inner check was silently feeding the 3-strike exhaustion counter
- Comment at lines 22–23 is explicit and future-proof: `# Issue #160: processed == -1 means rate-limited, NOT "nothing to process"`
- Circuit-breaker check is well-constructed: checks the configured provider + bookdb, skips counting toward exhaustion
- `_processing_status` updates on both paths give the user real feedback instead of a silent spin (sketched below)
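A sketch of what the last bullet means in practice; the real `_processing_status` fields are not shown in this PR, so the dict shape below is an assumption:

```python
# Assumed shape of the worker's shared status dict; real field names may differ.
_processing_status = {"state": "idle", "detail": ""}

def report_skip(reason: str) -> None:
    # Update the UI-visible status on both skip paths instead of spinning silently.
    _processing_status["state"] = "waiting"
    _processing_status["detail"] = reason   # e.g. "rate limited" or "circuit open"
```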
🚨 Issues Found
| Severity | Location | Exact Code Quote | Issue | Fix |
|---|---|---|---|---|
| MEDIUM | app.py:7684–7685 | `l2_processed, l2_fixed = process_queue(config, limit)` / `total_processed += l2_processed` | The `-1` sentinel now leaks into `api_process()` (the manual "Process Batch" endpoint). Before this PR, a rate-limited call returned `(0, 0)` — harmless. After this PR it returns `(-1, 0)`, so `total_processed` goes negative, `verified = processed - fixed = -1`, and the JSON response shows `{processed: -1, verified: -1, message: "0 renamed, -1 already correct"}`. | Add a guard: `if l2_processed < 0: l2_processed = 0` before `total_processed += l2_processed`, or check `if l2_processed == -1` and skip adding it (sketched below) |
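A runnable reduction of the suggested guard; the handler body here is hypothetical and only the guard itself is the proposed change:

```python
from typing import Tuple

def process_queue(config: dict, limit: int) -> Tuple[int, int]:
    return (-1, 0)   # stub: pretend the daily rate limit was just hit

def api_process_sketch(config: dict, limit: int) -> dict:
    # Hypothetical reduction of api_process() around app.py:7684-7685.
    total_processed = 0
    l2_processed, l2_fixed = process_queue(config, limit)
    if l2_processed < 0:              # guard the -1 sentinel before summing
        l2_processed = 0
    total_processed += l2_processed
    verified = total_processed - l2_fixed
    return {"processed": total_processed, "verified": verified}

print(api_process_sketch({}, 10))     # {'processed': 0, 'verified': 0}, not -1
```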
Verification chain:
- Claim: `app.py:7684` receives `-1` from `process_queue` when rate-limited → `total_processed` goes negative
- `layer_ai_queue.py:103` now returns `(-1, 0)` ✅ (confirmed in diff)
- `app.py:5850–5879` wrapper passes the return value through unchanged ✅ (read at line 5855)
- `app.py:7684`: `l2_processed, l2_fixed = process_queue(config, limit)` ✅ (read directly)
- `app.py:7685`: `total_processed += l2_processed` — no guard ✅ (read directly)
- `app.py:7697`: `verified = processed - fixed` → returns -1 in the API response ✅ (read at 7697–7726)
- Confidence: 10/10
📋 Scope Verification
| Issue | Problem | Addressed? | Notes |
|---|---|---|---|
| #160 | Rate-limited batches falsely trigger exhaustion | ✅ | Layer 4 background worker loop correctly skips -1 returns and circuit-open states |
Scope Status: SCOPE_OK — the fix directly targets the described problem. No scope creep.
📝 Documentation Check
This is a `fix:` PR — CHANGELOG not required per project convention.
🎯 Verdict
REQUEST_CHANGES
One item to fix before merge:
Guard the `-1` sentinel in `app.py:7684` — the manual process batch endpoint (`POST /api/process` without `all=true`) now returns `{processed: -1, verified: -1}` to the UI when rate-limited. This is a regression from the current behavior (which returns 0). Add `if l2_processed < 0: l2_processed = 0` immediately after line 7684, before the `total_processed +=` line.
The core logic in worker.py is correct and the sentinel approach is clean. This is a one-liner to fix the gap.
`process_queue` now returns `(-1, 0)` when rate-limited instead of `(0, 0)`, so the Layer 4 loop can distinguish "rate limited" from "nothing to process". The worker also checks circuit breaker state before counting empty batches toward the 3-strike exhaustion rule (a sketch of this loop follows the checklist below). Books that are identifiable but temporarily blocked by rate limits will no longer be permanently marked as "all processing layers exhausted".
- Version badge in README matches APP_VERSION
- CHANGELOG entry for rate limit exhaustion fix
- README Recent Changes updated
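A sketch of the Layer 4 loop behavior described above; the structure is inferred from this review, not lifted from worker.py:

```python
import time
from typing import Callable, List, Tuple

def layer4_loop(
    process_queue: Callable[[], Tuple[int, int]],
    is_circuit_open: Callable[[str], bool],
    providers: List[str],
) -> None:
    # 3-strike exhaustion rule with the Issue #160 skip paths.
    empty_batch_count = 0
    while empty_batch_count < 3:
        processed, _fixed = process_queue()
        if processed == -1:                        # rate-limited: wait, no strike
            time.sleep(30)
            continue
        if processed == 0:
            if any(is_circuit_open(p) for p in providers):
                time.sleep(30)                     # breaker open: wait, no strike
                continue
            empty_batch_count += 1                 # genuinely empty batch
        else:
            empty_batch_count = 0                  # progress resets the strikes
    # Only after 3 genuinely empty batches: mark books exhausted (elided).
```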
1b5fb28 to f494cf9
🔍 Vibe Check Review
Context
This PR fixes Issue #160 by distinguishing rate-limited/circuit-broken batches from genuinely unprocessable ones in Layer 4 processing, preventing identifiable books from being permanently marked as "all processing layers exhausted."
Codebase Patterns I Verified
- `is_circuit_open` is a callable parameter injected into `process_all_queue` (worker.py:126) from `app.py` — the new code uses it correctly in scope.
- Circuit breaker keys (`audnexus`, `bookdb`, `openrouter`, `gemini`) confirmed in rate_limiter.py:30–35 — the `ai_provider` config values map cleanly to these keys. Unknown providers (e.g. `ollama`) return `False` safely (`is_circuit_open` does `API_CIRCUIT_BREAKER.get(api_name, {})` → no KeyError; see the sketch after this list).
- Existing double rate-limit check: `worker.py:375` already checks before calling `process_queue`. The inner check at `layer_ai_queue.py:100` is the one that was the false trigger source. The fix addresses the root cause.
- Logging: the `logger = logging.getLogger(__name__)` pattern is used throughout — the new log lines match.
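A sketch of the safe lookup the second bullet verifies. The registry's value shape is an assumption; only the `.get(api_name, {})` fallback mirrors the verified detail:

```python
import time

# Assumed registry shape; only the .get() fallback mirrors rate_limiter.py.
API_CIRCUIT_BREAKER = {
    "audnexus":   {"open_until": 0.0},
    "bookdb":     {"open_until": 0.0},
    "openrouter": {"open_until": 0.0},
    "gemini":     {"open_until": 0.0},
}

def is_circuit_open(api_name: str) -> bool:
    # Unknown providers (e.g. "ollama") fall through to {} -> False, no KeyError.
    state = API_CIRCUIT_BREAKER.get(api_name, {})
    return state.get("open_until", 0.0) > time.time()

assert is_circuit_open("ollama") is False   # unknown provider is safely "closed"
```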
✅ Good
- Clean sentinel design (`-1` vs `0`) — unambiguous and cheap to check.
- Circuit-breaker awareness in the `processed == 0` branch is the right place for it.
- `sleep(30)` on both rate-limit and circuit-break paths is reasonable (avoids thrashing without blocking the thread for full cooldown periods).
- `empty_batch_count` is correctly NOT incremented for either skip path, and NOT reset either — no ghost resets that could mask real exhaustion.
- CHANGELOG and README updated correctly for a `fix:` PR.
🚨 Issues Found
| Severity | Location | Exact Code Quote | Issue | Fix |
|---|---|---|---|---|
| MEDIUM | app.py:7684–7685 | `l2_processed, l2_fixed = process_queue(config, limit)` / `total_processed += l2_processed` | `process_queue` now returns `-1` when rate-limited, but this call site unconditionally adds `l2_processed` to `total_processed`. Result: `processed=-1, verified=-1` in the API response JSON when the daily limit is hit during a manual/limited batch trigger. The old behavior was harmless (`0+0`). This is a concrete regression. | `total_processed += max(0, l2_processed)` (and guard `l2_fixed` similarly), or add a `-1` guard block matching the worker pattern (sketched below) |
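The one-line variant from the Fix column, shown with its effect; the values are illustrative:

```python
l2_processed, l2_fixed = (-1, 0)           # pretend the daily limit was hit
total_processed = 0
total_processed += max(0, l2_processed)    # the -1 sentinel contributes nothing
total_fixed = max(0, l2_fixed)             # same clamp for symmetry
assert total_processed == 0 and total_fixed == 0
```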
📋 Scope Verification
| Issue | Problem | Addressed? | Notes |
|---|---|---|---|
| #160 | Rate-limited/circuit-broken batches falsely trigger 3-strike exhaustion | ✅ | Worker Layer 4 loop fully fixed. -1 sentinel correctly skips exhaustion count. Circuit-breaker check on processed==0 path catches the other false-empty scenario. |
Scope Status: SCOPE_OK — but the fix introduces a regression in the `api_process` non-`process_all` path.
📝 Documentation Check
- CHANGELOG.md: ✅ Updated with clear explanation of the sentinel and behavior change
- README.md: ✅ Version bumped, change noted in Recent Changes
One minor note: the `process_queue` docstring (lines 90–91) still says `Returns: Tuple of (processed_count, fixed_count)` without mentioning the `-1` sentinel. Not blocking, but worth updating; a possible wording is sketched below.
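The phrasing here is a suggestion, not the project's text:

```python
def process_queue(config: dict, limit: int):
    """Process up to `limit` queued books through the AI layer.

    Returns:
        Tuple of (processed_count, fixed_count).
        processed_count == -1 means the batch was skipped because of
        rate limiting (Issue #160), NOT that the queue was empty.
    """
```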
🎯 Verdict
REQUEST_CHANGES
Fix required before merge:
`app.py:7685` — Guard the `total_processed += l2_processed` against `-1`. Simplest fix: `total_processed += max(0, l2_processed)`. Without this, any user who hits their daily AI call limit while manually triggering a limited batch (not `process_all`) will get `{"processed": -1, "verified": -1}` in the API response and a message like `"-1 already correct"` in the UI. The core worker fix is solid; this is just the sentinel leaking to a second caller.
Closes #160
Problem
When AI providers (Gemini, OpenRouter) hit rate limits or circuit breakers during Layer 4 processing, `process_queue` returned `(0, 0)` — the same signal as "nothing to process". The Layer 4 loop counted these toward its 3-strike exhaustion rule, permanently marking identifiable books as "all processing layers exhausted" even though they just needed the rate limit to clear.
Fix
Distinct rate-limit signal: `process_queue` in `layer_ai_queue.py` now returns `(-1, 0)` when rate-limited instead of `(0, 0)`, so the caller can distinguish "rate limited" from "genuinely nothing to process".
Rate-limit handling in Layer 4 loop: when `processed == -1`, the worker waits 30 seconds and retries without incrementing `empty_batch_count`.
Circuit breaker awareness: before incrementing `empty_batch_count`, the worker checks whether any configured AI providers have tripped circuit breakers. If so, it waits for recovery instead of counting toward exhaustion.
Files Changed
- `library_manager/pipeline/layer_ai_queue.py` — return `(-1, 0)` on rate limit
- `library_manager/worker.py` — handle the rate-limit signal and circuit breaker state in the Layer 4 loop