Why AI Agents Loop (And How to Stop Them) #58

bmdhodl · 2026-02-08T16:53:17Z

bmdhodl
Feb 8, 2026
Maintainer

Every team building with AI agents hits the same wall. Your agent works fine in testing, you deploy it, and somewhere around 2 AM it burns through your API budget calling the same tool 200 times in a row.

This isn't a rare bug. It's the default failure mode of any agentic system without runtime guardrails.

The Problem

AI agents enter infinite tool-call loops. Not occasionally — routinely.

Real examples from the wild:

A RAG agent calling search() 200+ times with the same query because the results never quite satisfied the model's criteria
A draw.io diagramming agent that made 3,728 API requests in 31 minutes, burning $10 before anyone noticed
Discord bots flooding channels with duplicate error messages as the model kept retrying the same failed action
Code generation agents stuck in fix-test-fail-fix cycles, rewriting the same function over and over with identical bugs

These aren't edge cases from toy demos. They're production incidents from teams running agents with real users and real money on the line.

Why It Happens

Three root causes explain nearly every agent loop:

1. Models ignore prompt-level stop instructions

You can write "NEVER call search more than 3 times" in your system prompt. The model will comply most of the time. But LLMs are probabilistic — they don't execute instructions, they predict tokens. Under the right (wrong) conditions, the model will confidently ignore your instruction and keep going.

Prompt-level guardrails are suggestions, not constraints.

2. Unsatisfying tool results trigger infinite retries

When a tool returns results that don't match what the model expects, many agents will retry with the same or nearly identical arguments. The model "thinks" it needs to try again, but the tool output is deterministic — same input, same output, forever.

This is especially common with search and retrieval tools where the model has a specific answer in mind and the corpus doesn't contain it.

3. Multi-agent systems cascade failures

Agent A asks Agent B for data. Agent B fails and returns an error. Agent A retries. Agent B fails again. Now multiply this across a graph of 5-10 agents, each with their own retry logic, and you get exponential failure cascading.

One stuck agent can drag the entire system into a loop.

Why `max_iterations` Isn't Enough

LangChain's max_iterations parameter is the most common "fix" people reach for. Set it to 25 and the agent stops after 25 steps. Problem solved?

No. max_iterations is a blunt instrument. It caps total steps regardless of whether they're productive.

A 50-step agent run might be perfectly fine if each step is unique and making progress
A 3-step run is broken if it's the same tool call repeated 3 times

You need to detect the pattern, not just count steps. An agent that calls search("python async") then read_file("main.py") then write_file("main.py") over 50 steps is working. An agent that calls search("python async") three times in a row is stuck.

max_iterations can't tell the difference.

Worse, setting it too low kills legitimate long-running workflows. Setting it too high means you're still burning tokens on loops before the cap kicks in. There's no good number because the right limit depends on what the agent is doing, not how many steps it's taken.

Runtime Guards: The Right Approach

The fix is to check before each tool execution and raise an exception to forcibly break the loop when a bad pattern is detected. Not after the fact in logs. Not via prompt instructions the model might ignore. At runtime, in code, with real enforcement.

Three guard types cover the majority of failure modes:

LoopGuard — Detects identical or near-identical tool calls within a sliding window. If the same function is called with the same arguments N times in the last M calls, something is wrong.

BudgetGuard — Enforces hard limits on token consumption, API call count, or dollar cost. When the budget is spent, the agent stops. No exceptions.

TimeoutGuard — Wall-clock time limits. If an agent run exceeds N seconds, it's terminated. Catches slow-burn loops that stay under call-count limits by spacing out requests.

from agentguard import LoopGuard, BudgetGuard, TimeoutGuard

loop_guard = LoopGuard(max_repeats=3, window=6)
budget_guard = BudgetGuard(max_cost_usd=5.00)
timeout_guard = TimeoutGuard(max_seconds=120)

timeout_guard.start()

# Before each tool call:
loop_guard.check(tool_name="search", tool_args={"query": q})
budget_guard.consume(cost_usd=0.12)
timeout_guard.check()

LoopGuard.check() raises LoopDetected if it sees 3 identical search calls in the last 6 tool invocations. BudgetGuard.consume() raises BudgetExceeded when cumulative cost crosses $5.00. TimeoutGuard.check() raises TimeoutExceeded after 120 seconds.

These are real Python exceptions. They propagate up the call stack and stop execution immediately. The model doesn't get a chance to "decide" whether to keep going — the runtime decides for it.

This is the key insight: guardrails must operate at the runtime level, not the prompt level. You can't ask a stuck model to unstick itself. You have to forcibly intervene.

LangChain Integration

If you're using LangChain, you don't need to wire guards into every tool call manually. A single callback handler does it:

from agentguard.integrations.langchain import AgentGuardCallbackHandler
from agentguard import LoopGuard, BudgetGuard

handler = AgentGuardCallbackHandler(
    loop_guard=LoopGuard(max_repeats=3),
    budget_guard=BudgetGuard(max_cost_usd=5.00),
)
llm = ChatOpenAI(callbacks=[handler])

The handler hooks into on_tool_start and runs guard checks before every tool execution. If a guard trips, the exception propagates and the agent run terminates cleanly. Your existing agent code doesn't change — you just add the callback.

Try It

pip install agentguard47

Zero dependencies. Python 3.9+. MIT licensed.

The full source is in this repo under sdk/agentguard/. The guards are in guards.py, the LangChain integration is in integrations/langchain.py, and there's a working demo in examples/demo_agent.py.

If you've dealt with agent loops in production, I'd like to hear about it — what patterns you saw, what worked, what didn't. Drop a comment below.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Why AI Agents Loop (And How to Stop Them) #58

Uh oh!

{{title}}

Uh oh!

Replies: 0 comments

Select a reply

Uh oh!

Uh oh!

Why AI Agents Loop (And How to Stop Them) #58

Uh oh!

bmdhodl Feb 8, 2026 Maintainer

The Problem

Why It Happens

1. Models ignore prompt-level stop instructions

2. Unsatisfying tool results trigger infinite retries

3. Multi-agent systems cascade failures

Why max_iterations Isn't Enough

Runtime Guards: The Right Approach

LangChain Integration

Try It

Replies: 0 comments

bmdhodl
Feb 8, 2026
Maintainer

Why `max_iterations` Isn't Enough