Budget Enforcement Patterns for OpenAI API Calls #108

bmdhodl · 2026-02-09T17:29:25Z

bmdhodl
Feb 9, 2026
Maintainer

One of the most common questions I see in agent repos: "How do I stop my agent from spending more than $X?"

OpenAI's API has no per-request or per-agent dollar cap. You can set spend limits at the account level, but that's a blunt instrument: it stops everything on the account, not just the runaway agent. And by the time the limit kicks in, you've already spent the money.

Here are three patterns for enforcing budgets at the agent level.

Pattern 1: Hard dollar cap per run

Stop execution the moment estimated cost exceeds a threshold:

from agentguard import BudgetGuard, BudgetExceeded

guard = BudgetGuard(max_cost_usd=2.00)

# After each LLM call, feed in the token count and its estimated cost:
guard.consume(tokens=1500, cost_usd=0.045)

# When cumulative cost exceeds $2.00, raises BudgetExceeded

This is the simplest pattern. Set a dollar amount, call consume() after each LLM call, and catch BudgetExceeded to stop the run.
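The cost_usd value passed to consume() has to come from somewhere. Here is a minimal sketch of deriving it from token counts, assuming prices are quoted per 1,000 tokens. The helper name estimate_cost_usd and the prices are placeholders of mine, not part of AgentGuard; the input rate is chosen so that 1,500 prompt tokens gives the $0.045 used in the example above.

```python
def estimate_cost_usd(prompt_tokens, completion_tokens,
                      input_price_per_1k, output_price_per_1k):
    """Estimate the dollar cost of one call from its token usage.

    Hypothetical helper: prices are assumed to be USD per 1,000 tokens.
    """
    input_cost = prompt_tokens / 1000 * input_price_per_1k
    output_cost = completion_tokens / 1000 * output_price_per_1k
    return input_cost + output_cost


# Placeholder prices; substitute the current published rates for your model.
cost = estimate_cost_usd(
    prompt_tokens=1500,
    completion_tokens=0,
    input_price_per_1k=0.03,
    output_price_per_1k=0.06,
)
print(f"Estimated cost: ${cost:.3f}")
```

In a real loop, the token counts would come from the usage data on each API response, and the result would be passed on as cost_usd.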

Pattern 2: Warning before the limit

Get a heads-up at 80% of the budget so you can gracefully wrap up:

from agentguard import BudgetGuard, BudgetExceeded

wrapping_up = False

def on_budget_warning(msg):
    global wrapping_up
    wrapping_up = True
    print(f"Budget warning: {msg} — finishing current task, no new tool calls")

guard = BudgetGuard(
    max_cost_usd=5.00,
    warn_at_pct=0.8,
    on_warning=on_budget_warning,
)

# In your agent loop:
for step in range(100):
    try:
        guard.consume(cost_usd=0.12)
    except BudgetExceeded:
        print("Hard stop — budget exceeded")
        break

    if wrapping_up:
        # Finish current task but don't start new tool calls
        break

The on_warning callback fires at 80%. BudgetExceeded is the hard stop. This gives you a two-phase shutdown.
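To make the two-phase behavior concrete, here is a self-contained sketch of the idea. It is not AgentGuard's implementation: the class TwoPhaseBudget and the exception BudgetExceededError are hypothetical stand-ins, and the exact comparison rules (warn once when spend reaches the threshold, raise when spend goes strictly over the limit) are assumptions.

```python
class BudgetExceededError(Exception):
    """Raised when cumulative spend goes over the hard limit."""


class TwoPhaseBudget:
    """Hypothetical stand-in: warn once at a threshold, then hard-stop."""

    def __init__(self, max_cost_usd, warn_at_pct=0.8, on_warning=None):
        self.max_cost_usd = max_cost_usd
        self.warn_threshold = max_cost_usd * warn_at_pct
        self.on_warning = on_warning
        self.spent_usd = 0.0
        self._warned = False

    def consume(self, cost_usd):
        self.spent_usd += cost_usd
        # Phase 2: hard stop once the limit is exceeded.
        if self.spent_usd > self.max_cost_usd:
            raise BudgetExceededError(
                f"spent ${self.spent_usd:.2f} of ${self.max_cost_usd:.2f}"
            )
        # Phase 1: fire the warning callback once at the threshold.
        if not self._warned and self.spent_usd >= self.warn_threshold:
            self._warned = True
            if self.on_warning is not None:
                self.on_warning(
                    f"spent ${self.spent_usd:.2f} of ${self.max_cost_usd:.2f}"
                )


warnings = []
budget = TwoPhaseBudget(max_cost_usd=1.00, warn_at_pct=0.8,
                        on_warning=warnings.append)

steps_run = 0
try:
    for _ in range(20):
        budget.consume(0.125)
        steps_run += 1
except BudgetExceededError as exc:
    print(f"Hard stop after {steps_run} steps: {exc}")

print(f"Warnings received: {len(warnings)}")
```

The warning fires only once, so the callback can safely flip a flag like wrapping_up without being called again on every later step.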

Pattern 3: Auto-tracking with OpenAI patching

Skip manual consume() calls and let AgentGuard estimate cost automatically:

from agentguard import Tracer, BudgetGuard, JsonlFileSink, patch_openai

tracer = Tracer(
    sink=JsonlFileSink("traces.jsonl"),
    service="my-agent",
    guards=[BudgetGuard(max_cost_usd=5.00)],
)
patch_openai(tracer)

# Every OpenAI call is now auto-traced with cost estimates.
# BudgetGuard checks the budget after each call.
# Supports GPT-4, GPT-3.5, and embedding models.

The cost estimates use published token pricing, so treat them as approximations rather than billed amounts. You can override prices for custom or fine-tuned models:

from agentguard import update_prices

update_prices({("openai", "my-fine-tuned-model"): (0.003, 0.006)})
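For anyone curious what auto-tracking with a price table can look like in general, here is a rough sketch. It is not how patch_openai or update_prices work internally. Everything in it is hypothetical: the PRICES table, update_price_table, SpendTracker, and the fake_completion stand-in. It also assumes the price tuple means (input, output) in USD per 1,000 tokens, which the example above does not state.

```python
import functools

# Hypothetical price table: (provider, model) -> (input, output)
# in USD per 1,000 tokens. Values are placeholders.
PRICES = {("openai", "example-model"): (0.03, 0.06)}


def update_price_table(overrides):
    """Add or replace entries in the hypothetical price table."""
    PRICES.update(overrides)


class SpendTracker:
    """Accumulates estimated spend for every wrapped call."""

    def __init__(self):
        self.total_usd = 0.0

    def track(self, provider, model):
        def decorator(call):
            @functools.wraps(call)
            def wrapper(*args, **kwargs):
                response = call(*args, **kwargs)
                usage = response["usage"]
                input_price, output_price = PRICES[(provider, model)]
                self.total_usd += (
                    usage["prompt_tokens"] / 1000 * input_price
                    + usage["completion_tokens"] / 1000 * output_price
                )
                return response
            return wrapper
        return decorator


tracker = SpendTracker()
update_price_table({("openai", "my-fine-tuned-model"): (0.003, 0.006)})


@tracker.track("openai", "my-fine-tuned-model")
def fake_completion(prompt):
    # Stand-in for a real API call; returns usage in a plain dict.
    return {
        "text": "ok",
        "usage": {"prompt_tokens": 2000, "completion_tokens": 1000},
    }


fake_completion("hello")
print(f"Tracked spend: ${tracker.total_usd:.4f}")
```

The point of the wrapper is that the calling code never touches the cost math: usage is read off each response and charged in one place, which is where a budget check would also go.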

Combining with loop detection

Budget overruns and loops often go together. An agent stuck in a loop burns through your budget fast. Layer both guards:

from agentguard import Tracer, LoopGuard, BudgetGuard, JsonlFileSink

tracer = Tracer(
    sink=JsonlFileSink("traces.jsonl"),
    service="my-agent",
    guards=[
        LoopGuard(max_repeats=3),
        BudgetGuard(max_cost_usd=5.00, warn_at_pct=0.8),
    ],
)

Whichever guard triggers first stops the agent. The loop guard catches pathological repeats; the budget guard catches legitimate-but-expensive runs.
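As an illustration of what "pathological repeats" can mean, here is a small sketch of one possible rule: raise when the same tool call, with the same arguments, happens more than max_repeats times in a row. The class RepeatDetector and the exception LoopDetectedError are hypothetical, and LoopGuard may define a repeat differently.

```python
class LoopDetectedError(Exception):
    """Raised when the same call repeats too many times in a row."""


class RepeatDetector:
    """Hypothetical stand-in for a loop guard based on consecutive repeats."""

    def __init__(self, max_repeats=3):
        self.max_repeats = max_repeats
        self._last_call = None
        self._count = 0

    def check(self, tool_name, tool_args):
        # Normalize the call so argument order does not matter.
        call = (tool_name, repr(sorted(tool_args.items())))
        if call == self._last_call:
            self._count += 1
        else:
            self._last_call = call
            self._count = 1
        if self._count > self.max_repeats:
            raise LoopDetectedError(
                f"{tool_name} repeated {self._count} times in a row"
            )


detector = RepeatDetector(max_repeats=3)
calls_allowed = 0
try:
    for _ in range(10):
        detector.check("search", {"query": "weather in Paris"})
        calls_allowed += 1
except LoopDetectedError as exc:
    print(f"Stopped after {calls_allowed} calls: {exc}")
```

A check like this costs nothing per step, so it tends to trip well before a dollar cap does when the agent is truly stuck.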

Install

pip install agentguard47

Zero dependencies, MIT licensed, Python 3.9+.

Repo: https://github.com/bmdhodl/agent47

What's your current approach to budget enforcement? Curious if anyone's built custom solutions or run into edge cases.
