Production Agent Runtime
GuardLoop
A guardrail runtime for async agents: pre-flight cost/token/time budgets, per-tool circuit breakers, and OpenTelemetry spans — no agent rewrite.
0 agent rewrites: drop-in adapters
- Pre-flight cost / token / time / tool-call budgets
- Per-tool circuit breakers (closed → open → half-open)
- Verifier feedback retry loop under one shared budget
- OpenTelemetry spans + drop-in LangGraph / OpenAI Agents adapters
The problem
An agent is a loop around a probabilistic system
When an agent goes wrong, it doesn't crash — it keeps going. It calls the same model again, retries the same dead tool, and spends real money doing it. By the time you notice, the failure is in the billing dashboard, not the logs.
GuardLoop wraps the model clients and tools an agent already uses and enforces hard limits around them — so a runaway loop is stopped before the next expensive call, not explained after the fact. No agent rewrite.
How it works
A guarded execution loop
Budget check
Before each LLM/tool call: would it exceed the cost, token, time, or tool-call ceiling? If so, deny and stop.
Call + breaker
Run the call. A per-tool circuit breaker trips after repeated failures and short-circuits further calls.
Verifier
Check the output. On failure, inject feedback and retry under the same shared budget — up to max_retries.
RunResult
Return success, cost_usd, tokens_used, and terminated_reason — plus OpenTelemetry spans for the whole run.
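The four steps above can be sketched as a single loop. This is a minimal illustration under stated assumptions, not GuardLoop's implementation: `guarded_run`, the flat `cost_per_call` estimate, the verifier convention (return `None` to accept, a feedback string to reject), and the `terminated_reason` values are all hypothetical. Only the `RunResult` field names come from the text. The circuit breaker is left out to keep the loop short.

```python
import asyncio
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class RunResult:
    # Field names follow the text; the class itself is an illustrative stand-in.
    success: bool
    cost_usd: Decimal
    tokens_used: int
    terminated_reason: str  # values used below are assumptions


async def guarded_run(agent, task, verifier, cost_limit_usd, cost_per_call, max_retries=2):
    """One shared budget covers the first attempt and every retry."""
    spent, tokens, feedback = Decimal("0"), 0, None
    for _attempt in range(max_retries + 1):
        # 1. Budget check: deny before the call that would cross the ceiling.
        if spent + cost_per_call > cost_limit_usd:
            return RunResult(False, spent, tokens, "cost_limit")
        # 2. Call (a per-tool circuit breaker would wrap this step).
        output, used = await agent(task, feedback)
        spent += cost_per_call
        tokens += used
        # 3. Verifier: None means accepted, a string is feedback for the retry.
        feedback = verifier(output)
        if feedback is None:
            # 4. Typed result on success.
            return RunResult(True, spent, tokens, "completed")
    return RunResult(False, spent, tokens, "max_retries")


async def flaky_agent(task, feedback):
    # Hypothetical agent: produces JSON only after it has received feedback.
    return ('{"answer": 42}' if feedback else "not json"), 100


def needs_json(output):
    return None if output.startswith("{") else "return a JSON object"


if __name__ == "__main__":
    result = asyncio.run(
        guarded_run(flaky_agent, "task", needs_json, Decimal("0.10"), Decimal("0.04"))
    )
    print(result)
```

Because the retry draws on the same `spent` counter as the first attempt, a verifier that never accepts the output ends the run with a budget denial rather than looping indefinitely.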
Cost
USD ceiling, Decimal-precise
Tokens
hard cap across the run
Time
wall-clock seconds
Tool calls
max invocations
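A pre-flight check across these four ceilings might look like the sketch below. The `Budget`, `Usage`, and `preflight` names, the estimate parameters, and the returned reason strings are assumptions made for illustration; they are not taken from GuardLoop's API.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Budget:
    """Hypothetical ceilings, one per dimension listed above."""
    cost_limit_usd: Decimal
    token_limit: int
    time_limit_s: float
    tool_call_limit: int


@dataclass
class Usage:
    """What the run has consumed so far."""
    cost_usd: Decimal = Decimal("0")
    tokens: int = 0
    tool_calls: int = 0
    started_at: float = 0.0


def preflight(budget, usage, est_cost, est_tokens, is_tool_call, now):
    """Return None if the next call may proceed, else the reason it is denied."""
    if usage.cost_usd + est_cost > budget.cost_limit_usd:
        return "cost_limit"
    if usage.tokens + est_tokens > budget.token_limit:
        return "token_limit"
    if now - usage.started_at > budget.time_limit_s:
        return "time_limit"
    if is_tool_call and usage.tool_calls + 1 > budget.tool_call_limit:
        return "tool_call_limit"
    return None


budget = Budget(Decimal("0.10"), 10_000, 30.0, 20)
usage = Usage(cost_usd=Decimal("0.092"), tokens=4_000, tool_calls=3)

# $0.092 spent plus an estimated $0.02 would cross the $0.10 ceiling.
denied = preflight(budget, usage, Decimal("0.02"), 500, False, now=1.0)
# A cheaper call still fits.
allowed = preflight(budget, usage, Decimal("0.005"), 500, True, now=1.0)
```

The check runs on the estimated cost of the next call, which is what lets the run stop below the ceiling instead of just past it.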
See it run
A runaway agent, stopped
An agent stuck re-fetching the same page would burn through its budget. GuardLoop denies the call that would cross the ceiling and returns a typed result instead.
runtime = GuardLoop(budget=BudgetConfig(
    cost_limit_usd="0.10", tool_call_limit=20,
))
await runtime.run(agent, "scrape every page and summarize")
The API
Wrap what you already have
Configure budgets and breakers once, then run any async agent function through the runtime. Existing LangGraph graphs and OpenAI Agents SDK runs work through thin adapters — no rewrite.
from guardloop import GuardLoop, BudgetConfig, is_json_object

runtime = GuardLoop(
    budget=BudgetConfig(cost_limit_usd="0.10", token_limit=10_000),
    verifiers=[is_json_object(required_keys=["answer"])],
)
result = await runtime.run(agent, "task description")
result.success, result.cost_usd, result.terminated_reason

from guardloop.adapters.langgraph import guarded_graph

agent = guarded_graph(my_compiled_graph, input_key="messages")
result = await runtime.run(agent, {"messages": [...]})
Bounded self-correction
A verifier can reject an output and pass feedback back to the agent via ctx.retry_feedback. The retry runs under the same budget, so self-correction can't become an infinite, expensive loop.
What I built
The details that matter in production
Pre-flight, not post-mortem
Budgets are checked before the next risky call executes — Decimal-precise cost math means the runtime stops you at $0.092, not after the bill arrives at $0.13.
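To see why Decimal matters for a cost ceiling, the short example below compares binary floats with Python's standard `decimal` module. It illustrates the general arithmetic issue only; the ten-calls-at-$0.10 scenario is made up and says nothing about how GuardLoop accumulates cost internally.

```python
from decimal import Decimal

# Ten calls at $0.10 each against a $1.00 ceiling.
float_spent = sum([0.1] * 10)
decimal_spent = sum([Decimal("0.10")] * 10, Decimal("0"))

# Binary floats cannot represent 0.1 exactly, so the float total lands just
# under 1.0 and a ">= limit" check misses the ceiling.
limit_reached_float = float_spent >= 1.0
limit_reached_decimal = decimal_spent >= Decimal("1.00")

print(float_spent, limit_reached_float)      # 0.9999999999999999 False
print(decimal_spent, limit_reached_decimal)  # 1.00 True
```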
Per-tool circuit breakers
closed → open → half-open states with per-tool failure thresholds and recovery timeouts, so one flaky tool can't take the whole agent down or burn retries against a dead endpoint.
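The closed, open, and half-open states can be sketched as a small state machine. The `CircuitBreaker` class, its method names, and its default thresholds are illustrative assumptions, not GuardLoop's code; the clock is injectable only so the behavior can be tested without waiting.

```python
import time


class CircuitBreaker:
    """Illustrative per-tool breaker: closed -> open -> half-open."""

    def __init__(self, failure_threshold=3, recovery_timeout_s=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout_s = recovery_timeout_s
        self.clock = clock
        self.state = "closed"
        self.failures = 0
        self.opened_at = 0.0

    def allow(self):
        """Return True if a call to the tool may be attempted now."""
        if self.state == "open":
            if self.clock() - self.opened_at >= self.recovery_timeout_s:
                # Recovery timeout elapsed: permit a trial call.
                self.state = "half-open"
                return True
            return False  # short-circuit: do not touch the dead tool
        return True

    def record_success(self):
        # Any success closes the breaker and clears the failure count.
        self.state = "closed"
        self.failures = 0

    def record_failure(self):
        self.failures += 1
        # A failed trial call, or too many failures while closed, opens the breaker.
        if self.state == "half-open" or self.failures >= self.failure_threshold:
            self.state = "open"
            self.opened_at = self.clock()


# One breaker per tool name keeps a flaky tool from affecting the others.
breakers = {"fetch_page": CircuitBreaker(), "search": CircuitBreaker()}
```

A caller would check `allow()` before invoking the tool and then report the outcome with `record_success()` or `record_failure()`.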
Verifier feedback loop
Output verifiers can reject a result and inject feedback via ctx.retry_feedback; the agent retries under the same shared budget — bounded self-correction, not infinite retries.
OpenTelemetry GenAI spans
Every agent run, LLM call, tool, and verifier emits a span. When something fails you get a trace and a typed RunResult with terminated_reason, not a guess.
Get started
Install from PyPI
$ pip install guardloop
$ pip install "guardloop[langgraph]"
$ pip install "guardloop[openai-agents]"
Python 3.11–3.13.