# Agents Need Seatbelts: Guardrails and Infinite-Loop Detection for Tool-Using AI
An agent without guardrails is just a while loop with a credit card.
Tool-using agents are powerful because they can observe, plan, call tools, inspect results, and try again. That same loop is the source of the most boring and expensive failure mode in agent systems:
```
think -> search -> read -> think -> search -> read -> think -> search
```

No explosion. No dramatic exception. Just a polite machine spending money while making no progress.
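Stripped to its skeleton, the failure mode is easy to write down. A deliberately unguarded sketch — the `llm`, `tools`, and `action` interfaces here are hypothetical stand-ins, not any particular SDK:

```python
# A minimal sketch of the failure mode: nothing below checks progress,
# budget, or permissions. All interfaces are illustrative.
def run_agent(goal: str, llm, tools: dict) -> str:
    messages = [{"role": "user", "content": goal}]
    while True:                                    # no step budget
        action = llm.plan(messages)                # no schema or policy check
        if action.kind == "final":
            return action.text
        result = tools[action.tool](**action.args)             # no authorization, no timeout
        messages.append({"role": "tool", "content": result})   # no dedup, no progress check
```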
Guardrails are not a single classifier. They are a control system around the loop.
## The loop is the product
A production agent loop has state:
- user goal
- plan
- messages
- tool calls
- observations
- memory
- budget
- permissions
- progress signal
- termination criteria
If the system cannot explain why the next step is allowed, it should not take the step.
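One way to honor that rule is to keep all of this state in a single structure the runtime can inspect before every step. A minimal sketch; the field names are illustrative, not a prescribed schema:

```python
from dataclasses import dataclass, field

@dataclass
class AgentState:
    goal: str
    plan: str = ""
    messages: list = field(default_factory=list)          # full transcript
    tool_calls: list = field(default_factory=list)        # every call, with args
    observations: list = field(default_factory=list)      # raw tool outputs
    memory: dict = field(default_factory=dict)            # facts carried across steps
    budget_remaining: dict = field(default_factory=dict)  # steps, tokens, dollars
    permissions: set = field(default_factory=set)         # what this run may touch
    progress_signal: float = 0.0                          # e.g. new facts per step
    done: bool = False                                    # termination criteria met
```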
## Guardrail layers
A useful agent stack has several guardrail layers:
| Layer | Example control |
|---|---|
| Input | prompt injection screening, sensitive data detection |
| Planning | task scope, allowed tools, required approval |
| Tool call | authorization, schema validation, rate limits |
| Observation | sanitize tool output, detect malicious instructions |
| Memory | do not store secrets, tenant isolation |
| Output | policy, citations, PII checks |
| Loop | budgets, progress checks, recursion limits |
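Wired together, these layers become an ordered chain of checks around every step. A rough sketch, where each check function stands in for one layer's policy (all names here are illustrative):

```python
from typing import Callable, NamedTuple

class Verdict(NamedTuple):
    allowed: bool
    reason: str = ""

def guarded_step(state, action, checks: list[Callable]) -> Verdict:
    # Run each layer's check in order; the first veto stops the step.
    # checks might be [input_check, plan_check, tool_call_check, output_check, loop_check].
    for check in checks:
        verdict = check(state, action)
        if not verdict.allowed:
            return verdict  # caller logs the reason, then halts or escalates to a human
    return Verdict(allowed=True)
```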
OpenAI’s Agents SDK ships input and output guardrails that can halt a run when a check trips. NVIDIA NeMo Guardrails defines "rails" that constrain conversation flow and actions. LangGraph exposes a recursion limit that stops runaway graph execution. AutoGen and CrewAI cap iterations and consecutive auto-replies. OWASP’s Top 10 for LLM Applications calls out prompt injection, sensitive information disclosure, and excessive agency as major application risks.
Different frameworks, same lesson: control the loop.
## Infinite loops are usually “no progress” loops
Most bad loops are not exact repeats. They are semantically repetitive:
search "pricing policy"
read same docs
search "pricing policy enterprise"
read same docs
summarize
decide context missing
search "pricing policy"Detecting this needs more than a counter.
Signals:
- repeated tool name with similar arguments
- same URLs or documents observed repeatedly
- no new facts added to state
- plan text repeating
- answer confidence not improving
- token spend increasing while task state is unchanged
- same error returned by tool multiple times
Create a state fingerprint:
```python
import hashlib

def state_fingerprint(state) -> str:
    # Hash the parts of loop state that define "where the agent is".
    material = "|".join([
        state.goal,
        state.normalized_plan,
        state.last_tool_name,
        state.normalized_tool_args,
        ",".join(sorted(state.retrieved_doc_ids)),
        ",".join(sorted(state.known_facts)),
    ])
    return hashlib.sha256(material.encode()).hexdigest()
```

If the fingerprint repeats, or changes without adding facts, slow down or stop.
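A loop guard can enforce both conditions. A sketch, assuming the `state_fingerprint` above; the thresholds are guesses to tune per task:

```python
from collections import Counter

class LoopGuard:
    """Halts loops that repeat exactly or keep changing without learning anything."""

    def __init__(self, max_repeats: int = 2, max_fruitless_steps: int = 3):
        self.seen = Counter()
        self.fruitless = 0
        self.last_fact_count = 0
        self.max_repeats = max_repeats
        self.max_fruitless_steps = max_fruitless_steps

    def check(self, fp: str, fact_count: int) -> bool:
        """Return True if the loop may continue."""
        self.seen[fp] += 1
        if self.seen[fp] > self.max_repeats:
            return False                  # same semantic state revisited too often
        if fact_count <= self.last_fact_count:
            self.fruitless += 1           # state changed, but no new facts were added
        else:
            self.fruitless = 0
        self.last_fact_count = fact_count
        return self.fruitless <= self.max_fruitless_steps
```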
## Budgets are guardrails
Every agent run should have budgets:
- max steps
- max tool calls
- max input tokens
- max output tokens
- max total tokens
- max wall-clock time
- max retries per tool
- max repeated action fingerprints
- max spend
Budgets should be task-aware. A background research agent may get 50 steps. A customer-support answer may get 4. A payment-changing workflow may require explicit approval before a tool call.
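In code, a budget is one object the runtime checks and decrements before every step. An illustrative sketch with made-up defaults; the numbers are placeholders, not recommendations:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class RunBudget:
    max_steps: int = 4                 # tight default for interactive answers
    max_tool_calls: int = 8
    max_total_tokens: int = 50_000     # split into input/output caps if needed
    max_wall_clock_s: float = 60.0
    max_retries_per_tool: int = 2
    max_repeated_fingerprints: int = 2
    max_spend_usd: float = 0.50

# Task-aware presets: research gets room to roam, support stays short.
RESEARCH = RunBudget(max_steps=50, max_wall_clock_s=1800.0, max_spend_usd=5.00)
SUPPORT = RunBudget()  # the tight defaults above
```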
## Tool authorization
Every tool should declare:
- required permissions
- allowed input schema
- side effects
- rate limit
- cost estimate
- approval requirement
- data classification
- audit fields
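That declaration is just metadata the runtime can enforce. A sketch; the field names and the example values are illustrative:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ToolSpec:
    name: str
    required_permissions: frozenset   # e.g. {"email:send"}
    input_schema: dict                # JSON Schema for arguments
    side_effects: str                 # "none" | "read" | "write" | "irreversible"
    rate_limit_per_min: int
    est_cost_usd: float
    requires_approval: bool           # human gate before execution
    data_classification: str          # e.g. "public", "pii", "financial"
    audit_fields: tuple               # which args go to the audit log

SEND_EMAIL = ToolSpec(
    name="send_email",
    required_permissions=frozenset({"email:send"}),
    input_schema={"type": "object", "required": ["to", "subject", "body"]},
    side_effects="irreversible",
    rate_limit_per_min=5,
    est_cost_usd=0.001,
    requires_approval=True,
    data_classification="pii",
    audit_fields=("to", "subject"),
)
```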
Dangerous tools need stronger controls:
- payment
- deletion
- deployment
- email sending
- external API writes
- database mutation
- browser automation
The model should not decide alone whether it is allowed to send money, delete data, or email a customer. That decision belongs to policy code.
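A minimal policy gate, assuming the `ToolSpec` sketch above; `request_approval` is a placeholder for whatever human-approval workflow you run:

```python
from typing import Callable

def authorize(spec: ToolSpec, args: dict,
              run_permissions: set,
              request_approval: Callable[[ToolSpec, dict], bool]) -> bool:
    """Policy code, not the model, decides whether the call proceeds."""
    if not spec.required_permissions <= run_permissions:
        return False                  # this run was never granted the permission
    if spec.requires_approval and not request_approval(spec, args):
        return False                  # human gate vetoed or timed out
    return True
```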
## Production controls
Build the runtime with:
- step-by-step tracing
- state snapshots
- token and cost budgets
- tool-call audit log
- per-tool timeout
- idempotency keys
- cancellation propagation
- human approval gates
- emergency kill switch
- replay harness for failed runs
If you cannot replay an agent failure, you cannot debug it. If you cannot bound an agent run, you cannot price it.
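Two of those controls fit in a few lines. A sketch of a tool-call audit record with a deterministic idempotency key; the record shape is illustrative:

```python
import hashlib, json, time, uuid

def audit_tool_call(run_id: str, tool: str, args: dict) -> dict:
    # Deterministic idempotency key: a retried call reuses the same key,
    # so downstream services can deduplicate writes.
    key_material = f"{run_id}:{tool}:{json.dumps(args, sort_keys=True)}"
    return {
        "event_id": str(uuid.uuid4()),
        "ts": time.time(),
        "run_id": run_id,
        "tool": tool,
        "args": args,  # redact per the tool's data classification before storing
        "idempotency_key": hashlib.sha256(key_material.encode()).hexdigest(),
    }
# Append the record to the audit log *before* executing the call.
```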
## The answer is not “more guardrails”
Too many guardrails can make agents useless. The goal is not to block everything. The goal is to make the agent’s freedom explicit:
- What can it do?
- With whose data?
- For how long?
- At what cost?
- With what approval?
- When should it stop?

That is the contract. Once the contract is explicit, the agent becomes a system rather than a surprise.
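Written down, the contract is small enough to review. An illustrative sketch, one field per question:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentContract:
    allowed_tools: frozenset      # what can it do?
    data_scope: str               # with whose data? e.g. "tenant:acme"
    max_wall_clock_s: float       # for how long?
    max_spend_usd: float          # at what cost?
    approval_required: frozenset  # which actions need a human?
    stop_when: str                # termination criterion, checked every step
```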
## Sources worth reading
- OpenAI Agents SDK guardrails.
- NVIDIA NeMo Guardrails for conversational and action guardrails.
- LangGraph recursion limit docs for graph loop protection.
- AutoGen agent chat docs for agent interaction controls.
- CrewAI max iterations docs for execution limits.
- OWASP Top 10 for LLM Applications for prompt injection, sensitive data, and excessive agency risks.
