Step Limits for AI Agents: how to stop loops before an incident

Step limits in production: how to stop loops, return stop reasons, and keep run execution under control.
On this page
  1. Idea in 30 seconds
  2. Problem
  3. Solution
  4. Step limits ≠ budget controls
  5. Step-control metrics
  6. How it looks in architecture
  7. Example
  8. In code it looks like this
  9. How it looks during execution
  10. Scenario 1: max_steps reached
  11. Scenario 2: loop detected
  12. Scenario 3: normal execution
  13. Common mistakes
  14. Self-check
  15. FAQ
  16. Where Step Limits fit in the system
  17. Related pages

Idea in 30 seconds

Step limits are a runtime control that force-stops a run when the agent loops or makes no progress.

When you need it: when an agent runs in a loop, interacts with tools, and can repeat the same actions in production.

Problem

Without step limits, an agent often "does something" but does not move toward a result. In demos this can look fine. In production it quickly turns into cost, latency, and log noise.

Typical pattern:

  • one unstable tool response
  • repeat of the same action
  • one more repeat
  • and then the same cycle again

Analogy: it is like a navigator that keeps looping the same intersection. The car is moving, but it is not getting closer to the destination.

To prevent this from becoming an incident, limits must live in the runtime loop, not in UI or prompt text.

Solution

The solution is to move step control into a policy layer at the runtime level. Every step is checked after the next action is formed, but before it executes.

The policy layer returns one of these decisions:

  • allow
  • stop (reason=max_steps)
  • stop (reason=loop_detected)
  • stop (reason=no_progress)

This is a separate system layer, not part of prompt or model logic.
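The decision surface can be sketched as a small value type. Names here are illustrative, not a fixed API; the only contract is an outcome plus an explicit reason:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Decision:
    outcome: str  # "allow" or "stop"
    reason: str   # e.g. "step_ok", "max_steps", "loop_detected", "no_progress"

    @classmethod
    def allow(cls, reason: str = "step_ok") -> "Decision":
        return cls("allow", reason)

    @classmethod
    def stop(cls, reason: str) -> "Decision":
        return cls("stop", reason)

# Example decisions the policy layer can return:
ok = Decision.allow()
halted = Decision.stop("max_steps")
```

Keeping the reason as a required field (rather than an optional note) is what makes stop-cause surfacing possible later in logs and user responses.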

Step limits ≠ budget controls

These are different control layers:

  • Step limits govern loop behavior (how many and what kind of steps the agent takes)
  • Budget controls cap run resources (time, money, action count)

One without the other is insufficient:

  • without step limits, a loop can run for a long time without progress
  • without budget controls, even a "limited" loop can still be expensive

Example:

  • step limits: max_steps=18, max_repeat_action=3
  • budget controls: max_seconds=45, max_usd=1.00
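The two layers from the example above can live side by side in one run config. A sketch with illustrative keys:

```yaml
run_policy:
  step_limits:          # loop behavior
    max_steps: 18
    max_repeat_action: 3
  budget_controls:      # run resources
    max_seconds: 45
    max_usd: 1.00
```

Whichever limit fires first should stop the run with its own explicit reason, so logs can distinguish a loop stop from a budget stop.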

Step-control metrics

These checks work together on every agent step.

  • Step cap: limits maximum run length (max_steps, runtime step counter). Stops an infinite loop before cost grows.
  • Repeat-action control: limits repetition of the same action (max_repeat_action, tool + args key). Catches loops where the agent repeats the same call.
  • No-progress control: detects cases with no real progress (no_progress_window, state-change checks). Stops a run when steps exist but progress does not.
  • Stop reason surfacing: makes the stop cause transparent (explicit stop reason, partial response). User and team can see why the run was stopped.

How it looks in architecture

The step policy layer sits in the runtime loop between planning and action execution. Every decision (allow or stop) is recorded in the audit log.

Each agent step passes through this flow before execution: the runtime does not execute the next action directly; it first runs the step checks.

Flow summary:

  • Runtime forms the next action
  • Step policy layer checks max_steps, repeats, and progress
  • allow -> next agent action executes
  • stop -> stop reason + partial response are returned
  • both decisions are written to audit log

Example

A support agent repeatedly calls search.docs because of unstable external responses.

With step limits:

  • max_steps = 18
  • max_repeat_action = 3
  • no_progress_window = 4

-> the run stops with an explicit stop reason instead of continuing the loop endlessly.

Step limits stop the incident at runtime-loop level rather than relying on model behavior.

In code it looks like this

The simplified scheme above shows the main control flow. In practice, step checks run centrally before each action.

Example step config:

YAML
step_limits:
  max_steps: 18
  max_repeat_action: 3
  no_progress_window: 4
PYTHON
while True:
    # Here, a step is counted as an action intent, not an already executed action.
    state = state.with_step_increment()
    action = planner.next(state)  # planner forms the action for this step
    repeat_key = make_repeat_key(action.name, action.args)  # normalized tool+args key

    decision = step_policy.check(state, action, repeat_key=repeat_key)
    if decision.outcome == "stop":
        audit.log(
            run_id,
            decision.outcome,
            reason=decision.reason,
            step=state.steps,
            action=action.name,
            repeat_key=repeat_key,
        )
        return stop(decision.reason)

    result = tools[action.name].execute(action.args)  # dispatch the tool by name
    state = state.apply(action, result)

    decision = Decision.allow(reason="step_ok")
    audit.log(
        run_id,
        decision.outcome,
        reason=decision.reason,
        step=state.steps,
        action=action.name,
        repeat_key=repeat_key,
        result=result.status,
    )

    if result.final:
        return result

Step policy usually checks three signals: step limit, action repetition, and no progress. For loop detection, use a tool+args key, not only action.name.
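One way to build such a tool+args key is to hash a deterministic serialization of the arguments. This is a sketch; the real normalization depends on your action schema (nested args, ignorable fields like timestamps):

```python
import hashlib
import json

def make_repeat_key(tool_name: str, args: dict) -> str:
    # Serialize args deterministically so equivalent calls produce the same key.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"), default=str)
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]
    return f"{tool_name}:{digest}"

# Same tool + same args -> same key, regardless of dict ordering.
k1 = make_repeat_key("search.docs", {"query": "refund policy", "limit": 5})
k2 = make_repeat_key("search.docs", {"limit": 5, "query": "refund policy"})
```

Keying only on `action.name` would miss the common case where the agent retries the same tool with slightly different arguments, which is why args belong in the key.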

How it looks during execution

Scenario 1: max_steps reached

  1. Runtime forms step 19 and increments step counter.
  2. Policy sees max_steps exceeded.
  3. Decision: stop (reason=max_steps).
  4. Stop cause is written to audit log.
  5. User gets a partial response.

Scenario 2: loop detected

  1. Runtime repeatedly forms search.docs with the same args.
  2. Policy counts tool+args repeats.
  3. Decision: stop (reason=loop_detected).
  4. Run stops before another unnecessary call.
  5. Logs show exact cause and action.

Scenario 3: normal execution

  1. Runtime forms a new step.
  2. Policy checks limits: all within bounds.
  3. Decision: allow.
  4. Next agent action executes.
  5. Result and decision are recorded in audit log.
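The three scenarios above can be reproduced with a minimal policy check. This is a sketch under the earlier config values, returning a plain (outcome, reason) tuple for brevity; class and parameter names are illustrative:

```python
from collections import Counter, deque

class StepPolicy:
    def __init__(self, max_steps=18, max_repeat_action=3, no_progress_window=4):
        self.max_steps = max_steps
        self.max_repeat_action = max_repeat_action
        self.no_progress_window = no_progress_window
        self.repeats = Counter()  # repeat_key -> times this tool+args was formed
        self.recent_states = deque(maxlen=no_progress_window)

    def check(self, step: int, repeat_key: str, state_fingerprint: str):
        # Scenario 1: hard cap on run length.
        if step > self.max_steps:
            return ("stop", "max_steps")
        # Scenario 2: same tool+args formed too many times.
        self.repeats[repeat_key] += 1
        if self.repeats[repeat_key] > self.max_repeat_action:
            return ("stop", "loop_detected")
        # Soft loops: state fingerprint unchanged across the whole window.
        self.recent_states.append(state_fingerprint)
        if (len(self.recent_states) == self.no_progress_window
                and len(set(self.recent_states)) == 1):
            return ("stop", "no_progress")
        # Scenario 3: within all bounds.
        return ("allow", "step_ok")
```

The state fingerprint can be any cheap digest of run state (e.g. a hash of accumulated results); what matters is that it changes when real progress happens.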

Common mistakes

  • setting max_steps only in UI, not in runtime loop
  • not returning explicit stop reason in response
  • counting only tool calls and ignoring steps
  • not checking repeats (tool + args) and no-progress
  • logging only success steps without stop decisions
  • setting max_steps too high "just in case"

Result: the run looks active, but the loop grows faster than the team can notice it.

Self-check

Quick step-limits check before production launch:


Before production, you need at least access control, limits, audit logs, and an emergency stop.

FAQ

Q: What starting max_steps should I use?
A: For most synchronous runs, start with 15-25. Then tune based on stop-event frequency in real scenarios.

Q: Is only max_steps enough?
A: No. At minimum add max_repeat_action and no-progress control. For production you also need budgets (max_seconds, max_usd, max_tool_calls).

Q: Which matters more: repeat detection or no-progress?
A: You need both. Repeat detection catches explicit repeats; no-progress catches "soft" loops where actions differ but progress is absent.

Q: What should users see when a run is stopped?
A: Partial response + explicit stop reason + a short next action (rephrase request or rerun with different scope).
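A stop payload along those lines might look like this. Field names and hint texts are illustrative, not a fixed schema:

```python
def build_stop_response(partial_text: str, reason: str) -> dict:
    # Hypothetical reason -> next-action hints; adapt wording to your product.
    next_actions = {
        "max_steps": "Try a narrower request or rerun with a smaller scope.",
        "loop_detected": "Rephrase the request; the agent kept repeating one call.",
        "no_progress": "Rerun with a different scope or provide more context.",
    }
    return {
        "status": "stopped",
        "stop_reason": reason,
        "partial_response": partial_text,
        "suggested_next_action": next_actions.get(reason, "Rerun the request."),
    }

resp = build_stop_response("Found 2 of 5 documents so far.", "loop_detected")
```

Returning the partial result alongside the reason matters: a bare "stopped" with no output turns a controlled stop into a perceived failure.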

Q: Do step limits replace kill switch?
A: No. Step limits govern each run, while kill switch is needed for emergency global stop.

Where Step Limits fit in the system

Step limits are one of the Agent Governance layers. Together with RBAC, budgets, approvals, and audit, they form a unified execution-control system.

Next on this topic:

⏱️ 6 min read • Updated March 26, 2026 • Difficulty: ★★★
Implement in OnceOnly
Budgets + permissions you can enforce at the boundary.
YAML
# onceonly guardrails (concept)
version: 1
budgets:
  max_steps: 25
  max_tool_calls: 12
  max_seconds: 60
  max_usd: 1.00
policy:
  tool_allowlist:
    - search.read
    - http.get
writes:
  require_approval: true
  idempotency: true
controls:
  kill_switch: { enabled: true }
Add guardrails to tool-calling agents
Ship this pattern with governance:
  • Budgets (steps / spend caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & dedupe
  • Audit logs & traceability
Integrated mention: OnceOnly is a control layer for production agent systems.

Author

Nick, an engineer building infrastructure for production AI agents.

Focus: agent patterns, failure modes, runtime control, and system reliability.

🔗 GitHub: https://github.com/mykolademyanov


Editorial note

This documentation is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Content is grounded in real-world failures, post-mortems, and operational incidents in deployed AI agent systems.