Anti-Pattern No Stop Conditions: When Agents Never Stop

Without stop conditions, agents may run indefinitely and waste tokens and resources.
On this page
  1. Idea In 30 Seconds
  2. Anti-Pattern Example
  3. Why It Happens And What Goes Wrong
  4. Correct Approach
  5. Quick Test
  6. How It Differs From Other Anti-Patterns
  7. No Monitoring vs No Stop Conditions
  8. Agents Without Guardrails vs No Stop Conditions
  9. Self-Check: Do You Have This Anti-Pattern?
  10. FAQ
  11. What Next

Idea In 30 Seconds

No Stop Conditions is an anti-pattern where an agent starts without clear completion conditions.

As a result, the agent can loop and spend budget without real progress. This increases latency, cost, and risk of side effects (state changes).

Simple rule: every agent run must have explicit stop conditions for successful completion and safe exit.


Anti-Pattern Example

The team builds a support agent that should find an answer in internal data and return a result to the user.

But the agent loop has no clear stop conditions.

PYTHON
state = init_state(user_message)

while True:
    decision = agent.next_step(state)
    result = run_tool(decision.tool, decision.args)
    state.append(result)
    # no has_final_answer(...) check
    # no no_progress(...) check
    # no repeated_action(...) check

In this setup, the agent may endlessly repeat similar steps:

PYTHON
search_docs -> fetch_page -> summarize -> search_docs -> ...

For this case, you need a controlled loop with explicit boundaries:

PYTHON
for step in range(MAX_STEPS):
    ...
    if has_final_answer(state):
        return build_answer(state)

In this case, missing stop conditions lead to:

  • runaway loop risk (infinite loop)
  • extra tool and LLM calls
  • uncontrolled time and budget consumption
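A repeated_action check could catch the cycle above by scanning the recent trace. A minimal sketch, assuming each history entry is a dict with "tool" and "args" keys (the record shape, window, and threshold are illustrative assumptions):

```python
from collections import Counter

def repeated_action(history, window=4, threshold=3):
    """Heuristic: the same (tool, args) pair appearing several times
    in the last few steps usually signals a stuck loop."""
    recent = [(step["tool"], repr(step["args"])) for step in history[-window:]]
    most_common = Counter(recent).most_common(1)
    return bool(most_common) and most_common[0][1] >= threshold
```

With a window of 4 and threshold of 3, three identical `search_docs` calls in a row would trip the check, while a varied sequence would not.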

Why It Happens And What Goes Wrong

This anti-pattern often appears when a team relies on the model and expects it to "figure out" when to stop.

Typical causes:

  • no explicit max_steps, timeout, or budget limit
  • no definition of what counts as a "ready answer"
  • no no_progress or repeated-action checks
  • stop control is left only at infrastructure level

As a result, teams get problems:

  • infinite or long loops - agent repeats similar steps without completion
  • higher latency - response arrives much later or never arrives
  • higher cost - number of LLM/tool steps grows for one request
  • side effects (state changes) - repeated actions may create duplicate records, re-update status, duplicate API calls, or re-send external actions
  • unstable results - same request completes differently across runs

Typical production signals that stop conditions are missing or weak:

  • a noticeable share of runs ends with an infrastructure timeout, not a controlled stop
  • P95 step count per request keeps growing
  • traces show repeated identical calls with minimal argument changes
  • cost per request grows faster than success rate
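These signals are cheap to compute from run logs. A small sketch, assuming each run record is a dict carrying a stop_reason field, where None or "infra_timeout" marks an uncontrolled end (both names are assumptions about your logging schema):

```python
def timeout_share(runs):
    """Share of runs that ended with an infrastructure timeout instead of
    a controlled stop_reason. An uncontrolled end is assumed to be logged
    as stop_reason=None or stop_reason="infra_timeout"."""
    if not runs:
        return 0.0
    uncontrolled = sum(
        1 for r in runs if r.get("stop_reason") in (None, "infra_timeout")
    )
    return uncontrolled / len(runs)
```

Tracking this ratio over time shows whether stop conditions are actually doing the work, or whether the infrastructure is quietly cleaning up after runaway loops.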

Keep in mind that each next step is itself an LLM inference call. If the loop has no clear completion conditions, the model keeps choosing "one more step" even when a step adds almost no new useful information.

As this setup grows, it becomes hard to explain why a run did not stop in time without trace and execution visualization. That is why production systems usually include a dedicated observability layer for agent runs.

Correct Approach

Start with the simplest controlled loop that reliably handles most requests today. Add new stop conditions only when there is a measurable failure, risk, or limitation in the current design.

Practical framework:

  • set a positive completion condition (final_answer_ready)
  • set guard boundaries (max_steps, timeout, budget)
  • add no_progress and repeated-action checks
  • record stop reason for each run and track metrics (for example, improved success rate without sharp growth in latency and cost per request)
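The guard boundaries from the list above can live in one small budget object. A minimal sketch, assuming the RunBudget name, thresholds, and the per-call cost accounting; this is not tied to any specific framework API:

```python
import time

class RunBudget:
    """Tracks wall-clock time and spend for one agent run."""

    def __init__(self, timeout_s=30.0, max_cost_usd=0.50):
        self.started = time.monotonic()
        self.timeout_s = timeout_s
        self.max_cost_usd = max_cost_usd
        self.cost_usd = 0.0

    def charge(self, cost_usd):
        # Called after each LLM or tool step with its estimated cost.
        self.cost_usd += cost_usd

    def timed_out(self):
        return time.monotonic() - self.started > self.timeout_s

    def exceeded(self):
        return self.cost_usd > self.max_cost_usd
```

The loop then checks `budget.timed_out()` and `budget.exceeded()` on every iteration, so one object answers both guard questions.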

In practice, no_progress often means repeated identical tool calls, minimal state changes, or no new useful information after the next step.
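One possible no_progress heuristic, sketched over a plain list of result strings for simplicity (the real state object from the examples would need an equivalent accessor; the lookback value is an assumption):

```python
def no_progress(results, new_result, lookback=3):
    """Heuristic: treat the run as stuck if the new tool result exactly
    duplicates an earlier one, or the last few results are all identical."""
    if new_result in results:
        return True
    recent = results[-lookback:]
    return len(recent) == lookback and len(set(recent)) == 1
```

Exact-duplicate matching is deliberately crude; a production version might compare normalized or embedded results instead, at the cost of more machinery.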

PYTHON
MAX_STEPS = 8

def run_agent(user_message: str):
    state = init_state(user_message)

    for step in range(MAX_STEPS):  # hard limit for runaway loops
        if timed_out():
            return stop("timeout")
        if budget_exceeded():
            return stop("budget_exceeded")

        decision = agent.next_step(state)

        if decision.type == "final_answer":
            if validate_output(decision.output):  # format, required fields, no empty answer
                return decision.output
            return stop("invalid_output")

        result = run_tool(decision.tool, decision.args)
        if no_progress(state, result):  # same tool/result pattern or no meaningful state change
            return stop("no_progress")

        state.append(result)

    return stop("max_steps_exceeded")

In this setup, the loop becomes controlled: the system either returns a valid answer or stops with a transparent reason.
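The stop(...) helper used above is assumed rather than defined in the source. A minimal sketch that returns a structured record, so the stop_reason can be logged and tracked per run:

```python
from dataclasses import dataclass

@dataclass
class StopResult:
    """Structured outcome of a run that did not produce a final answer."""
    stop_reason: str
    ok: bool = False

def stop(reason: str) -> StopResult:
    # In production this would also emit stop_reason to logs/metrics
    # so every run ends with an accountable, queryable outcome.
    return StopResult(stop_reason=reason)
```

Returning a record instead of raising makes every exit path uniform: callers always get either a validated answer or a StopResult they can aggregate.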

Quick Test

If you answer "yes" to any of these questions, you have no-stop-conditions risk:

  • Do some runs end with timeout instead of a controlled stop_reason?
  • Does one request sometimes do disproportionately many steps with no visible progress?
  • Do traces show repeated similar actions without new outcomes?

How It Differs From Other Anti-Patterns

No Monitoring vs No Stop Conditions

  • No Monitoring. Main problem: the system lacks enough observability to see what happens during a run. When it appears: run-level logs, traces, metrics, and stop_reason are missing.
  • No Stop Conditions. Main problem: the agent loop has no clear completion conditions. When it appears: a run proceeds without max_steps, timeout, budget limit, or no_progress checks.

Agents Without Guardrails vs No Stop Conditions

  • Agents Without Guardrails. Main problem: the agent runs without policy boundaries and system constraints. When it appears: there is no allowlist, deny-by-default, budget, or safety constraints.
  • No Stop Conditions. Main problem: the agent can run infinite or too-long loops. When it appears: loop logic has no explicit completion criterion and no controlled stop_reason.

Self-Check: Do You Have This Anti-Pattern?

Quick check for the anti-pattern No Stop Conditions: review whether your runs have explicit completion conditions, guard limits (max_steps, timeout, budget), progress checks, and a recorded stop_reason. If several of these are missing, there are signs of this anti-pattern.

FAQ

Q: If we have max_steps, is that already enough?
A: No. max_steps is required, but by itself does not cover all risks. You also need timeout, budget limit, progress checks, and a valid ready-answer criterion.

Q: When should we add a new stop condition?
A: When there is a concrete signal: incidents, repeated loops, or growth in cost or latency that current rules cannot address without disproportionate system complexity.

Q: How do we start if stop conditions are almost absent today?
A: Start with the minimum: max_steps, timeout, budget, and stop_reason in logs. Then add no_progress and final-answer validation.


What Next


⏱️ 7 min read • Updated March 16, 2026 • Difficulty: ★★★
Implement in OnceOnly
Safe defaults for tool permissions + write gating.
YAML
# onceonly guardrails (concept)
version: 1
tools:
  default_mode: read_only
  allowlist:
    - search.read
    - kb.read
    - http.get
writes:
  enabled: false
  require_approval: true
  idempotency: true
controls:
  kill_switch: { enabled: true, mode: disable_writes }
audit:
  enabled: true
Integrated: OnceOnly production control
Add guardrails to tool-calling agents
Ship this pattern with governance:
  • Budgets (steps / spend caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & dedupe
  • Audit logs & traceability
Integrated mention: OnceOnly is a control layer for production agent systems.
Author

This documentation is curated and maintained by engineers who ship AI agents in production.

The content is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Patterns and recommendations are grounded in post-mortems, failure modes, and operational incidents in deployed systems, including during the development and operation of governance infrastructure for agents at OnceOnly.