This is the complete educational implementation of the example from the article When an Agent Should Stop (and Who Decides).
If you have not read the article yet, start there. Here the focus is on code only: how the system checks stop conditions after each step and terminates in a controlled way.
What This Example Demonstrates
- How the agent advances step by step while the decision to stop is made by the runtime policy, not by the model
- How the basic stop conditions work: goal_reached, step_limit, too_many_errors, no_progress
- Which conditions are triggered in this demo: goal_reached, too_many_errors, step_limit
- Why the agent must return stop_reason even if the task did not finish
- How three different situations lead to different endings: success vs. emergency stop
Project Structure
foundations/
└── stop-conditions/
    └── python/
        ├── main.py           # runs the scenarios and prints a summary
        ├── agent.py          # agent loop + stop-condition checks
        ├── llm.py            # simple decision layer: next action
        ├── tools.py          # educational tools with controlled failures
        └── requirements.txt
How to Run
1. Clone the repository and go to the example folder:
git clone https://github.com/AgentPatterns-tech/agentpatterns.git
cd agentpatterns/foundations/stop-conditions/python
2. Install dependencies (this example has no external packages):
pip install -r requirements.txt
3. Run the demo:
python main.py
What We Build in the Code
We build a simple agent loop with a separate stop circuit.
- the model chooses the next action
- a tool runs and changes the state
- the runtime tracks metrics (steps, errors, no_progress)
- after each step, the policy checks the stop conditions
- if a condition is triggered, the loop ends with an explicit stop_reason
Key idea: the model does not decide when "it is enough". The system decides that.
Code
tools.py - educational tools with a controlled failure
from typing import Any


def make_initial_state(user_id: int, fail_fetch_times: int) -> dict[str, Any]:
    return {
        "user_id": user_id,
        "fetch_calls": 0,
        "fail_fetch_times": fail_fetch_times,
    }


def fetch_orders(state: dict[str, Any]) -> dict[str, Any]:
    state["fetch_calls"] += 1
    if state["fetch_calls"] <= state["fail_fetch_times"]:
        return {"ok": False, "error": "orders_api_timeout"}

    orders = [
        {"id": "ord-2001", "total": 49.9, "status": "paid"},
        {"id": "ord-2002", "total": 19.0, "status": "shipped"},
    ]
    state["orders"] = orders
    return {"ok": True, "orders": orders}


def build_summary(state: dict[str, Any]) -> dict[str, Any]:
    orders = state.get("orders")
    if not orders:
        return {"ok": False, "error": "missing_orders"}

    summary = f"Prepared report for {len(orders)} recent orders."
    state["summary"] = summary
    return {"ok": True, "summary": summary}
llm.py - simple educational choice of the next action
from typing import Any


def choose_next_action(task: str, state: dict[str, Any]) -> dict[str, Any]:
    # Learning version: fixed policy keeps behavior easy to reason about.
    _ = task
    if "orders" not in state:
        return {"action": "fetch_orders", "parameters": {}}
    return {"action": "build_summary", "parameters": {}}
agent.py - agent loop with policy-driven stop conditions
from dataclasses import dataclass
from typing import Any

from llm import choose_next_action
from tools import build_summary, fetch_orders, make_initial_state

TOOLS = {
    "fetch_orders": fetch_orders,
    "build_summary": build_summary,
}


@dataclass
class StopPolicy:
    max_steps: int
    max_errors: int
    max_no_progress: int


def evaluate_stop_conditions(
    state: dict[str, Any],
    steps: int,
    errors: int,
    no_progress: int,
    policy: StopPolicy,
) -> str | None:
    if "summary" in state:
        return "goal_reached"
    if steps >= policy.max_steps:
        return "step_limit"
    if errors >= policy.max_errors:
        return "too_many_errors"
    if no_progress >= policy.max_no_progress:
        return "no_progress"
    return None


def run_agent(task: str, user_id: int, fail_fetch_times: int, policy: StopPolicy) -> dict[str, Any]:
    state = make_initial_state(user_id=user_id, fail_fetch_times=fail_fetch_times)
    history: list[dict[str, Any]] = []
    steps = 0
    errors = 0
    no_progress = 0
    stop_reason: str | None = None

    while True:
        stop_reason = evaluate_stop_conditions(
            state=state,
            steps=steps,
            errors=errors,
            no_progress=no_progress,
            policy=policy,
        )
        if stop_reason is not None:
            break

        steps += 1
        call = choose_next_action(task, state)
        action = call["action"]
        history.append({"step": steps, "action": action, "status": "requested"})

        tool = TOOLS.get(action)
        if not tool:
            errors += 1
            no_progress += 1
            state["last_error"] = f"unknown_action:{action}"
            history.append({"step": steps, "action": action, "status": "error"})
        else:
            before_keys = set(state.keys())
            result = tool(state)
            if result.get("ok"):
                after_keys = set(state.keys())
                progress = len(after_keys - before_keys) > 0
                no_progress = 0 if progress else no_progress + 1
                state.pop("last_error", None)
                history.append({"step": steps, "action": action, "status": "ok"})
            else:
                errors += 1
                no_progress += 1
                state["last_error"] = result.get("error", "unknown_error")
                history.append({"step": steps, "action": action, "status": "error"})

    return {
        "done": stop_reason == "goal_reached",
        "stop_reason": stop_reason,
        "steps": steps,
        "errors": errors,
        "no_progress": no_progress,
        "summary": state.get("summary"),
        "history": history,
    }
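The order of the checks in evaluate_stop_conditions determines which reason is reported when several thresholds are reached on the same step. The sketch below is a simplified model of that ordering, not part of the repository: first_triggered and simulate_always_failing are hypothetical helpers, and the simulation assumes every step ends in a tool error, so steps, errors and no_progress all grow by one per iteration. Under that assumption it shows why a policy with max_errors=2 and max_no_progress=3 reports too_many_errors, and why raising max_errors lets no_progress fire instead.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StopPolicy:
    max_steps: int
    max_errors: int
    max_no_progress: int


def first_triggered(steps: int, errors: int, no_progress: int, policy: StopPolicy) -> Optional[str]:
    # Same order as evaluate_stop_conditions, minus the goal check
    # (the goal is never reached when every tool call fails).
    if steps >= policy.max_steps:
        return "step_limit"
    if errors >= policy.max_errors:
        return "too_many_errors"
    if no_progress >= policy.max_no_progress:
        return "no_progress"
    return None


def simulate_always_failing(policy: StopPolicy) -> dict:
    # Hypothetical helper: models a run in which every step is a tool error.
    steps = errors = no_progress = 0
    while True:
        reason = first_triggered(steps, errors, no_progress, policy)
        if reason is not None:
            return {
                "stop_reason": reason,
                "steps": steps,
                "errors": errors,
                "no_progress": no_progress,
            }
        steps += 1
        errors += 1
        no_progress += 1


demo = simulate_always_failing(StopPolicy(max_steps=6, max_errors=2, max_no_progress=3))
relaxed = simulate_always_failing(StopPolicy(max_steps=6, max_errors=10, max_no_progress=3))
print(demo["stop_reason"])     # too_many_errors
print(relaxed["stop_reason"])  # no_progress
```

In the real loop the same effect should be reachable by running run_agent with a failing fetch and a policy whose max_errors is larger than max_no_progress.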
main.py - three scenarios with different endings
import json

from agent import StopPolicy, run_agent

TASK = "Build weekly orders summary"
POLICY = StopPolicy(max_steps=6, max_errors=2, max_no_progress=3)
STEP_LIMIT_POLICY = StopPolicy(max_steps=1, max_errors=2, max_no_progress=3)


def compact_result(result: dict) -> str:
    return (
        "{"
        f"\"done\": {str(bool(result.get('done'))).lower()}, "
        f"\"stop_reason\": {json.dumps(result.get('stop_reason'), ensure_ascii=False)}, "
        f"\"steps\": {int(result.get('steps', 0))}, "
        f"\"errors\": {int(result.get('errors', 0))}, "
        f"\"no_progress\": {int(result.get('no_progress', 0))}, "
        f"\"summary\": {json.dumps(result.get('summary'), ensure_ascii=False)}, "
        "\"history\": [{...}]"
        "}"
    )


def print_policy(policy: StopPolicy) -> None:
    print(
        "Policy:",
        json.dumps(
            {
                "max_steps": policy.max_steps,
                "max_errors": policy.max_errors,
                "max_no_progress": policy.max_no_progress,
            },
            ensure_ascii=False,
        ),
    )


def main() -> None:
    print("=== SCENARIO 1: GOAL REACHED ===")
    print_policy(POLICY)
    result_ok = run_agent(
        task=TASK,
        user_id=42,
        fail_fetch_times=1,
        policy=POLICY,
    )
    print("Run result:", compact_result(result_ok))

    print("\n=== SCENARIO 2: STOPPED BY ERROR LIMIT ===")
    print_policy(POLICY)
    result_stopped = run_agent(
        task=TASK,
        user_id=42,
        fail_fetch_times=10,
        policy=POLICY,
    )
    print("Run result:", compact_result(result_stopped))

    print("\n=== SCENARIO 3: STOPPED BY STEP LIMIT ===")
    print_policy(STEP_LIMIT_POLICY)
    result_step_limit = run_agent(
        task=TASK,
        user_id=42,
        fail_fetch_times=0,
        policy=STEP_LIMIT_POLICY,
    )
    print("Run result:", compact_result(result_step_limit))


if __name__ == "__main__":
    main()
requirements.txt
# No external dependencies for this learning example.
Example Output
python main.py
=== SCENARIO 1: GOAL REACHED ===
Policy: {"max_steps": 6, "max_errors": 2, "max_no_progress": 3}
Run result: {"done": true, "stop_reason": "goal_reached", "steps": 3, "errors": 1, "no_progress": 0, "summary": "Prepared report for 2 recent orders.", "history": [{...}]}
=== SCENARIO 2: STOPPED BY ERROR LIMIT ===
Policy: {"max_steps": 6, "max_errors": 2, "max_no_progress": 3}
Run result: {"done": false, "stop_reason": "too_many_errors", "steps": 2, "errors": 2, "no_progress": 2, "summary": null, "history": [{...}]}
=== SCENARIO 3: STOPPED BY STEP LIMIT ===
Policy: {"max_steps": 1, "max_errors": 2, "max_no_progress": 3}
Run result: {"done": false, "stop_reason": "step_limit", "steps": 1, "errors": 0, "no_progress": 0, "summary": null, "history": [{...}]}
Note: in the output, history is intentionally shown in shortened form: "history": [{...}].
Correctness criterion for this example: the agent always ends the loop with an explicit stop_reason and never runs forever.
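One way to turn this criterion into an automated check is a small validator over the result dictionary. This is a minimal sketch, not code from the repository: check_controlled_stop and KNOWN_STOP_REASONS are hypothetical names, and the rules encode only what the example itself shows (a known stop_reason, done set only for goal_reached, and a summary present on success).

```python
from typing import Any

# Assumption: the four basic stop reasons listed in this example.
KNOWN_STOP_REASONS = {"goal_reached", "step_limit", "too_many_errors", "no_progress"}


def check_controlled_stop(result: dict[str, Any]) -> list[str]:
    """Return a list of violations; an empty list means the run ended in a controlled way."""
    problems: list[str] = []
    reason = result.get("stop_reason")
    if reason not in KNOWN_STOP_REASONS:
        problems.append(f"unexpected stop_reason: {reason!r}")
    if bool(result.get("done")) != (reason == "goal_reached"):
        problems.append("done must be true only when stop_reason is goal_reached")
    if result.get("done") and result.get("summary") is None:
        problems.append("a completed run must carry a summary")
    return problems


# Values copied from SCENARIO 2 in the sample output (history omitted).
scenario_2 = {
    "done": False,
    "stop_reason": "too_many_errors",
    "steps": 2,
    "errors": 2,
    "no_progress": 2,
    "summary": None,
}
print(check_controlled_stop(scenario_2))  # []
```

Running the same check over all three scenario results would flag a run that exits without a reason or claims success without a summary.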
What You See in Practice
| | SCENARIO 1 | SCENARIO 2 | SCENARIO 3 |
|---|---|---|---|
| Outcome | goal_reached | too_many_errors | step_limit |
| Task completed successfully | ✅ | ❌ | ❌ |
| Loop bounded by policy | ✅ | ✅ | ✅ |
| Explicit explanation of why it stopped | ✅ | ✅ | ✅ |
What to Change in This Example
- Add max_duration_sec and a stop on wall-clock timeout
- Add a separate budget for tool calls (max_tool_calls)
- Add stop_reason_details to record the cause more precisely
- Add a fourth scenario that triggers no_progress
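The first three changes could be sketched roughly as follows. This is one possible shape under stated assumptions, not the repository's implementation: ExtendedStopPolicy, evaluate_extended, and the reason names tool_call_limit and timeout are hypothetical, and the new checks are appended after the original four so that existing behavior stays the same. The now parameter is injectable so the timeout branch can be exercised without sleeping.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class ExtendedStopPolicy:
    max_steps: int
    max_errors: int
    max_no_progress: int
    max_tool_calls: int      # assumed: budget for executed tool calls
    max_duration_sec: float  # assumed: wall-clock budget for the whole run


def evaluate_extended(
    goal_reached: bool,
    steps: int,
    errors: int,
    no_progress: int,
    tool_calls: int,
    started_at: float,
    policy: ExtendedStopPolicy,
    now: Optional[float] = None,
) -> Optional[dict]:
    # started_at and now are expected to come from the same monotonic clock.
    now = time.monotonic() if now is None else now
    elapsed = now - started_at
    checks = [
        ("goal_reached", goal_reached, "goal state is present"),
        ("step_limit", steps >= policy.max_steps,
         f"steps={steps} max_steps={policy.max_steps}"),
        ("too_many_errors", errors >= policy.max_errors,
         f"errors={errors} max_errors={policy.max_errors}"),
        ("no_progress", no_progress >= policy.max_no_progress,
         f"no_progress={no_progress} max_no_progress={policy.max_no_progress}"),
        ("tool_call_limit", tool_calls >= policy.max_tool_calls,
         f"tool_calls={tool_calls} max_tool_calls={policy.max_tool_calls}"),
        ("timeout", elapsed >= policy.max_duration_sec,
         f"elapsed={elapsed:.2f}s max_duration_sec={policy.max_duration_sec}"),
    ]
    # The first triggered check wins, mirroring the original if-chain.
    for reason, triggered, details in checks:
        if triggered:
            return {"stop_reason": reason, "stop_reason_details": details}
    return None


policy = ExtendedStopPolicy(
    max_steps=6, max_errors=2, max_no_progress=3,
    max_tool_calls=4, max_duration_sec=5.0,
)
decision = evaluate_extended(
    goal_reached=False, steps=1, errors=0, no_progress=0, tool_calls=1,
    started_at=100.0, policy=policy, now=106.5,
)
print(decision["stop_reason"])  # timeout
```

In the agent loop, started_at would be captured once with time.monotonic() before the first step, and tool_calls would be incremented only when a tool is actually executed.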
Full Code on GitHub
The repository contains the full version of this demo: the agent loop, the stop-condition policy, and controlled termination.
View the full code on GitHub ↗