LLM agents vs workflows (production comparison) + code

  • Choose without getting fooled by the demo.
  • See what breaks in prod (ops, cost, drift).
  • Get a migration path + a checklist.
  • Leave with defaults: budgets, validation, stop reasons.
Agents are loops that make decisions. Workflows are deterministic (or near-deterministic) executions. In production, the question is: where do you put the uncertainty?
On this page
  1. The problem (production side)
  2. Quick decision (who picks what)
  3. Why teams choose wrong in production
  4. 1) They confuse “flexible” with “reliable”
  5. 2) They underestimate governance cost
  6. 3) They start with writes
  7. 4) Workflows fail loudly, agents fail quietly
  8. Comparison table
  9. Where it breaks in production
  10. Workflow breaks
  11. Agent breaks
  12. Implementation example (real code)
  13. Real incident (with numbers)
  14. Migration path (A → B)
  15. Workflow → Agent (safe-ish)
  16. Agent → Workflow (when you regret it)
  17. Decision guide
  18. Trade-offs
  19. When NOT to use it
  20. Checklist (copy-paste)
  21. Safe default config (YAML)
  22. FAQ
  23. Related pages

The problem (production side)

You have a task: “handle support tickets”, “triage alerts”, “enrich leads”, “review code”.

Someone suggests an agent. Someone else suggests a workflow.

In a demo, the agent wins. In production, the winner is usually: the thing you can operate.

The most expensive mistake we see is choosing an agent when you needed a workflow, and then adding governance until it’s basically a workflow anyway — except now it’s nondeterministic.

Quick decision (who picks what)

  • Pick a workflow when you can define steps, inputs, and success conditions. You’ll ship faster and sleep better.
  • Pick an agent when the environment is messy (unknown docs, noisy tools) and you can’t enumerate all paths — but only if you’re willing to add budgets, permissions, and monitoring.
  • If you’re not ready to build a control layer, don’t pick an agent. Pick a workflow.

Why teams choose wrong in production

1) They confuse “flexible” with “reliable”

Agents are flexible. Reliability comes from:

  • budgets
  • validations
  • idempotency
  • approvals
  • monitoring

Without those, agents are flexible at creating incidents.

2) They underestimate governance cost

The first time an agent loops, you add step limits. The first time it spams a tool, you add tool budgets. The first time it writes incorrectly, you add approvals.

At that point, you’ve built a workflow… but with extra variance.

3) They start with writes

Agents with write tools in week one are a predictable failure. Start read-only.
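One way to enforce “start read-only” is to route every tool call through a dispatcher that denies writes unless an approval callback signs off. A minimal sketch; the tool names and the `approve` callback are illustrative, not from any specific library:

```python
# Hypothetical sketch: reads are always allowed, writes need explicit approval.
READ_TOOLS = {"kb.read", "search.read"}
WRITE_TOOLS = {"crm.update", "ticket.close"}  # illustrative names

def call_tool(tool: str, args: dict, *, approve=None) -> dict:
    """Dispatch a tool call; write tools require an approval callback that returns True."""
    if tool in READ_TOOLS:
        return {"tool": tool, "status": "ok"}  # read path: always allowed
    if tool in WRITE_TOOLS:
        if approve is None or not approve(tool, args):
            return {"tool": tool, "status": "needs_approval"}  # parked, not executed
        return {"tool": tool, "status": "ok"}
    return {"tool": tool, "status": "denied"}  # unknown tool: fail closed
```

In week one, ship with `approve=None` so every write is parked; wire in a human approval queue later without changing call sites.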

4) Workflows fail loudly, agents fail quietly

Workflow failure: a step errors. Agent failure: it “kind of works” but gets slower, costlier, and weirder.

That’s drift. Drift is a production problem.

Comparison table

| Criteria | Workflow | LLM Agent | What matters in prod |
|---|---|---|---|
| Determinism | High | Low/medium | Debuggability, replay |
| Failure handling | Explicit | Emergent unless designed | Prevent thrash, stop reasons |
| Observability | Straightforward | Requires intentional tracing | “What did it do?” |
| Cost control | Predictable | Needs budgets + gating | No finance surprises |
| Change safety | Standard deploy | Drift-prone | Canary, golden tasks |
| Best for | Known paths | Unknown paths | Match system to reality |

Where it breaks in production

The failure modes differ:

Workflow breaks

  • a step fails (timeout, 500)
  • a queue backs up
  • a schema changes

Fixes are mostly deterministic: retry policy, backoff, idempotency, rollbacks.
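Those deterministic fixes can be sketched as two small helpers: a retry wrapper with exponential backoff, and a dedupe-by-key guard so a retried step never applies twice. A sketch, not a full retry library; the key format is illustrative:

```python
import time

def retry_with_backoff(fn, *, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Retry a step with exponential backoff; re-raise after the last attempt."""
    for i in range(attempts):
        try:
            return fn()
        except Exception:
            if i == attempts - 1:
                raise
            sleep(base_delay * (2 ** i))  # 0.1s, 0.2s, 0.4s, ...

# Idempotency: remember results by key so a replayed step is a no-op.
_applied: dict[str, object] = {}

def apply_once(key: str, fn):
    """Run fn at most once per key; later calls return the cached result."""
    if key not in _applied:
        _applied[key] = fn()
    return _applied[key]
```

In a real system the `_applied` store lives in a database with the same transactional scope as the side effect, not in process memory.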

Agent breaks

  • tool spam loops (search thrash)
  • partial outages amplify (retries in loops)
  • prompt injection steers tool calls
  • token overuse truncates policy
  • silent drift changes behavior

Agents break like control systems, because they are control systems.
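The first failure mode above (tool-spam loops) is detectable at the gateway: the same call repeated past a threshold is thrash, regardless of what the model “intends”. A minimal sketch, assuming tool args are hashable once sorted:

```python
from collections import Counter

class ThrashGuard:
    """Flag search-thrash: the same (tool, args) pair repeated past a threshold."""

    def __init__(self, max_repeats: int = 3):
        self.max_repeats = max_repeats
        self.seen: Counter = Counter()

    def check(self, tool: str, args: dict) -> bool:
        """Return True if this call is allowed, False once it becomes thrash."""
        key = (tool, tuple(sorted(args.items())))
        self.seen[key] += 1
        return self.seen[key] <= self.max_repeats
```

Wire `check` into the tool gateway and turn a `False` into a stop reason like `"tool_thrash"` so the loop ends loudly instead of burning budget.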

Implementation example (real code)

The “agent vs workflow” decision isn’t about libraries. It’s about boundaries.

Here’s a minimal boundary you can use for either:

  • tool gateway with allowlist
  • budgets (steps/tool calls/time)
  • stop reasons
PYTHON
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Budgets:
    max_steps: int = 25
    max_tool_calls: int = 12


class Stop(RuntimeError):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


class ToolGateway:
    def __init__(self, *, allow: set[str]):
        self.allow = allow
        self.calls = 0

    def call(self, tool: str, args: dict[str, Any], *, budgets: Budgets) -> Any:
        self.calls += 1
        if self.calls > budgets.max_tool_calls:
            raise Stop("max_tool_calls")
        if tool not in self.allow:
            raise Stop(f"tool_denied:{tool}")
        return tool_impl(tool, args=args)  # (pseudo)


def workflow(task: str, *, budgets: Budgets) -> dict[str, Any]:
    tools = ToolGateway(allow={"kb.read"})
    try:
        doc = tools.call("kb.read", {"q": task}, budgets=budgets)
        return {"status": "ok", "answer": summarize(doc)}  # (pseudo)
    except Stop as e:
        return {"status": "stopped", "stop_reason": e.reason}


def agent(task: str, *, budgets: Budgets) -> dict[str, Any]:
    tools = ToolGateway(allow={"search.read", "kb.read", "http.get"})
    try:
        for _ in range(budgets.max_steps):
            action = llm_decide(task)  # (pseudo)
            if action.kind == "final":
                return {"status": "ok", "answer": action.final_answer}
            obs = tools.call(action.name, action.args, budgets=budgets)
            task = update(task, action, obs)  # (pseudo)
        return {"status": "stopped", "stop_reason": "max_steps"}
    except Stop as e:
        return {"status": "stopped", "stop_reason": e.reason}
JAVASCRIPT
export class Stop extends Error {
  constructor(reason) {
    super(reason);
    this.reason = reason;
  }
}

export class ToolGateway {
  constructor({ allow = [] } = {}) {
    this.allow = new Set(allow);
    this.calls = 0;
  }

  call(tool, args, { budgets }) {
    this.calls += 1;
    if (this.calls > budgets.maxToolCalls) throw new Stop("max_tool_calls");
    if (!this.allow.has(tool)) throw new Stop("tool_denied:" + tool);
    return toolImpl(tool, { args }); // (pseudo)
  }
}

Real incident (with numbers)

We saw a team replace a simple workflow with an agent “for flexibility”.

The workflow had fixed steps and predictable costs. The agent started calling search + browser tools because “maybe it helps”.

Impact in the first week:

  • p95 latency: 1.9s → 9.7s
  • spend: +$640 vs baseline
  • and the worst part: incidents were harder to debug because behavior wasn’t deterministic

Fix:

  1. they moved 80% of the task back into a workflow
  2. the agent became a bounded “investigation step” behind strict budgets
  3. writes required approval

In production, hybrid usually wins: workflow for the known path, agent for the messy corner.
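That hybrid shape is simple to express: try the deterministic path first, and only fall back to a budget-bounded investigation step when the known path has no answer. A hypothetical sketch; `known_path`, `bounded_investigation`, and the category names are illustrative:

```python
def known_path(ticket: dict):
    """Deterministic lookup; returns None when the ticket doesn't match known cases."""
    answers = {"password_reset": "Use the reset link.", "billing": "See the invoice page."}
    return answers.get(ticket.get("category"))

def bounded_investigation(ticket: dict, *, max_steps: int = 5) -> dict:
    """Stand-in for an agent loop with a hard step budget (illustrative)."""
    return {"status": "stopped", "stop_reason": "max_steps"}

def handle(ticket: dict) -> dict:
    """Workflow for the known path; bounded agent only for the messy corner."""
    answer = known_path(ticket)
    if answer is not None:
        return {"status": "ok", "path": "workflow", "answer": answer}
    result = bounded_investigation(ticket)
    result["path"] = "agent"
    return result
```

The `path` field in the result is the key operational detail: it lets you monitor what fraction of traffic actually needs the agent, which is exactly what the team above used to move 80% back into the workflow.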

Migration path (A → B)

Workflow → Agent (safe-ish)

  1. keep the workflow as the default path
  2. add an agent only for ambiguous sub-tasks (bounded)
  3. enforce budgets + permissions + monitoring first
  4. canary rollout + golden tasks to catch drift
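Step 4 (golden tasks) is just a fixed set of inputs with known-good answers, run against every candidate change. A sketch under stated assumptions: the task inputs, expected labels, and the `variant` callable are all illustrative:

```python
# Hypothetical golden-task gate: fail the canary if agreement drops.
GOLDEN_TASKS = [
    {"input": "reset password", "expected": "reset_link"},
    {"input": "refund status", "expected": "billing_lookup"},
]

def run_golden(variant, tasks=GOLDEN_TASKS, min_pass_rate=1.0) -> bool:
    """Run fixed tasks through a candidate variant; True means safe to roll forward."""
    passed = sum(1 for t in tasks if variant(t["input"]) == t["expected"])
    return passed / len(tasks) >= min_pass_rate
```

Run this on every model, prompt, or tool change before widening the canary; a failing golden set is drift caught before users see it.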

Agent → Workflow (when you regret it)

  1. log traces and identify the common path
  2. codify common path as deterministic steps
  3. keep the agent only for exceptions
  4. delete “agent as default” once confidence is high
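Step 1 (identify the common path from traces) can be as simple as counting action sequences across runs; the most frequent sequences are your workflow candidates. A sketch assuming each trace is a list of tool names ending in a terminal action:

```python
from collections import Counter

def common_paths(traces: list[list[str]], top: int = 3):
    """Count action sequences across runs; frequent ones are workflow candidates."""
    counts = Counter(tuple(t) for t in traces)
    return counts.most_common(top)
```

If one sequence covers most runs, codify it as deterministic steps and keep the agent only for the long tail.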

Decision guide

  • If you can write a state machine for it → pick a workflow.
  • If you can’t, but the cost of being wrong is low → bounded agent might work.
  • If the cost of being wrong is high → workflow + approvals, or don’t automate.
  • If you can’t afford monitoring and governance → don’t ship an agent.
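The guide above is mechanical enough to encode as a function; the flag names and return labels are illustrative, and real decisions deserve more nuance than three booleans:

```python
def choose(can_enumerate_steps: bool, wrong_cost_high: bool, can_govern: bool) -> str:
    """Encode the decision guide above (labels are illustrative)."""
    if can_enumerate_steps:
        return "workflow"  # state machine exists -> ship it
    if wrong_cost_high:
        return "workflow_plus_approvals_or_no_automation"
    if not can_govern:
        return "workflow"  # no monitoring/governance budget -> don't ship an agent
    return "bounded_agent"
```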

Trade-offs

  • Workflows are less flexible.
  • Agents require governance to be safe.
  • Hybrid systems add complexity, but often reduce incident rate.

When NOT to use it

  • Don’t use agents for irreversible writes without approvals.
  • Don’t use agents when success conditions are crisp and steps are known.
  • Don’t use workflows when the input space is too open-ended (you’ll just rebuild an agent poorly).

Checklist (copy-paste)

  • [ ] Can you enumerate steps? If yes, start with a workflow.
  • [ ] If you use an agent, add budgets + tool gateway first.
  • [ ] Start read-only; gate writes behind approvals.
  • [ ] Return stop reasons; don’t timeout silently.
  • [ ] Monitor tokens, tool calls, latency, stop reasons.
  • [ ] Canary changes to models/prompts/tools; expect drift.

Safe default config (YAML)

YAML
mode:
  default: "workflow"
  agent_for_exceptions: true
budgets:
  max_steps: 25
  max_tool_calls: 12
  max_seconds: 60
tools:
  allow: ["kb.read", "search.read", "http.get"]
writes:
  require_approval: true
monitoring:
  track: ["tool_calls_per_run", "tokens_per_request", "latency_p95", "stop_reason"]
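A config like this is only safe if something refuses to boot when it is weakened. A hypothetical validator over the parsed config (shown as a dict; `yaml.safe_load` on the file above would produce the same shape); the rules mirror the YAML sketch:

```python
# Hypothetical startup check: return problems, empty list means safe to ship.
REQUIRED_BUDGETS = {"max_steps", "max_tool_calls", "max_seconds"}

def validate_config(cfg: dict) -> list[str]:
    """Reject configs with missing budgets, open writes, or an empty allowlist."""
    problems = []
    missing = REQUIRED_BUDGETS - set(cfg.get("budgets", {}))
    if missing:
        problems.append(f"missing budgets: {sorted(missing)}")
    if not cfg.get("writes", {}).get("require_approval", False):
        problems.append("writes.require_approval should be true by default")
    if not cfg.get("tools", {}).get("allow"):
        problems.append("tools.allow must not be empty")
    return problems
```

Fail deployment on a non-empty list; a silently weakened config is the same class of bug as a silently drifting agent.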

FAQ

Q: Can we use an agent without a tool gateway?
A: If there are no tools and no side effects, maybe. The moment tools exist, you need a gateway for policy and budgets.

Q: What’s the safest hybrid?
A: Workflow for the common path, bounded agent for investigations, approvals for writes.

Q: Why do agents drift more?
A: Model/prompt/tool changes shift decisions. Without golden tasks and canaries, regressions ship quietly.

Q: What’s the first metric to watch?
A: Tool calls/run. It moves before correctness complaints and before invoices.


Related pages

Not sure this is your case?

Design your agent ->
⏱️ 8 min read · Updated March 2026 · Difficulty: ★★☆
Embedded: production control (OnceOnly)
Add guardrails to tool-calling agents
Ship this pattern with governance:
  • Budgets (steps / cost caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & deduplication
  • Audit logs & traceability
Embedded mention: OnceOnly is a control layer for agent systems in production.
Author

This documentation is curated and maintained by engineers who deploy AI agents in production.

The content is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

The patterns and recommendations draw on post-mortems, failure modes, and operational incidents in deployed systems, notably from building and operating governance infrastructure for agents at OnceOnly.