CrewAI vs LangGraph (production comparison) + code

  • Choose without getting trapped by the demo.
  • See what breaks in production (ops, cost, drift).
  • Get a migration path + a checklist.
  • Leave with defaults: budgets, validation, stop reasons.
CrewAI targets role-based multi-agent orchestration. LangGraph targets explicit state machines. Here is what breaks in production, which one operates better, and how to migrate cleanly.
On this page
  1. The problem (production side)
  2. Quick decision (who picks what)
  3. Why teams choose wrong in production
  4. 1) They pick based on "demo vibes"
  5. 2) They confuse "graph" with "safe"
  6. 3) They don't define state
  7. Comparison table
  8. Where it breaks in production
  9. CrewAI-style multi-agent breaks
  10. LangGraph-style flow breaks
  11. Implementation example (real code)
  12. Real incident (with numbers)
  13. Migration path (A → B)
  14. CrewAI → LangGraph (common path)
  15. LangGraph → CrewAI (when roles matter)
  16. Decision guide
  17. Trade-offs
  18. When NOT to use it
  19. Checklist (copy-paste)
  20. Safe default config (JSON/YAML)
  21. FAQ
  22. Related pages

The problem (production side)

You want to ship an agent system that does real work, not a weekend demo.

Someone on the team says: “Let’s do multi-agent with CrewAI.” Someone else says: “We should use LangGraph; graphs are easier to reason about.”

Both can work. Both can also produce the same outcome in production: a slow, expensive, hard-to-debug system if you don’t build a control layer.

The question isn't "which is cooler". The question is: which one makes failure modes obvious and governable?

Quick decision (who picks what)

  • Pick CrewAI if you explicitly want role-based multi-agent collaboration and you can invest in orchestration + monitoring to prevent deadlocks/thrash.
  • Pick LangGraph if you want explicit state + deterministic-ish transitions you can test, replay, and roll back without guessing what the model “meant”.
  • If you don’t have strong budgets/permissions/monitoring yet, LangGraph-style explicit flow usually hurts less.

Why teams choose wrong in production

1) They pick based on “demo vibes”

Multi-agent role play looks impressive. It also adds:

  • coordination overhead
  • waiting states
  • circular dependencies
  • more tool calls

If you’re not ready to instrument it, it’ll fail quietly.

2) They confuse “graph” with “safe”

A graph is not governance. It’s a place to put governance.

You still need:

  • budgets
  • permissions
  • validation
  • approvals for writes
  • stop reasons

3) They don’t define state

If you can’t write down:

  • current state
  • allowed transitions
  • stop conditions

…your system will drift into “agent chooses everything”, which is just a fancy way to say “debugging is vibes”.
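The bar here is low: if the three bullets above fit in a dozen lines of data, you have a system you can debug. A minimal sketch (the state names are hypothetical, not tied to any framework):

```python
# Explicit state definition as data: states, allowed transitions, stop
# conditions. All names below are illustrative.
STATES = {"triage", "search", "draft", "review", "done", "stopped"}

ALLOWED = {
    "triage": {"search", "stopped"},
    "search": {"draft", "stopped"},
    "draft": {"review", "stopped"},
    "review": {"draft", "done", "stopped"},
}

STOP_REASONS = ("max_steps", "max_tool_calls", "max_seconds", "validation_failed")


def transition(current: str, nxt: str) -> str:
    """Refuse any transition that was not written down."""
    if nxt not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal_transition:{current}->{nxt}")
    return nxt
```

Once transitions are data, "the agent chose everything" becomes impossible by construction: anything not in the table raises instead of drifting.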

Comparison table

| Criterion | CrewAI | LangGraph | What matters in prod |
|---|---|---|---|
| Primary abstraction | Roles + collaboration | State + transitions | Debuggability |
| Determinism | Lower | Higher | Replay + tests |
| Failure handling | Emergent unless designed | Easier to encode | Stop reasons |
| Observability | You must add it | You must add it | "What did it do?" |
| Loop/Deadlock risk | Higher | Medium | On-call load |
| Migration friendliness | Medium | High | Canaries/rollback |

Where it breaks in production

CrewAI-style multi-agent breaks

  • agents wait on each other (deadlocks)
  • roles “disagree” and loop
  • more context passed around → token overuse
  • tool spam (agents “helpfully” re-search)

LangGraph-style flow breaks

  • state machine grows complex
  • devs cram “just let the model decide” nodes everywhere
  • missing validation on edges turns graphs into “unsafe pipes”

The common failure is the same: missing governance.
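The "unsafe pipes" failure above is cheap to prevent: validate a node's output before it crosses an edge. A framework-agnostic sketch (the schema and the example edge are assumptions, not any library's API):

```python
# Edge validation: refuse to pass malformed output to the next node.
# The required-field schema below is illustrative.
def validate_edge(output: dict, required: dict[str, type]) -> dict:
    for key, typ in required.items():
        if key not in output:
            raise ValueError(f"edge_validation:missing:{key}")
        if not isinstance(output[key], typ):
            raise ValueError(f"edge_validation:wrong_type:{key}")
    return output


# Example: a search -> draft edge requires a query string and a list of sources.
out = validate_edge(
    {"query": "refund policy", "sources": ["kb:123"]},
    {"query": str, "sources": list},
)
```

The point is the failure mode: a missing field raises with a greppable reason at the edge where it happened, instead of surfacing three nodes later as a confusing model output.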

Implementation example (real code)

The production trick is to separate:

  1. your orchestration framework
  2. your control layer (which should survive framework changes)

This is a framework-agnostic tool gateway + budget guard you can wrap around either approach.

PYTHON
from dataclasses import dataclass
from typing import Any, Callable
import time


@dataclass(frozen=True)
class Budgets:
    max_steps: int = 40
    max_tool_calls: int = 20
    max_seconds: int = 120


class Stop(RuntimeError):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


class ToolGateway:
    def __init__(self, *, allow: set[str], impls: dict[str, Callable[..., Any]]):
        self.allow = allow
        self.impls = impls
        self.calls = 0

    def call(self, tool: str, args: dict[str, Any], *, budgets: Budgets) -> Any:
        self.calls += 1
        if self.calls > budgets.max_tool_calls:
            raise Stop("max_tool_calls")
        if tool not in self.allow:
            raise Stop(f"tool_denied:{tool}")
        fn = self.impls.get(tool)
        if not fn:
            raise Stop(f"tool_missing:{tool}")
        return fn(**args)


def run_framework(
    orchestration_fn: Callable[..., dict[str, Any]],
    *,
    budgets: Budgets,
    tools: ToolGateway,
) -> dict[str, Any]:
    started = time.time()
    for step in range(budgets.max_steps):
        if time.time() - started > budgets.max_seconds:
            return {"status": "stopped", "stop_reason": "max_seconds"}
        try:
            # orchestration_fn must call tools via ToolGateway only.
            out = orchestration_fn(step=step, tools=tools)  # (pseudo)
            if out.get("done"):
                return {"status": "ok", "result": out.get("result")}
        except Stop as e:
            return {"status": "stopped", "stop_reason": e.reason}
    return {"status": "stopped", "stop_reason": "max_steps"}
JAVASCRIPT
export class Stop extends Error {
  constructor(reason) {
    super(reason);
    this.reason = reason;
  }
}

export class ToolGateway {
  constructor({ allow = [], impls = {} } = {}) {
    this.allow = new Set(allow);
    this.impls = impls;
    this.calls = 0;
  }

  call(tool, args, { budgets }) {
    this.calls += 1;
    if (this.calls > budgets.maxToolCalls) throw new Stop("max_tool_calls");
    if (!this.allow.has(tool)) throw new Stop("tool_denied:" + tool);
    const fn = this.impls[tool];
    if (!fn) throw new Stop("tool_missing:" + tool);
    return fn(args);
  }
}

export function runFramework(orchestrationFn, { budgets, tools }) {
  const started = Date.now();
  for (let step = 0; step < budgets.maxSteps; step++) {
    if ((Date.now() - started) / 1000 > budgets.maxSeconds) {
      return { status: "stopped", stop_reason: "max_seconds" };
    }
    try {
      const out = orchestrationFn({ step, tools }); // (pseudo)
      if (out && out.done) return { status: "ok", result: out.result };
    } catch (e) {
      if (e instanceof Stop) return { status: "stopped", stop_reason: e.reason };
      throw e;
    }
  }
  return { status: "stopped", stop_reason: "max_steps" };
}
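To make the wiring concrete, here is a condensed, self-contained usage sketch of the Python gateway above (the `search.read` tool and its fake implementation are assumptions for the example; in real use the impls would be your actual tool adapters):

```python
# Condensed from the block above so this snippet runs standalone.
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Budgets:
    max_steps: int = 5
    max_tool_calls: int = 3
    max_seconds: int = 30


class Stop(RuntimeError):
    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


class ToolGateway:
    def __init__(self, *, allow: set[str], impls: dict[str, Callable[..., Any]]):
        self.allow, self.impls, self.calls = allow, impls, 0

    def call(self, tool: str, args: dict[str, Any], *, budgets: Budgets) -> Any:
        self.calls += 1
        if self.calls > budgets.max_tool_calls:
            raise Stop("max_tool_calls")
        if tool not in self.allow:
            raise Stop(f"tool_denied:{tool}")
        fn = self.impls.get(tool)
        if fn is None:
            raise Stop(f"tool_missing:{tool}")
        return fn(**args)


budgets = Budgets()
gateway = ToolGateway(
    allow={"search.read"},
    impls={"search.read": lambda query: [f"result for {query}"]},
)

# Allowed tool: goes through, counted against the budget.
hits = gateway.call("search.read", {"query": "refund policy"}, budgets=budgets)

# Unlisted tool: stopped with a greppable reason, default-deny.
try:
    gateway.call("db.write", {"row": {}}, budgets=budgets)
except Stop as e:
    print(e.reason)  # tool_denied:db.write
```

Whether the caller is a CrewAI crew or a LangGraph node, it only ever sees `gateway.call(...)`; that is what makes the governance survive a framework swap.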

Real incident (with numbers)

We saw a multi-agent system shipped for “support triage”. It was role-based, and it looked great in a demo.

In production:

  • one role started “double-checking” by re-searching
  • another role waited for the first role’s output

Impact over a day:

  • tool calls/run: 6 → 24
  • p95 latency: 4.1s → 21.6s
  • spend: +$530 vs baseline
  • on-call time: ~2 hours to identify that the issue was “agent coordination”, not an external outage

Fix:

  1. explicit step limits + repeat detection
  2. tool gateway dedupe for repeated search calls
  3. degrade mode during search instability

The framework wasn’t the villain. Lack of control was.
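Fix #2 above (dedupe for repeated search calls) is small enough to sketch. The cache key and the policy below are assumptions; in practice you would add a per-tool TTL and only dedupe read-only tools:

```python
# Repeat detection / tool dedupe sketch (framework-agnostic, read-only tools).
import json


class DedupeCache:
    """Serve the cached result when the exact same tool call repeats."""

    def __init__(self):
        self._seen: dict[str, object] = {}
        self.hits = 0  # how many repeats were absorbed

    def call(self, tool: str, args: dict, fn) -> object:
        key = tool + ":" + json.dumps(args, sort_keys=True)
        if key in self._seen:
            self.hits += 1          # a repeat: no new tool call, no new tokens
            return self._seen[key]
        result = fn(**args)
        self._seen[key] = result
        return result


cache = DedupeCache()
search = lambda query: [f"result for {query}"]

first = cache.call("search.read", {"query": "refund"}, search)
again = cache.call("search.read", {"query": "refund"}, search)  # deduped
```

In the incident above, this one wrapper would have collapsed the "helpful re-search" loop back toward the 6 calls/run baseline, and `hits` gives you a metric to alert on.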

Migration path (A → B)

CrewAI → LangGraph (common path)

  1. log real runs and identify the “happy path”
  2. encode that path as explicit graph states
  3. keep a bounded “agentic” branch for edge cases
  4. keep the same tool gateway + budgets (don’t rewrite governance)
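Steps 2 and 3 above can be sketched in plain Python (node names and handlers are illustrative; this is the shape of the idea, not the LangGraph API):

```python
# Happy path as explicit states, plus one bounded "agentic" branch for
# edge cases. Everything below is illustrative.
def triage(ctx):  return ("search", ctx)
def search(ctx):  return ("draft", {**ctx, "sources": ["kb:123"]})
def draft(ctx):   return ("done", {**ctx, "answer": "draft answer"})


def agentic_fallback(ctx, max_attempts=3):
    """Bounded escape hatch: let the model try, but never unbounded."""
    for _ in range(max_attempts):
        ...  # model-driven attempt would go here
    return ("stopped", {**ctx, "stop_reason": "fallback_exhausted"})


NODES = {"triage": triage, "search": search, "draft": draft}


def run(ctx, start="triage", max_steps=10):
    state = start
    for _ in range(max_steps):
        if state in ("done", "stopped"):
            return state, ctx
        handler = NODES.get(state)
        if handler is None:               # off the happy path -> bounded branch
            state, ctx = agentic_fallback(ctx)
            continue
        state, ctx = handler(ctx)
    return "stopped", {**ctx, "stop_reason": "max_steps"}


final_state, final_ctx = run({"ticket": "t-1"})
```

The migration payoff: the happy path is now testable and replayable, and the only nondeterministic part is a branch with its own step cap and stop reason.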

LangGraph → CrewAI (when roles matter)

  1. keep the graph as the orchestrator
  2. swap specific nodes to call “role agents”
  3. enforce budgets and stop reasons at the outer loop

Decision guide

  • If you need explicit state and replay → pick LangGraph-style graphs.
  • If you need collaboration patterns (reviewer/critic/planner) → CrewAI can fit, but budget it hard.
  • If you’re early and under-instrumented → pick the approach that’s easiest to test and trace.

Trade-offs

  • Multi-agent can improve quality on complex tasks, but increases coordination failures.
  • Graphs improve debuggability, but the state machine becomes real code you must maintain.
  • Either way, the control layer is non-optional in production.

When NOT to use it

  • Don’t ship multi-agent without timeouts, leases, and stop reasons.
  • Don’t build graphs that “just call the model to decide everything” — you lose the point of a graph.
  • Don’t pick a framework first. Pick the failure modes you can tolerate.
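Of the first bullet, the lease is the piece teams most often skip. A minimal single-process sketch (names are illustrative; in production this would live in a shared store such as Redis or a database):

```python
# Lease sketch: a role holds a task lease with an expiry, so a stuck
# agent cannot block the others forever.
import time


class Lease:
    def __init__(self, holder: str, ttl_seconds: float):
        self.holder = holder
        self.expires_at = time.monotonic() + ttl_seconds

    def expired(self) -> bool:
        return time.monotonic() >= self.expires_at


class TaskLeases:
    def __init__(self):
        self._leases: dict[str, Lease] = {}

    def acquire(self, task_id: str, holder: str, ttl_seconds: float = 30.0) -> bool:
        lease = self._leases.get(task_id)
        if lease and not lease.expired():
            return False                    # another role still holds it
        self._leases[task_id] = Lease(holder, ttl_seconds)
        return True


leases = TaskLeases()
leases.acquire("ticket-1", "researcher")          # researcher takes the task
blocked = leases.acquire("ticket-1", "writer")    # writer must wait: False
leases.acquire("ticket-2", "writer", ttl_seconds=0.01)
time.sleep(0.02)
takeover = leases.acquire("ticket-2", "reviewer")  # expired lease: True
```

Deadlocks become timeouts, and timeouts become stop reasons you can page on.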

Checklist (copy-paste)

  • [ ] Keep governance framework-agnostic (budgets + tool gateway)
  • [ ] Add stop reasons and surface them to users
  • [ ] Add repeat detection + tool dedupe
  • [ ] Start read-only; gate writes behind approvals
  • [ ] Canary changes; expect drift
  • [ ] Test replay on golden tasks

Safe default config (JSON/YAML)

YAML
budgets:
  max_steps: 40
  max_tool_calls: 20
  max_seconds: 120
tools:
  allow: ["search.read", "kb.read", "http.get"]
writes:
  require_approval: true
monitoring:
  track: ["tool_calls_per_run", "latency_p95", "stop_reason"]
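The YAML above maps one-to-one to JSON. A sketch of loading the JSON form into a `Budgets`-style dataclass with only the standard library (the dataclass mirrors the one in the implementation section):

```python
import json
from dataclasses import dataclass

# Same content as the YAML config above, in JSON form.
CONFIG = json.loads("""
{
  "budgets": {"max_steps": 40, "max_tool_calls": 20, "max_seconds": 120},
  "tools": {"allow": ["search.read", "kb.read", "http.get"]},
  "writes": {"require_approval": true},
  "monitoring": {"track": ["tool_calls_per_run", "latency_p95", "stop_reason"]}
}
""")


@dataclass(frozen=True)
class Budgets:
    max_steps: int
    max_tool_calls: int
    max_seconds: int


budgets = Budgets(**CONFIG["budgets"])
allowlist = set(CONFIG["tools"]["allow"])
```

Keeping the config outside the code is what lets you tighten budgets during an incident without a deploy.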

FAQ

Q: Is multi-agent always better?
A: No. It can improve quality, but it increases coordination failures. You pay for it in observability and governance.

Q: Are graphs only for workflows?
A: No. Graphs can orchestrate agents too. The value is explicit state and testability.

Q: What's the first guardrail to add?
A: Budgets (steps/tool calls/time) and a tool gateway with a default-deny allowlist.

Q: Can we migrate without rewriting everything?
A: Yes, if you keep governance outside the framework: budget guard + tool gateway + logging.


Related pages

⏱️ 8 min read · Updated March 2026 · Difficulty: ★★☆
Author

This documentation is curated and maintained by engineers who deploy AI agents in production.

The content is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

The patterns and recommendations draw on post-mortems, failure modes, and operational incidents in deployed systems, including from building and operating governance infrastructure for agents at OnceOnly.