Kill Switch for AI Agents: how to emergency-stop actions without a release

Practical kill switch in production: global/per-tenant stop, writes-disable mode, stop reasons, audit trail, and a short runbook.
On this page
  1. Idea in 30 seconds
  2. Problem
  3. Solution
  4. Kill switch ≠ full governance system
  5. Kill-switch control envelope
  6. How it looks in architecture
  7. Example
  8. In code it looks like this
  9. How it looks during execution
  10. Scenario 1: global stop
  11. Scenario 2: tenant stop
  12. Scenario 3: writes disabled
  13. Common mistakes
  14. Self-check
  15. FAQ
  16. Where Kill Switch fits in the system
  17. Related pages

Idea in 30 seconds

A kill switch is an emergency runtime control that lets you instantly stop new agent actions during an incident, without a release and without changing the prompt.

When you need it:
when an agent can perform write actions, works with external APIs, and an error is already escalating into a production incident.

Problem

When an agent starts taking harmful actions, there is usually no time to "tune the prompt and ship a release". While the team analyzes the issue, the agent may keep repeating the same actions, and every minute of delay means more side effects in production.

Typical pattern:

  • duplicate emails or messages
  • mass create/update operations
  • excessive calls to external APIs

Analogy: it is like an emergency stop button on a production line. During a failure, first you stop movement, then you investigate the cause.

If the kill switch works only in the UI, it is not an emergency control. The real stop must happen in the runtime loop and in the tool gateway.

Solution

The solution is to add a centralized kill-switch policy layer that is checked after the next action is formed, but before it is executed. The policy returns allow or stop, and every stop carries an explicit reason: killed_global, killed_tenant, writes_disabled, or tool_disabled.

Baseline model:

  • global kill: emergency-stops everyone
  • tenant kill: stops a specific customer
  • writes disabled: allows read, blocks write
  • tool disabled: blocks one specific tool

This is a separate emergency control layer, not part of prompt or UI logic.
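The policy check described above can be sketched as a small pure function. This is a minimal illustration, not a reference implementation: the `KillState` and `Decision` classes, their field names, the widest-to-narrowest check order, and the scope assigned to each stop are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KillState:
    """Snapshot of kill flags for one tenant (field names are illustrative)."""
    global_kill: bool = False
    tenant_kill: bool = False
    writes_disabled: bool = False
    disabled_tools: frozenset = frozenset()


@dataclass(frozen=True)
class Decision:
    outcome: str                  # "allow" or "stop"
    reason: Optional[str] = None  # killed_global, killed_tenant, ...
    scope: Optional[str] = None   # assumed values: "global", "tenant", "tool"


def check(state: KillState, tool_name: str, is_write: bool) -> Decision:
    """Return the first matching stop, checked from widest to narrowest scope."""
    if state.global_kill:
        return Decision("stop", "killed_global", "global")
    if state.tenant_kill:
        return Decision("stop", "killed_tenant", "tenant")
    if tool_name in state.disabled_tools:
        return Decision("stop", "tool_disabled", "tool")
    if is_write and state.writes_disabled:
        # Assumed to be tenant-scoped here; it could equally be a global mode.
        return Decision("stop", "writes_disabled", "tenant")
    return Decision("allow")
```

Keeping the check free of I/O means it stays cheap to call before every action and easy to unit-test for each stop reason.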

Kill switch ≠ full governance system

Kill switch and governance solve different tasks:

  • Kill switch stops the incident "here and now"
  • Governance controls agent behavior continuously (RBAC, limits, budgets, approval)

One without the other is not enough:

  • without kill switch, it is hard to stop an incident immediately
  • without governance, incidents happen too often

Kill-switch control envelope

These checks work together as an emergency control envelope in runtime.

| Component | What it controls | Key mechanics | Why |
| --- | --- | --- | --- |
| Global stop | Stopping all runs | global_kill=true; stop before next action | Quickly stops a widespread incident |
| Tenant stop | Stopping within one tenant | tenant_kill=true; tenant-scoped flag | Localizes the issue without global outage |
| Writes disabled mode | Blocking write actions | write tool policy; read-only fallback | Enables safe degradation instead of full stop |
| Tool disable list | Targeted tool blocking | tool_disabled[]; incident mode rules | Disables a problematic tool without stopping all runs |
| Operator observability | Visibility of operator actions and blocks | audit logs; actor + reason + scope | Makes it clear who activated stop and why |

How it looks in architecture

The kill-switch policy layer sits in the runtime loop between planning and next-action execution. Every decision (allow or stop) is recorded in the audit log.

Each step passes through this flow before execution: the runtime does not execute actions directly; it passes each planned action to the policy layer first.

Flow summary:

  • Runtime forms next action
  • Policy reads global/tenant flags + writes/tool rules
  • allow -> next agent action is executed
  • stop -> run is stopped with explicit reason (killed_global, killed_tenant, writes_disabled, tool_disabled)
  • decision is written to audit log

Example

A support agent started calling email.send in bulk due to a faulty scenario. The operator enables writes_disabled for the affected tenant.

Result:

  • new write actions are blocked immediately
  • read actions can remain available
  • logs contain who/when/why for each block

Kill switch stops the incident directly in runtime loop instead of waiting for a new release.
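The operator side of this example can be sketched as follows. The sketch assumes an in-memory flag dictionary and audit list standing in for a real flag store and audit sink; the function name `set_writes_disabled`, the key layout, and the record fields are hypothetical, chosen only to show that the flag flip and the who/when/why record are written together.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, List


@dataclass(frozen=True)
class OperatorAuditRecord:
    actor: str   # who activated the switch
    flag: str    # which flag was set
    scope: str   # where it applies
    reason: str  # why it was activated
    at: str      # when, as an ISO-8601 UTC timestamp


def set_writes_disabled(flags: Dict[str, bool], audit_log: List[OperatorAuditRecord],
                        tenant_id: str, actor: str, reason: str) -> OperatorAuditRecord:
    """Flip the tenant's writes_disabled flag and record who did it, when, and why."""
    key = f"writes_disabled:{tenant_id}"  # hypothetical key layout
    flags[key] = True
    record = OperatorAuditRecord(
        actor=actor,
        flag=key,
        scope=f"tenant:{tenant_id}",
        reason=reason,
        at=datetime.now(timezone.utc).isoformat(),
    )
    audit_log.append(record)
    return record
```

A call such as `set_writes_disabled(flags, log, "t_42", "oncall@example.com", "bulk email.send")` would leave one flag set and one audit record appended.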

In code it looks like this

The flow summary above shows the main path. In practice, kill state is read from a central store and cached for a few seconds at most. Critical point: the kill check must be O(1) and use a short cache TTL (1-2 seconds); otherwise the emergency stop reacts too late. In production, the same kill check is usually duplicated in the tool gateway so that no call can bypass runtime control.
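One way to get a short cache in front of the central flag store is a small TTL wrapper. The sketch below is an assumption-laden illustration: `CachedKillStore`, the `fetch` callable (standing in for a config-store read), and the injectable `clock` are names invented for the example.

```python
import time
from typing import Callable, Dict, Tuple


class CachedKillStore:
    """Wraps a slow flag lookup with a short per-tenant TTL cache."""

    def __init__(self, fetch: Callable[[str], dict], ttl_seconds: float = 2.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch      # hypothetical loader, e.g. a config-store read
        self._ttl = ttl_seconds
        self._clock = clock      # injectable so the TTL can be tested
        self._cache: Dict[str, Tuple[float, dict]] = {}

    def read(self, tenant_id: str) -> dict:
        now = self._clock()
        hit = self._cache.get(tenant_id)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]        # O(1) dict lookup on the hot path
        flags = self._fetch(tenant_id)
        self._cache[tenant_id] = (now, flags)
        return flags
```

With this shape, the worst-case delay between an operator flipping a flag and the runtime seeing it is roughly the TTL plus one fetch, which is why the TTL should stay at seconds rather than minutes.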

Example kill-switch config:

YAML
kill_switch:
  global_flag: agent_kill_global
  tenant_flag_prefix: "agent_kill_tenant:"
  writes_disabled_default: false
  disabled_tools_key: agent_disabled_tools
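  cache_ttl_seconds: 2

PYTHON
def run_agent(run_id, state, planner, kill_store, kill_policy, tool_gateway, audit):
    """Agent loop: every planned action passes the kill check before execution."""
    while True:
        action = planner.next(state)
        action_key = make_action_key(action.name, action.args)  # stable key for dedupe/audit

        # Read kill state (short-cached) and decide before anything is executed.
        kill_state = kill_store.read(tenant_id=state.tenant_id)
        decision = kill_policy.check(kill_state, action)

        if decision.outcome == "stop":
            # Record who activated the switch and which action was blocked.
            audit.log(
                run_id,
                decision=decision.outcome,
                reason=decision.reason,
                scope=decision.scope,
                action=action.name,
                action_key=action_key,
                actor=kill_state.last_updated_by,
            )
            return stop(decision.reason)

        # Allowed: execute through the tool gateway, which repeats the kill check.
        result = tool_gateway.execute(action.name, action.args)

        audit.log(
            run_id,
            decision=decision.outcome,
            reason=decision.reason,
            scope=decision.scope,
            action=action.name,
            action_key=action_key,
            result=result.status,
        )

        if result.final:
            return result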

Kill switch stops new actions. In-flight actions usually require a separate best-effort cancel mechanism.
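A best-effort cancel for in-flight work is often cooperative: the long-running action checks a shared flag between units of work. The sketch below assumes that model; `CancelToken` and `send_batch` are hypothetical names, and work already done is not rolled back.

```python
import threading


class CancelToken:
    """Cooperative cancel flag shared by the operator path and a running action."""

    def __init__(self) -> None:
        self._event = threading.Event()

    def cancel(self) -> None:
        self._event.set()

    @property
    def cancelled(self) -> bool:
        return self._event.is_set()


def send_batch(recipients, send_one, token: CancelToken) -> dict:
    """Hypothetical long-running write action that checks the token between items."""
    sent = []
    for recipient in recipients:
        if token.cancelled:
            # Best effort: items already sent are not rolled back.
            return {"status": "cancelled", "sent": sent}
        send_one(recipient)
        sent.append(recipient)
    return {"status": "completed", "sent": sent}
```

The token only helps where the action has natural checkpoints; a single blocking external call still runs to completion or to its own timeout.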

How it looks during execution

Scenario 1: global stop

  1. Operator activates global_kill=true.
  2. Runtime forms next action and reads kill state.
  3. Policy returns stop (reason=killed_global).
  4. New actions are not executed.
  5. Logs contain scope=global and actor.

Scenario 2: tenant stop

  1. tenant_kill=true is activated for tenant t_42.
  2. Runs for this tenant get stop (reason=killed_tenant).
  3. Other tenants continue working.
  4. Incident is localized without global stop.

Scenario 3: writes disabled

  1. writes_disabled=true is activated.
  2. Read action passes with allow.
  3. Write action gets stop (reason=writes_disabled).
  4. System enters read-only degrade mode.

Common mistakes

  • kill switch only in UI, but not in runtime/tool gateway
  • one global stop without per-tenant mode
  • missing writes-disabled mode
  • long cache TTL (minutes instead of seconds)
  • missing audit trail for operator actions
  • missing tested incident runbook

Result: the team has a "button", but no real emergency control.
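The first mistake in the list is avoided when the tool gateway itself refuses blocked calls, regardless of what the UI or the runtime loop did. A minimal sketch, assuming a `stop_reason` callable that returns a stop reason or `None`; the `ToolGateway` and `ToolBlocked` names are illustrative, not an existing API.

```python
from typing import Any, Callable, Dict, Optional


class ToolBlocked(Exception):
    """Raised when the gateway refuses a call because of kill state."""

    def __init__(self, reason: str):
        super().__init__(reason)
        self.reason = reason


class ToolGateway:
    """Single entry point for tool calls; re-checks kill state on every call."""

    def __init__(self, tools: Dict[str, Callable[..., Any]],
                 stop_reason: Callable[[str, str], Optional[str]]):
        self._tools = tools
        self._stop_reason = stop_reason  # hypothetical: (tenant_id, tool_name) -> reason or None

    def execute(self, tenant_id: str, name: str, **args: Any) -> Any:
        reason = self._stop_reason(tenant_id, name)
        if reason is not None:
            # Blocked here even if the caller skipped the runtime-loop check.
            raise ToolBlocked(reason)
        return self._tools[name](**args)
```

Because every tool call goes through `execute`, a code path that forgets the runtime check still cannot perform a blocked action.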

Self-check

Quick kill-switch check before production launch:

Before production, you need at least access control, limits, audit logs, and an emergency stop.

FAQ

Q: What should be activated first: global stop or writes disabled?
A: Start with writes_disabled if the incident is limited to write actions. Use global_kill when the failure risk is broad and an immediate full stop is required.

Q: Where exactly should kill switch be checked?
A: At minimum in two places: in runtime loop before next action and in tool gateway before tool execution.

Q: Can kill state be cached?
A: Yes, but only briefly (seconds). During an incident, a minute-long cache makes the kill switch almost useless.

Q: How to implement kill flags technically?
A: Usually as global and tenant-scoped flags in Redis/config store, read by policy layer before each action.

Q: Does kill switch cancel already running actions?
A: Not always. It reliably blocks new actions. In-flight tasks need a separate best-effort cancel mechanism.

Q: Can kill switch replace RBAC and budgets?
A: No. Kill switch is an emergency stop mechanism. RBAC, limits, and budgets are needed for continuous control.

Where Kill Switch fits in the system

Kill switch is the emergency layer of Agent Governance. Together with RBAC, budgets, approval, and audit, it forms a complete production control system.

⏱️ 7 min read • Updated March 27, 2026 • Difficulty: ★★★

Author

Nick, engineer building infrastructure for production AI agents.

Focus: agent patterns, failure modes, runtime control, and system reliability.

🔗 GitHub: https://github.com/mykolademyanov


Editorial note

This documentation is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Content is grounded in real-world failures, post-mortems, and operational incidents in deployed AI agent systems.