Human Approval For AI Agents: How To Safely Control Write Actions

Practical approval flow in production: approval_required, TTL, stop reasons, approval token, and audit trail for write actions.
On this page
  1. Idea In 30 Seconds
  2. Problem
  3. Solution
  4. Human Approval != Manual Mode
  5. Approval-Control Metrics
  6. How It Looks In Architecture
  7. Example
  8. In Code It Looks Like This
  9. How It Looks During Execution
  10. Common Mistakes
  11. Self-Check
  12. FAQ
  13. Where Human Approval Fits In The Whole System
  14. Related Pages

Idea In 30 Seconds

Human approval is a runtime gate for risky write actions: before execution, the agent gets approval_required and waits for human confirmation.

When you need it: when an agent can modify data, send customer messages, or trigger irreversible actions in production.

Problem

Without approval, write actions execute immediately when policy allows them. In demos this is convenient. In production, one agent mistake turns into a real incident.

The problem is not only a "bad" model. Even a good model can fail on atypical requests. If there is no human gate between agent and write tool, a mistake immediately becomes a side effect in real systems.

Analogy: it is like a payment without 3D Secure. As long as everything goes well, there is no friction. When something goes wrong, the consequences get expensive in seconds.

Solution

The solution is to add a dedicated approval flow for risky write actions in the policy layer. Policy returns one of three decisions: allow, deny, or approval_required.

approval_required does not execute the action immediately: the runtime creates an approval request, waits for a human decision, and executes the tool call only after approval_granted. This decision is made at every step, not only at the end of a run.
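The three-way decision can be sketched as a tiny policy function. This is a minimal sketch, not a real API: the `PolicyDecision` enum and the rule tables are illustrative, with tool names taken from the examples in this article.

```python
from enum import Enum

class PolicyDecision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVAL_REQUIRED = "approval_required"

# Illustrative rule tables: read tools pass, risky writes need a human,
# anything unknown is denied by default.
READ_TOOLS = {"search.read", "http.get"}
APPROVAL_TOOLS = {"email.send", "ticket.close_bulk", "db.write"}

def evaluate(tool: str) -> PolicyDecision:
    """Return the policy decision for a single tool call."""
    if tool in READ_TOOLS:
        return PolicyDecision.ALLOW
    if tool in APPROVAL_TOOLS:
        return PolicyDecision.APPROVAL_REQUIRED
    return PolicyDecision.DENY
```

The key property is deny-by-default: a tool that is not explicitly classified never reaches execution.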

Human Approval != Manual Mode

These are different models:

  • Manual mode: a human performs almost every action instead of the agent.
  • Human approval: the agent works autonomously, and a human confirms only risky write actions.

Neither extreme is enough on its own:

  • without approval, risky actions pass without extra control
  • if you use manual mode for everything, the system loses speed and scalability

Example:

  • without approval: ticket.close_bulk executes immediately
  • with approval: policy returns approval_required, and action waits for confirmation

Approval-Control Metrics

These metrics and signals work together at every agent step.

| Metric | What it controls | Key mechanics | Why |
|---|---|---|---|
| Approval scope | Which actions require confirmation | write policy, risk tiers | Reduces risk for irreversible actions |
| Approval request context | What exactly a human sees before deciding | preview + args hash, reason + policy context | Enables a grounded decision |
| TTL and cancellation | Lifecycle of an approval request | approval TTL, cancel flow | Prevents runs from hanging indefinitely |
| Execution gate | Actual execution of a write action | approval token, gateway enforcement | Guarantees a write will not run without confirmation |
| Approval observability | Visibility into approval decisions | audit logs, alerts on timeout spikes | Does not directly limit actions, but helps detect bottlenecks in the approval process |
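As a rough sketch of the observability row, a timeout-rate check over terminal approval events might look like the following. The function names and the 20% threshold are illustrative assumptions, not a real metrics API.

```python
TERMINAL = {"approval_granted", "approval_denied", "approval_timeout"}

def timeout_rate(events: list[str]) -> float:
    """Share of approval requests that expired instead of being decided."""
    terminal = [e for e in events if e in TERMINAL]
    if not terminal:
        return 0.0
    return terminal.count("approval_timeout") / len(terminal)

def should_alert(events: list[str], threshold: float = 0.2) -> bool:
    """Alert when timeouts dominate: reviewers are the bottleneck."""
    return timeout_rate(events) > threshold
```

In practice you would compute this over a sliding window per team or per tool, so a spike in one high-risk tool does not hide inside the global average.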

How It Looks In Architecture

Policy layer (tool gateway) sits between runtime and tools and is the single access-control point before each step. Each decision (allow, deny, approval_required) is recorded in the audit log.

Each agent step passes through this flow before execution: the runtime does not execute a write action directly; first policy check -> approval gate -> execution.

Flow summary:

  • Runtime forms a tool call
  • Policy layer checks risk and may return approval_required
  • on approval_granted, write is executed
  • on approval_denied or approval_timeout, run gets a stop reason
  • every decision is written to audit log

In the runtime, deny is also converted to an explicit stop reason visible in logs and in the run response.
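A minimal sketch of how a gate decision maps onto an explicit run state. The `RunResult` shape here is an assumption for illustration, not a fixed schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RunResult:
    status: str                       # "completed" | "stopped"
    stop_reason: Optional[str] = None

def to_run_result(decision: str) -> RunResult:
    """Map a gate decision onto an explicit, log-visible run state."""
    if decision == "allow":
        return RunResult(status="completed")
    # deny / approval_denied / approval_timeout all become explicit stops,
    # so the caller can see exactly why the write did not happen.
    return RunResult(status="stopped", stop_reason=decision)
```

The point is that a blocked write never looks like a silent no-op: every non-allow decision surfaces as a named stop reason.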

Approval request usually contains:

  • tool
  • short action preview
  • args hash
  • reason / risk tier
  • TTL

Example

A support agent wants to run email.send for a customer. Policy defines that this tool requires human confirmation.

Result:

  • without approval token, write is not executed
  • after approval_granted, call is allowed
  • on timeout, agent returns stop("approval_timeout")

Human approval stops the risky action before side effects, not after an incident.

In Code It Looks Like This

The architecture above shows the main control flow in simplified form. In practice, validation and execution should go through a single policy/tool gateway.

Example approval config:

YAML
approvals:
  required_for:
    - email.send
    - ticket.close_bulk
    - db.write
  ttl_seconds: 300
  fallback_when_not_approved: stop
PYTHON
# Step 1: policy check before any write executes
decision = policy.evaluate(tool, user_context, mode="normal")

if decision.outcome == "approval_required":
    # create a pending request and stop the run without blocking the worker
    request = approvals.create_request(
        run_id=run_id,
        tool=tool,
        args_hash=hash_args(args),
        ttl_seconds=300,
    )
    audit.log(run_id, decision.outcome, reason="pending_human_review", tool=tool, pending_id=request.id)
    return stop("approval_required", pending_id=request.id)

elif decision.outcome == "deny":
    audit.log(run_id, decision.outcome, reason=decision.reason, tool=tool)
    return deny(decision.reason)

# Step 2: later, in the resume flow, with pending_id from the stopped run
approval = approvals.get_decision(pending_id)
if approval.outcome != "approved":
    # denied, timed out, or cancelled -> explicit stop, no side effects
    audit.log(run_id, "deny", reason=approval.outcome, tool=tool, pending_id=pending_id)
    return stop(approval.outcome)

audit.log(run_id, "approval_granted", reason="human_approved", tool=tool, approver=approval.approved_by)
# the approval token lets the gateway re-verify confirmation at execution time
result = tool.execute({**args, "approval_token": approval.token})
audit.log(run_id, "allow", reason="policy_ok", tool=tool, result=result.status)
return result

In production, the approval flow is usually asynchronous: the runtime creates a request, returns a pending/stop state without blocking the worker, and resumes the run after the human decision.
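One way to sketch that non-blocking flow, with an in-memory dict standing in for a real approvals service or queue. All names here are illustrative assumptions.

```python
# In-memory store standing in for a real approvals service/queue.
PENDING: dict[str, dict] = {}

def create_request(pending_id: str, tool: str) -> dict:
    """Worker side: record the request and return immediately (non-blocking)."""
    PENDING[pending_id] = {"tool": tool, "outcome": "pending"}
    return {"status": "stopped", "stop_reason": "approval_required",
            "pending_id": pending_id}

def record_decision(pending_id: str, outcome: str) -> None:
    """Human side: an approval UI or webhook writes the decision later."""
    PENDING[pending_id]["outcome"] = outcome

def resume(pending_id: str) -> str:
    """Resume handler: picks the run back up only after a decision exists."""
    outcome = PENDING[pending_id]["outcome"]
    if outcome == "approved":
        return "execute_write"
    return f"stop({outcome})"
```

The worker never sleeps on the approval: it records the pending state, frees the slot, and a separate resume path continues the run once `record_decision` has fired.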

How It Looks During Execution

TEXT
Scenario 1: confirmed (approval granted)

1. Runtime forms email.send call.
2. Policy returns approval_required.
3. Runtime creates approval request and returns pending/stop state.
4. Human confirms action within TTL.
5. Runtime resumes run, executes tool call, and writes `approval_granted -> allow`.

---

Scenario 2: approval timeout

1. Runtime forms db.write call.
2. Policy returns approval_required.
3. Runtime creates approval request and returns pending/stop state.
4. Confirmation is not received before TTL expires.
5. Runtime returns stop (approval_timeout), action is not executed.

---

Scenario 3: policy deny without approval

1. Runtime forms write call outside allowed scope.
2. Policy immediately returns deny.
3. Runtime returns stop reason.
4. Audit: decision=deny, reason=policy_denied.
5. Action is not executed.

Common Mistakes

  • approval only in UI, but not in policy/tool gateway
  • approval without TTL and without cancellation
  • same approach for low-risk and high-risk write actions
  • missing approval token in tool execution
  • not logging approval_required and approval_granted
  • blocking all runs while waiting for approval instead of returning explicit stop/pending state

As a result, the system either allows unsafe actions or hangs in approval queues with no transparent state.
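One way to close the "missing approval token" gap is to have the gateway verify an HMAC token that binds one approval to one exact set of arguments. This is a sketch under that assumption; the signing scheme and key handling are illustrative, and a real deployment would load the key from a secret manager.

```python
import hashlib
import hmac
import json

SECRET = b"gateway-signing-key"  # illustrative; use a real secret manager

def issue_token(pending_id: str, args_hash: str) -> str:
    """Issued at approval time: binds one approval to one exact call."""
    msg = f"{pending_id}:{args_hash}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def gateway_execute(pending_id: str, args: dict, token: str) -> dict:
    """Gateway refuses the write unless the token matches these exact args."""
    args_hash = hashlib.sha256(
        json.dumps(args, sort_keys=True).encode()).hexdigest()
    expected = issue_token(pending_id, args_hash)
    if not hmac.compare_digest(expected, token):
        raise PermissionError("write rejected: missing or invalid approval token")
    return {"status": "executed"}
```

Because the arguments are hashed into the token, approving `ticket.close_bulk` for one ticket set cannot be replayed against a different one, and a UI-only approval that never produced a token is rejected at the gateway.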

Self-Check

Quick human-approval check before production launch:

Before production, you need at least access control, limits, audit logs, and an emergency stop.

FAQ

Q: Which actions must always require approval?
A: Irreversible or customer-visible write actions: data changes, bulk closes, message sends, financial operations.

Q: How do we avoid drowning the team in approval-spam?
A: Split write actions by risk tiers. High-risk -> mandatory approval, low-risk -> separate policy with tighter limits and audit.

Q: Should worker be blocked while waiting for approval?
A: Better not. Return pending/stop state and resume run after decision to avoid queues and deadlocks.

Q: Can approval replace RBAC and budgets?
A: No. Approval is an additional gate for risky actions. RBAC, limits, and budgets are still required.

Q: What is the minimum logging set?
A: approval_required, approval_granted|approval_denied|approval_timeout, approver, tool, reason, and execution result.
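That minimum set can be enforced with a small structured-event helper. A sketch; the field names follow the answer above, and the helper itself is illustrative.

```python
from typing import Optional

REQUIRED_FIELDS = {"decision", "tool", "reason", "approver", "result"}

def audit_event(decision: str, tool: str, reason: str,
                approver: Optional[str] = None,
                result: Optional[str] = None) -> dict:
    """One structured record per approval decision; absent values stay
    explicit (None) instead of silently dropping the field."""
    event = {"decision": decision, "tool": tool, "reason": reason,
             "approver": approver, "result": result}
    assert REQUIRED_FIELDS <= event.keys(), "audit event missing fields"
    return event
```

Keeping every field present, even as None, makes the log queryable: "all approval_timeout events without an approver" is a one-line filter instead of a schema archaeology exercise.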

Where Human Approval Fits In The Whole System

Human approval is one of Agent Governance layers. Together with allowlist/RBAC, budgets, limits, and audit, it forms a unified execution-control system.

Next on this topic:

⏱️ 7 min read • Updated March 25, 2026 • Difficulty: ★★★

Implement in OnceOnly

Budgets + permissions you can enforce at the boundary.

YAML
# onceonly guardrails (concept)
version: 1
budgets:
  max_steps: 25
  max_tool_calls: 12
  max_seconds: 60
  max_usd: 1.00
policy:
  tool_allowlist:
    - search.read
    - http.get
writes:
  require_approval: true
  idempotency: true
controls:
  kill_switch: { enabled: true }

Add guardrails to tool-calling agents. Ship this pattern with governance:

  • Budgets (steps / spend caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & dedupe
  • Audit logs & traceability

Integrated mention: OnceOnly is a control layer for production agent systems.

Author

Nick, engineer building infrastructure for production AI agents.

Focus: agent patterns, failure modes, runtime control, and system reliability.

πŸ”— GitHub: https://github.com/mykolademyanov


Editorial note

This documentation is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Content is grounded in real-world failures, post-mortems, and operational incidents in deployed AI agent systems.