Human Approval For AI Agents: How To Safely Control Write Actions

Practical approval flow in production: approval_required, TTL, stop reasons, approval token, and audit trail for write actions.
On this page
  1. Idea In 30 Seconds
  2. Problem
  3. Solution
  4. Human Approval != Manual Mode
  5. Approval-Control Metrics
  6. How It Looks In Architecture
  7. Example
  8. In Code It Looks Like This
  9. How It Looks During Execution
  10. Common Mistakes
  11. Self-Check
  12. FAQ
  13. Where Human Approval Fits In The Whole System
  14. Related Pages

Idea In 30 Seconds

Human approval is a runtime gate for risky write actions: before execution, the agent gets approval_required and waits for human confirmation.

When you need it: when an agent can modify data, send customer messages, or trigger irreversible actions in production.

Problem

Without approval, write actions execute immediately when policy allows them. In demos this is convenient. In production, one agent mistake turns into a real incident.

The problem is not only a "bad" model. Even a good model can fail on atypical requests. If there is no human gate between agent and write tool, a mistake immediately becomes a side effect in real systems.

Analogy: it is like a payment without 3D Secure. As long as everything goes well, there is no friction. When something goes wrong, the consequences get expensive in seconds.

Solution

The solution is to add a dedicated approval flow for risky write actions in the policy layer. Policy returns one of three decisions: allow, deny, or approval_required.

approval_required does not execute the action immediately: the runtime creates an approval request, waits for a human decision, and executes the tool call only after approval_granted. This decision is made at every step, not only at the end of a run.
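The three-way decision can be sketched as a tiny policy function. This is a minimal sketch, not a real API: the `PolicyDecision` enum and the rule tables are illustrative, with tool names taken from the examples in this article.

```python
from enum import Enum

class PolicyDecision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    APPROVAL_REQUIRED = "approval_required"

# Illustrative rule tables: read tools pass, risky writes need a human,
# anything unknown is denied by default.
READ_TOOLS = {"search.read", "http.get"}
APPROVAL_TOOLS = {"email.send", "ticket.close_bulk", "db.write"}

def evaluate(tool: str) -> PolicyDecision:
    """Return the policy decision for a single tool call."""
    if tool in READ_TOOLS:
        return PolicyDecision.ALLOW
    if tool in APPROVAL_TOOLS:
        return PolicyDecision.APPROVAL_REQUIRED
    return PolicyDecision.DENY
```

The key property is deny-by-default: a tool that is not explicitly classified never reaches execution.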

Human Approval != Manual Mode

These are different models:

  • Manual mode: a human performs almost every action instead of the agent.
  • Human approval: the agent works autonomously, and a human confirms only risky write actions.

Neither extreme is enough on its own:

  • without approval, risky actions pass without extra control
  • if you use manual mode for everything, the system loses speed and scalability

Example:

  • without approval: ticket.close_bulk executes immediately
  • with approval: policy returns approval_required, and action waits for confirmation

Approval-Control Metrics

These metrics and signals work together at every agent step.

| Metric | What it controls | Key mechanics | Why |
|---|---|---|---|
| Approval scope | Which actions require confirmation | write policy, risk tiers | Reduces risk for irreversible actions |
| Approval request context | What exactly a human sees before deciding | preview + args hash, reason + policy context | Enables a grounded decision |
| TTL and cancellation | Lifecycle of an approval request | approval TTL, cancel flow | Prevents runs from hanging indefinitely |
| Execution gate | Actual execution of a write action | approval token, gateway enforcement | Guarantees a write will not run without confirmation |
| Approval observability | Visibility into approval decisions | audit logs, alerts on timeout spikes | Does not directly limit actions, but helps detect bottlenecks in the approval process |
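As a rough sketch of the observability row, a timeout-rate check over terminal approval events might look like the following. The function names and the 20% threshold are illustrative assumptions, not a real metrics API.

```python
TERMINAL = {"approval_granted", "approval_denied", "approval_timeout"}

def timeout_rate(events: list[str]) -> float:
    """Share of approval requests that expired instead of being decided."""
    terminal = [e for e in events if e in TERMINAL]
    if not terminal:
        return 0.0
    return terminal.count("approval_timeout") / len(terminal)

def should_alert(events: list[str], threshold: float = 0.2) -> bool:
    """Alert when timeouts dominate: reviewers are the bottleneck."""
    return timeout_rate(events) > threshold
```

In practice you would compute this over a sliding window per team or per tool, so a spike in one high-risk tool does not hide inside the global average.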

How It Looks In Architecture

Policy layer (tool gateway) sits between runtime and tools and is the single access-control point before each step. Each decision (allow, deny, approval_required) is recorded in the audit log.

Each agent step passes through this flow before execution: the runtime does not execute a write action directly; first policy check -> approval gate -> execution.

Flow summary:

  • Runtime forms a tool call
  • Policy layer checks risk and may return approval_required
  • on approval_granted, write is executed
  • on approval_denied or approval_timeout, run gets a stop reason
  • every decision is written to audit log

In the runtime, deny is also converted to an explicit stop reason visible in logs and in the run response.
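A minimal sketch of how a gate decision maps onto an explicit run state. The `RunResult` shape here is an assumption for illustration, not a fixed schema.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RunResult:
    status: str                       # "completed" | "stopped"
    stop_reason: Optional[str] = None

def to_run_result(decision: str) -> RunResult:
    """Map a gate decision onto an explicit, log-visible run state."""
    if decision == "allow":
        return RunResult(status="completed")
    # deny / approval_denied / approval_timeout all become explicit stops,
    # so the caller can see exactly why the write did not happen.
    return RunResult(status="stopped", stop_reason=decision)
```

The point is that a blocked write never looks like a silent no-op: every non-allow decision surfaces as a named stop reason.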

Approval request usually contains:

  • tool
  • short action preview
  • args hash
  • reason / risk tier
  • TTL

Example

A support agent wants to run email.send for a customer. Policy defines that this tool requires human confirmation.

Result:

  • without approval token, write is not executed
  • after approval_granted, call is allowed
  • on timeout, agent returns stop("approval_timeout")

Human approval stops the risky action before side effects, not after an incident.

In Code It Looks Like This

The architecture above shows the main control flow in simplified form. In practice, validation and execution should go through a single policy/tool gateway.

Example approval config:

YAML
approvals:
  required_for:
    - email.send
    - ticket.close_bulk
    - db.write
  ttl_seconds: 300
  fallback_when_not_approved: stop
PYTHON
# Step 1: policy check before any write executes
decision = policy.evaluate(tool, user_context, mode="normal")

if decision.outcome == "approval_required":
    # create a pending request and stop the run without blocking the worker
    request = approvals.create_request(
        run_id=run_id,
        tool=tool,
        args_hash=hash_args(args),
        ttl_seconds=300,
    )
    audit.log(run_id, decision.outcome, reason="pending_human_review", tool=tool, pending_id=request.id)
    return stop("approval_required", pending_id=request.id)

elif decision.outcome == "deny":
    audit.log(run_id, decision.outcome, reason=decision.reason, tool=tool)
    return deny(decision.reason)

# Step 2: later, in the resume flow, with pending_id from the stopped run
approval = approvals.get_decision(pending_id)
if approval.outcome != "approved":
    # denied, timed out, or cancelled -> explicit stop, no side effects
    audit.log(run_id, "deny", reason=approval.outcome, tool=tool, pending_id=pending_id)
    return stop(approval.outcome)

audit.log(run_id, "approval_granted", reason="human_approved", tool=tool, approver=approval.approved_by)
# the approval token lets the gateway re-verify confirmation at execution time
result = tool.execute({**args, "approval_token": approval.token})
audit.log(run_id, "allow", reason="policy_ok", tool=tool, result=result.status)
return result

In production, the approval flow is usually asynchronous: the runtime creates a request, returns a pending/stop state without blocking the worker, and resumes the run after the human decision.
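One way to sketch that non-blocking flow, with an in-memory dict standing in for a real approvals service or queue. All names here are illustrative assumptions.

```python
# In-memory store standing in for a real approvals service/queue.
PENDING: dict[str, dict] = {}

def create_request(pending_id: str, tool: str) -> dict:
    """Worker side: record the request and return immediately (non-blocking)."""
    PENDING[pending_id] = {"tool": tool, "outcome": "pending"}
    return {"status": "stopped", "stop_reason": "approval_required",
            "pending_id": pending_id}

def record_decision(pending_id: str, outcome: str) -> None:
    """Human side: an approval UI or webhook writes the decision later."""
    PENDING[pending_id]["outcome"] = outcome

def resume(pending_id: str) -> str:
    """Resume handler: picks the run back up only after a decision exists."""
    outcome = PENDING[pending_id]["outcome"]
    if outcome == "approved":
        return "execute_write"
    return f"stop({outcome})"
```

The worker never sleeps on the approval: it records the pending state, frees the slot, and a separate resume path continues the run once `record_decision` has fired.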

How It Looks During Execution

TEXT
Scenario 1: confirmed (approval granted)

1. Runtime forms email.send call.
2. Policy returns approval_required.
3. Runtime creates approval request and returns pending/stop state.
4. Human confirms action within TTL.
5. Runtime resumes run, executes tool call, and writes `approval_granted -> allow`.

---

Scenario 2: approval timeout

1. Runtime forms db.write call.
2. Policy returns approval_required.
3. Runtime creates approval request and returns pending/stop state.
4. Confirmation is not received before TTL expires.
5. Runtime returns stop (approval_timeout), action is not executed.

---

Scenario 3: policy deny without approval

1. Runtime forms write call outside allowed scope.
2. Policy immediately returns deny.
3. Runtime returns stop reason.
4. Audit: decision=deny, reason=policy_denied.
5. Action is not executed.

Common Mistakes

  • approval only in UI, but not in policy/tool gateway
  • approval without TTL and without cancellation
  • same approach for low-risk and high-risk write actions
  • missing approval token in tool execution
  • not logging approval_required and approval_granted
  • blocking all runs while waiting for approval instead of returning explicit stop/pending state

As a result, the system either allows unsafe actions or hangs in approval queues with no transparent state.
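One way to close the "missing approval token" gap is to have the gateway verify an HMAC token that binds one approval to one exact set of arguments. This is a sketch under that assumption; the signing scheme and key handling are illustrative, and a real deployment would load the key from a secret manager.

```python
import hashlib
import hmac
import json

SECRET = b"gateway-signing-key"  # illustrative; use a real secret manager

def issue_token(pending_id: str, args_hash: str) -> str:
    """Issued at approval time: binds one approval to one exact call."""
    msg = f"{pending_id}:{args_hash}".encode()
    return hmac.new(SECRET, msg, hashlib.sha256).hexdigest()

def gateway_execute(pending_id: str, args: dict, token: str) -> dict:
    """Gateway refuses the write unless the token matches these exact args."""
    args_hash = hashlib.sha256(
        json.dumps(args, sort_keys=True).encode()).hexdigest()
    expected = issue_token(pending_id, args_hash)
    if not hmac.compare_digest(expected, token):
        raise PermissionError("write rejected: missing or invalid approval token")
    return {"status": "executed"}
```

Because the arguments are hashed into the token, approving `ticket.close_bulk` for one ticket set cannot be replayed against a different one, and a UI-only approval that never produced a token is rejected at the gateway.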

Self-Check

Quick human-approval check before production launch:

Before production, you need at least access control, limits, audit logs, and an emergency stop.

FAQ

Q: Which actions must always require approval?
A: Irreversible or customer-visible write actions: data changes, bulk closes, message sends, financial operations.

Q: How do we avoid drowning the team in approval-spam?
A: Split write actions by risk tiers. High-risk -> mandatory approval, low-risk -> separate policy with tighter limits and audit.

Q: Should worker be blocked while waiting for approval?
A: Better not. Return pending/stop state and resume run after decision to avoid queues and deadlocks.

Q: Can approval replace RBAC and budgets?
A: No. Approval is an additional gate for risky actions. RBAC, limits, and budgets are still required.

Q: What is the minimum logging set?
A: approval_required, approval_granted|approval_denied|approval_timeout, approver, tool, reason, and execution result.
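That minimum set can be enforced with a small structured-event helper. A sketch; the field names follow the answer above, and the helper itself is illustrative.

```python
from typing import Optional

REQUIRED_FIELDS = {"decision", "tool", "reason", "approver", "result"}

def audit_event(decision: str, tool: str, reason: str,
                approver: Optional[str] = None,
                result: Optional[str] = None) -> dict:
    """One structured record per approval decision; absent values stay
    explicit (None) instead of silently dropping the field."""
    event = {"decision": decision, "tool": tool, "reason": reason,
             "approver": approver, "result": result}
    assert REQUIRED_FIELDS <= event.keys(), "audit event missing fields"
    return event
```

Keeping every field present, even as None, makes the log queryable: "all approval_timeout events without an approver" is a one-line filter instead of a schema archaeology exercise.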

Where Human Approval Fits In The Whole System

Human approval is one of Agent Governance layers. Together with allowlist/RBAC, budgets, limits, and audit, it forms a unified execution-control system.

Next on this topic:

⏱️ 7 min read • Updated March 25, 2026 • Difficulty: ★★★

Implement in OnceOnly

Budgets + permissions you can enforce at the boundary.

YAML
# onceonly guardrails (concept)
version: 1
budgets:
  max_steps: 25
  max_tool_calls: 12
  max_seconds: 60
  max_usd: 1.00
policy:
  tool_allowlist:
    - search.read
    - http.get
writes:
  require_approval: true
  idempotency: true
controls:
  kill_switch: { enabled: true }

Add guardrails to tool-calling agents. Ship this pattern with governance:

  • Budgets (steps / spend caps)
  • Tool permissions (allowlist / blocklist)
  • Kill switch & incident stop
  • Idempotency & dedupe
  • Audit logs & traceability

Integrated mention: OnceOnly is a control layer for production agent systems.

Author

Nick, engineer building infrastructure for production AI agents.

Focus: agent patterns, failure modes, runtime control, and system reliability.

πŸ”— GitHub: https://github.com/mykolademyanov


Editorial note

This documentation is AI-assisted, with human editorial responsibility for accuracy, clarity, and production relevance.

Content is grounded in real-world failures, post-mortems, and operational incidents in deployed AI agent systems.