Building a human-in-the-loop approval queue for an AI agent

Most teams putting AI agents into production either run them with full autonomy (then panic the first time the agent does something irreversible) or gate every single action on a human review (then watch throughput collapse to the point where the agent costs more than it saves). The pattern that survives both failures is human review on the fault line, not everywhere. This post is the queue pattern we ship into every agentic system: a single table, one-tap approve/reject, and a structured diff, plus the agent dispatch loop that respects the queue.

Where the fault line lives

The fault line is the small set of actions that, if wrong, produce real damage. Financial transactions. Customer-visible mutations. Irreversible deletes. Anything that goes outbound to a system you don't own (email send, webhook fire, third-party API call) without an undo button. Everything else runs on rails with audit-after-the-fact: the agent does it, the audit trail records it, a human can review the audit trail asynchronously, but the action does not block on human approval.

For a content-generation agent, the fault line might be: publish to public surface (gate), edit a draft (no gate). For a code-edit agent: commit to main (gate), edit a feature branch (no gate). For a customer-comms agent: send the email (gate), draft the email (no gate). The classification is per-action-class, not per-tool: a single tool like “send_email” gates because of what it does, not because emailing is inherently dangerous.
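
To make the per-action-class rule concrete, here is a minimal sketch that classifies on what a call would do rather than on which tool it uses. The `ProposedAction` shape, the tool names, and `classify` are all hypothetical and exist only for illustration; they are not taken from our codebase.

```typescript
// Hypothetical action shapes, one per example in the paragraph above.
type ProposedAction =
  | { tool: 'write_post'; args: { postId: string; visibility: 'draft' | 'public' } }
  | { tool: 'git_commit'; args: { branch: string; message: string } }
  | { tool: 'email'; args: { to: string; mode: 'draft' | 'send' } };

type GateDecision = 'gate' | 'audit_only';

// Classify by the effect of the call, not by the tool name alone.
function classify(action: ProposedAction): GateDecision {
  switch (action.tool) {
    case 'write_post':
      // Publishing to a public surface gates; editing a draft does not.
      return action.args.visibility === 'public' ? 'gate' : 'audit_only';
    case 'git_commit':
      // Committing to main gates; committing to a feature branch does not.
      return action.args.branch === 'main' ? 'gate' : 'audit_only';
    case 'email':
      // Sending gates; drafting does not.
      return action.args.mode === 'send' ? 'gate' : 'audit_only';
  }
}
```

The same tool lands on either side of the line depending on its arguments, which is the distinction the classification is meant to capture.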

The approval_queue table

The code examples below are in our stack (Next.js, Drizzle ORM, tRPC) because that's what we run in production. The pattern itself is stack-agnostic: a queue table, a dispatcher that checks gate status before executing, a request handler that approves or rejects, and inline execution on approve. Translate the data-layer, dispatcher, and handler shapes to whatever stack the engagement ships in.

// schema/approval-queue.ts
import { pgTable, text, jsonb, timestamp, uuid } from 'drizzle-orm/pg-core';

export const approval_queue = pgTable('approval_queue', {
  id: uuid('id').primaryKey().defaultRandom(),
  // What the agent wants to do
  action_type: text('action_type').notNull(),     // 'publish_post' | 'send_email' | 'create_charge' | ...
  payload: jsonb('payload').notNull(),            // arguments the agent would pass
  // What it would change
  before_state: jsonb('before_state'),            // snapshot for diff
  after_state: jsonb('after_state'),              // proposed state
  // Audit
  requested_by: text('requested_by').notNull(),   // agent session id
  requested_at: timestamp('requested_at').notNull().defaultNow(),
  reason: text('reason'),                         // agent's own explanation
  // Decision
  status: text('status').notNull().default('pending'),  // 'pending' | 'approved' | 'rejected'
  decided_by: text('decided_by'),                 // user id of approver
  decided_at: timestamp('decided_at'),
  decision_note: text('decision_note'),
  // Executed
  executed_at: timestamp('executed_at'),
  execution_error: text('execution_error'),
});

The shape matters. before_state and after_state are jsonb snapshots so the review UI can render a structured diff without re-running the agent. The reviewer is not asked to imagine what the change would do; they see exactly what changes by comparing the two snapshots. That single design choice is what makes the queue usable at high throughput.
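
As a rough illustration of how a review UI could turn the two snapshots into a structured diff, the sketch below walks both JSON values and lists the fields that differ. `diffSnapshots`, `FieldChange`, and the `Json` type are hypothetical names, and arrays are compared as whole values to keep the example short; a production diff would likely need more care.

```typescript
type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

interface FieldChange {
  path: string;                 // dotted path, e.g. "meta.published"
  before: Json | undefined;     // undefined means the field did not exist
  after: Json | undefined;
}

function isPlainObject(v: Json | undefined): v is { [key: string]: Json } {
  return typeof v === 'object' && v !== null && !Array.isArray(v);
}

// Recursively collect every field whose value differs between the snapshots.
function diffSnapshots(before: Json | undefined, after: Json | undefined, path = ''): FieldChange[] {
  if (isPlainObject(before) && isPlainObject(after)) {
    // Keys from before first, then keys that only exist in after.
    const keys = Object.keys(before).concat(
      Object.keys(after).filter((k) => !(k in before)),
    );
    const changes: FieldChange[] = [];
    for (const key of keys) {
      const childPath = path ? `${path}.${key}` : key;
      changes.push(...diffSnapshots(before[key], after[key], childPath));
    }
    return changes;
  }
  // Leaves and arrays: compare serialised forms (assumes stable key order inside arrays).
  if (JSON.stringify(before) === JSON.stringify(after)) return [];
  return [{ path: path || '(root)', before, after }];
}
```

Because both snapshots are stored on the row, this can run in the review UI at render time without touching the agent or the target system.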

The agent dispatch loop

Every typed tool the agent can call goes through a dispatcher that checks the action's gate status. If the action class needs human approval, the dispatcher enqueues instead of executing:

// agent/dispatch.ts
import { db, approval_queue } from '@/lib/db';

const GATED_ACTIONS = new Set([
  'publish_post',
  'send_email',
  'create_charge',
  'delete_record',
  // ... whatever's on the fault line for this system
]);

export async function dispatch(action: AgentAction, session: AgentSession) {
  // 1. If the action class is on the fault line, enqueue.
  if (GATED_ACTIONS.has(action.type)) {
    const beforeState = await snapshotBefore(action);
    const afterState = computeAfter(action, beforeState);
    await db.insert(approval_queue).values({
      action_type: action.type,
      payload: action.payload,
      before_state: beforeState,
      after_state: afterState,
      requested_by: session.id,
      reason: action.reason,        // agent has to explain itself
    });
    return { gated: true, message: 'Queued for human review.' };
  }

  // 2. Not gated -- execute, but always write the audit trail.
  const result = await executeAction(action);
  await writeAuditTrail({
    action,
    result,
    session_id: session.id,
    at: new Date(),
  });
  return { gated: false, result };
}

The agent always returns to its loop after dispatch. Gated actions return “queued for review” as a result; the agent treats that as a successful tool call and moves on to the next action. Whether the queued action actually executes is now decoupled from the agent's session. The agent can finish its turn while the queued item is still pending; the human reviews on their own schedule.
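
The loop side of that contract might look like the sketch below. It assumes a result shape that mirrors what `dispatch` returns above; `PlannedAction`, `toToolMessage`, and `runTurn` are hypothetical names, and the message wording is only an example.

```typescript
// Assumed to mirror the two return shapes of dispatch().
type DispatchResult =
  | { gated: true; message: string }
  | { gated: false; result: unknown };

interface PlannedAction {
  type: string;
  payload: unknown;
}

// Turn a dispatch outcome into the tool message the model sees next.
function toToolMessage(action: PlannedAction, outcome: DispatchResult): string {
  if (outcome.gated) {
    // Not an error: the action is pending review, so the loop just moves on.
    return `${action.type}: ${outcome.message}`;
  }
  return `${action.type}: ${JSON.stringify(outcome.result)}`;
}

// Run every planned action through the dispatcher; gated ones never block the turn.
async function runTurn(
  actions: PlannedAction[],
  dispatchFn: (action: PlannedAction) => Promise<DispatchResult>,
): Promise<string[]> {
  const transcript: string[] = [];
  for (const action of actions) {
    const outcome = await dispatchFn(action);
    transcript.push(toToolMessage(action, outcome));
  }
  return transcript;
}
```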

The review UI

The review surface is a small tRPC router (list pending, approve, reject) plus a server component that lists pending items. The UX rule: one-tap approve, one-tap reject, structured diff visible without scrolling.

// trpc/router/approvals.ts
import { protectedProcedure, router } from '../trpc';
import { db, approval_queue } from '@/lib/db';
import { and, eq } from 'drizzle-orm';
import { TRPCError } from '@trpc/server';
import { z } from 'zod';
import { executeQueuedAction } from '@/agent/dispatch';

export const approvalsRouter = router({
  pending: protectedProcedure.query(async () => {
    return db.select().from(approval_queue)
      .where(eq(approval_queue.status, 'pending'))
      .orderBy(approval_queue.requested_at);
  }),

  approve: protectedProcedure
    .input(z.object({ id: z.string().uuid(), note: z.string().optional() }))
    .mutation(async ({ ctx, input }) => {
      // Claim the row only while it is still pending, so a second click
      // (or a second reviewer) cannot execute the same action twice.
      const [row] = await db.update(approval_queue)
        .set({
          status: 'approved',
          decided_by: ctx.user.id,
          decided_at: new Date(),
          decision_note: input.note,
        })
        .where(and(
          eq(approval_queue.id, input.id),
          eq(approval_queue.status, 'pending'),
        ))
        .returning();
      if (!row) {
        throw new TRPCError({ code: 'CONFLICT', message: 'Item not found or already decided.' });
      }

      // Execute inline. On failure, record the error on the row and rethrow
      // so the reviewer sees it in the response.
      try {
        await executeQueuedAction(row);
        await db.update(approval_queue)
          .set({ executed_at: new Date() })
          .where(eq(approval_queue.id, input.id));
      } catch (err) {
        await db.update(approval_queue)
          .set({ execution_error: String(err) })
          .where(eq(approval_queue.id, input.id));
        throw err;
      }
      return { ok: true };
    }),

  reject: protectedProcedure
    .input(z.object({ id: z.string().uuid(), note: z.string().min(1) }))
    .mutation(async ({ ctx, input }) => {
      // Same guard as approve: only a pending row can be rejected.
      const rejected = await db.update(approval_queue).set({
        status: 'rejected',
        decided_by: ctx.user.id,
        decided_at: new Date(),
        decision_note: input.note,    // required for rejects -- forces an explanation back to the agent
      }).where(and(
        eq(approval_queue.id, input.id),
        eq(approval_queue.status, 'pending'),
      )).returning({ id: approval_queue.id });
      if (rejected.length === 0) {
        throw new TRPCError({ code: 'CONFLICT', message: 'Item not found or already decided.' });
      }
      return { ok: true };
    }),
});

Rejects require a note. That note feeds back into the agent's context the next time it tries the same action class. Over time, the agent learns which patterns get rejected and stops proposing them. The notes also accumulate into the rule library if the same rejection rationale comes up multiple times: a class of failure the rule layer should prevent before the queue layer sees it.
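
One possible shape for that feedback step is sketched below: pull the most recent rejection notes for an action class and render them as a short block for the agent's context. `RejectedRow` is a trimmed-down, hypothetical view of a rejected queue row, and `rejectionContext` and its wording are assumptions rather than our production prompt.

```typescript
// Hypothetical subset of a rejected approval_queue row.
interface RejectedRow {
  action_type: string;
  decision_note: string;
  decided_at: Date;
}

// Build a context block from the newest rejection notes for one action class.
function rejectionContext(rows: RejectedRow[], actionType: string, limit = 5): string {
  const notes = rows
    .filter((r) => r.action_type === actionType)                          // filter() copies, so sort() below is safe
    .sort((a, b) => b.decided_at.getTime() - a.decided_at.getTime())     // newest first
    .slice(0, limit)
    .map((r) => `- ${r.decision_note}`);
  if (notes.length === 0) return '';
  return [`Previous ${actionType} proposals were rejected for these reasons:`, ...notes].join('\n');
}
```

Capping the number of notes keeps the context small; notes that keep recurring are the ones worth promoting into the rule library instead.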

Why dispatch on approve, not just mark as approved

One pattern we see fail: the approve button marks the row as approved and a background cron picks up approved-but-not-yet-executed rows and runs them. Sounds fine. It is not fine. The lag between approve-click and execution becomes the window where a second human action races. Two reviewers approve different rows that affect the same underlying state; cron picks them both up; the second one finds the state already changed and either errors out or does the wrong thing.

Executing on approve makes the click the commit point. The reviewer sees success or failure inline. State changes are serialised through whoever clicked first. The cron-driven model trades latency for a race condition you cannot reason about.
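
For the click to be the commit point, the status transition itself has to be guarded so that only the first decision on a row wins. The sketch below shows that guard against an in-memory stand-in for the table; in a database the equivalent is an update conditioned on the row still being pending. `claimForApproval` and `table` are hypothetical names.

```typescript
type Decision = 'pending' | 'approved' | 'rejected';

interface QueueRow {
  id: string;
  status: Decision;
  decided_by?: string;
}

// In-memory stand-in for the approval_queue table, keyed by row id.
const table: Record<string, QueueRow> = {};

// Returns the row only to the caller that moved it out of 'pending'.
// Any later caller gets undefined and must not execute the action.
function claimForApproval(id: string, userId: string): QueueRow | undefined {
  const row = table[id];
  if (!row || row.status !== 'pending') return undefined; // missing, or someone decided first
  row.status = 'approved';
  row.decided_by = userId;
  return row;
}
```

Execution then hangs off a successful claim, so a double click or a second reviewer sees a conflict rather than triggering a second run.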

Earning the right to remove a gate

The interesting part is graduating action classes off the gate list. Each gated action class starts on the queue. After enough approvals with no rejections and no execution errors, we demote it to audit-after-the-fact: the agent executes, the audit trail records it, no human in the loop. Each demotion is itself a rule in the library, with the criteria that justified it (e.g. “send_email to internal team members: 200 consecutive approvals with no rejection, 0 execution errors, dropped to audit-only 2026-04-12”).
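
A demotion check along these lines can be expressed as a streak over an action class's decision history. This is a sketch under assumptions: `ClassHistoryEntry` is a hypothetical projection of queue rows, and the threshold of 200 simply mirrors the example above rather than being a recommended value.

```typescript
// Hypothetical projection of decided queue rows for one action class.
interface ClassHistoryEntry {
  status: 'approved' | 'rejected';
  execution_error: string | null;
}

// Assumed threshold, taken from the example in the text.
const REQUIRED_STREAK = 200;

// history is ordered oldest to newest. The class qualifies only if the most
// recent `required` decisions are all approvals that executed without error.
function eligibleForDemotion(history: ClassHistoryEntry[], required = REQUIRED_STREAK): boolean {
  let streak = 0;
  for (const entry of history.slice().reverse()) {
    if (entry.status !== 'approved' || entry.execution_error !== null) break;
    streak++;
    if (streak >= required) return true;
  }
  return streak >= required;
}
```

A single rejection or execution error resets the streak, so a class has to re-earn its demotion after any failure.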

That demotion ladder is the difference between a system that gets slower as it ages (because more actions need review) and one that gets faster as it ages (because trust compounds through demonstrated reliability). Direction of travel is the signal.

What this prevents in production

Irreversible mistakes shipping unattended: financial transactions, deletes, customer-visible mutations all gate by default.
Throughput collapse from over-gating: actions off the fault line run on rails with audit-after-the-fact, not human-approval-in-the-loop.
Reviewer fatigue: structured diff + one-tap decision + reject-requires-note keeps the reviewer's cognitive load low.
Lock-in to inflexible gates: the demotion ladder lets action classes graduate to audit-only once they earn it.
Silent state corruption from approve-then-execute lag: executing inline on approve makes the click the serialisation point.

Where this rule comes from

PickNDeal ships a draft-then-submit version of this pattern: the offer agent (Phase 8) drafts the offer items on behalf of the supplier and returns them through a confirm-flag mechanism on the mutation tool. The supplier reviews the draft inline and explicitly submits it before the offer goes to the restaurant. The AI chat assistant (Phase 10) uses the same confirm-flag mechanism for any mutation tool: the agent proposes the action, the user confirms inline before it fires.

These are synchronous, two-actor flows: the reviewer and the agent are in the loop together at call time, no queue table needed. The fuller queue pattern described above (separate queue table, asynchronous review at the reviewer's own pace, demotion ladder) is what we ship into client engagements where async review is actually the right shape: the reviewer wants to decide later, the action can't easily be replayed, or multiple approvers need to weigh in. The principle is the same in both cases (human in the loop on the fault line, not everywhere); the implementation matches the use case. Don't build a queue table for a two-actor synchronous flow; don't rely on confirm-flag mechanics for a flow where the reviewer is off the clock.

The principle lives at /method as “Human-in-the-loop on the fault line, not everywhere.” This post is the worked implementation of the full async queue version. The synchronous confirm-flag version is simpler and lives directly in the AI router's tool executor; it's a one-line allowlist of tools that require explicit confirmation before they fire.
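
For contrast with the queue, a synchronous confirm-flag check can be as small as the sketch below. The tool names in the allowlist, the `ToolCall` shape, and `executeTool` are hypothetical; the real mechanism lives in the tool executor and may differ in detail.

```typescript
// Hypothetical allowlist of tools that need explicit confirmation.
const REQUIRES_CONFIRMATION: ReadonlyArray<string> = ['submit_offer', 'cancel_order'];

interface ToolCall {
  name: string;
  args: Record<string, unknown>;
  confirmed?: boolean;   // set by the UI once the user confirms inline
}

type ToolOutcome =
  | { kind: 'needs_confirmation'; preview: Record<string, unknown> }
  | { kind: 'executed'; output: unknown };

// Return a preview instead of running when a listed tool is not yet confirmed.
function executeTool(call: ToolCall, run: (call: ToolCall) => unknown): ToolOutcome {
  if (REQUIRES_CONFIRMATION.indexOf(call.name) !== -1 && call.confirmed !== true) {
    return { kind: 'needs_confirmation', preview: call.args };
  }
  return { kind: 'executed', output: run(call) };
}
```

There is no table and no background state here: the same call is simply re-sent with `confirmed: true` once the user agrees.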

There is more on the methodology behind this pattern in The agentic engineering method, principle 04. The dispatch + audit pattern also shows up in the three-outages post (state-before-and-after as the universal pattern for “diverged silently from production” failures).