Production engineering.
Now agentic.
Ship AI in production without shipping the same failure twice. Agentic engineering for software teams: a method with an audit trail by default, human-review gates on the fault line, and a rule library that survives every new session. Built for teams using AI today and shipping agents tomorrow.
Treat agents like senior engineers.
Same governance you'd apply to any senior IC: audit trail by default, human review on the fault line, rules that codify what you've learned the hard way. The agentic engineering method is this principle, worked out into seven concrete practices and a rule library that travels with us between codebases.
Each rule answers a specific class of failure: production outages, agent-behavior failures, review-gate misses. Each one takes the form “don't do X, here's what taught us why,” with code-level or session-level enforcement where possible. Like this:
# Read the files, don't just skim the index.
# Why
# Every new AI session auto-loads the index, but NOT
# the rule files linked from it. The session sees a
# one-line summary and thinks it knows the rule.
# Summaries are pointers; the rule lives in the file.
# How the setup enforces it
# 1. The index (docs/context/INDEX.md) auto-loads
# on every new session via the project's CLAUDE.md
# import.
# 2. Each entry links to a rule file with a summary
# that ends "read the file, not just the line."
# 3. Before any diff, copy change, or factual claim,
# the session reads every rule whose summary
# overlaps the user's current topic.
# 4. Writes only start after that reconciliation pass
# completes. No exception for "obvious" requests.
# What goes wrong without it
# Session skims an index entry, generates code or
# copy that silently violates the rule, ships. The
# failure surfaces in review (best case) or in
# production (worst case). Cost compounds because the
# next session re-makes the same skim.
# Enforcement
# - CLAUDE.md flags this rule by name on turn one
# - INDEX.md summaries explicitly say "read the file"
# - Every observed violation produces a new rule;
# the trap codifies itself
The full method (seven principles, 50+ rules in the PickNDeal codebase, 75+ across our libraries) is documented at the method page. The rule library lives across our production codebases and is the deliverable in every client engagement. Production-specific extensions stay inside the engagement.
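The reconciliation pass in the rule above (read every rule file whose index summary overlaps the current topic, before any write) can be sketched roughly as follows. This is a minimal illustration, not the production setup: the index line format, the `rules_to_read` and `reconcile` helpers, and the keyword-overlap heuristic are all assumptions made for the example. Only the docs/context/INDEX.md location comes from the rule itself.

```python
import re
from pathlib import Path

# ASSUMED index line format (hypothetical, not taken from the real INDEX.md):
#   - [rule-name](relative/path.md): one-line summary
ENTRY = re.compile(r"^- \[(?P<name>[^\]]+)\]\((?P<path>[^)]+)\): (?P<summary>.+)$")


def parse_index(index_text):
    """Return (name, path, summary) for every line in the assumed format."""
    entries = []
    for line in index_text.splitlines():
        match = ENTRY.match(line.strip())
        if match:
            entries.append((match["name"], match["path"], match["summary"]))
    return entries


def words(text):
    """Lowercased words longer than three characters (a crude relevance signal)."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def rules_to_read(index_text, topic):
    """Pick every rule whose name or summary shares a word with the topic."""
    topic_words = words(topic)
    return [
        (name, path)
        for name, path, summary in parse_index(index_text)
        if topic_words & words(name + " " + summary)
    ]


def reconcile(index_text, topic, root="docs/context"):
    """Read the full rule files, not the summaries, before any write starts."""
    return {
        name: (Path(root) / path).read_text(encoding="utf-8")
        for name, path in rules_to_read(index_text, topic)
    }
```

Keyword overlap is deliberately naive here. The point of the sketch is the ordering: the full files are loaded first, and nothing is written until that pass completes.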
Every new AI session is another brand-new hire.
Your team has coding standards. Your AI assistants don't read them.
Every reset, every new session, every new contributor's first day on the codebase: the AI you're paying for starts blind. It doesn't know the bug you caught in PR #142. It doesn't know the architectural decision from last quarter. It doesn't know which library you've already ruled out. The senior engineer who caught the failure mode in PR #142 isn't in the loop on PR #189 when the same failure ships again.
AI speeds up implementation. Fixing the regressions introduced by AI suggestions can offset the gains.
You don't fix this with more context tokens or smarter prompts. You fix it with a context system that survives every reset, every new contributor, and every fork of the codebase: a learning system that captures the patterns and prevents recurrence.
That's agentic engineering done right, and it's what this practice ships.
Three references. Every session reads all three. No rule survives in only one place.
Per-session context. Read on turn one, every session. Subagents inherit. No memory of previous sessions.
Cross-project + per-project memory. Survives compaction. Updated when a failure mode produces a new rule.
Repo-versioned docs/context/. Travels with git clone. Survives fresh contributors, forks, and environments.
Failure mode to rule pipeline. Every observed failure mode becomes a rule the next session reads, before the next pull request, not after the next postmortem.
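As a rough sketch of the failure-mode-to-rule pipeline, the snippet below turns one observed failure into a rule file plus an index entry under a repo-versioned context directory. The `FailureMode` fields, the `rules/` subfolder, the file naming, and the index line format are hypothetical choices for illustration. The section headings mirror the sample rule shown earlier on this page.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FailureMode:
    slug: str           # hypothetical file-name stem, e.g. "read-the-files"
    rule: str           # the "don't do X" statement
    why: str            # what taught us why
    how_enforced: list  # ordered steps (strings)
    without_it: str     # what goes wrong without the rule
    enforcement: list   # enforcement checks (strings)


def render_rule(f):
    """Render a rule file using the section headings of the sample rule."""
    steps = "\n".join(f"{i}. {s}" for i, s in enumerate(f.how_enforced, 1))
    checks = "\n".join(f"- {c}" for c in f.enforcement)
    sections = [
        f.rule,
        "Why\n" + f.why,
        "How the setup enforces it\n" + steps,
        "What goes wrong without it\n" + f.without_it,
        "Enforcement\n" + checks,
    ]
    return "\n\n".join(sections) + "\n"


def index_line(f):
    """One-line index summary that points back at the full file (assumed format)."""
    return f"- [{f.slug}](rules/{f.slug}.md): {f.rule} Read the file, not just the line."


def codify(f, context_dir="docs/context"):
    """Write the rule file and append its index entry if it is not already there."""
    ctx = Path(context_dir)
    (ctx / "rules").mkdir(parents=True, exist_ok=True)
    (ctx / "rules" / f"{f.slug}.md").write_text(render_rule(f), encoding="utf-8")
    index = ctx / "INDEX.md"
    existing = index.read_text(encoding="utf-8") if index.exists() else ""
    if existing and not existing.endswith("\n"):
        existing += "\n"
    line = index_line(f)
    if line not in existing.splitlines():
        index.write_text(existing + line + "\n", encoding="utf-8")
```

Because the output lands in the repository, the new rule is committed and reviewed like any other change, and it travels with git clone.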
Six things your AI tooling does differently the day you install this.
Agentic engineering isn't documentation. It's the behavior change in every AI session that reads the documentation. All six are observable on day one.
Deeper analysis on turn one
Sessions skip the clarifying questions. The cheap-to-answer questions (constraint stack, what has been tried, what has been ruled out) are pre-answered. Every session goes straight to substantive work.
AI sessions stop billing for ramp-up time.
Failure mode recognition
The third time a failure mode is about to ship, the session recognizes it from the rule file. Same root cause, different surface, caught before merge instead of after the postmortem.
The third occurrence stops shipping to production.
Decisions stay decided
Sessions don't relitigate “should we ship X” every quarter. The decision lives in writing; the next session reads it; the answer stays consistent across rotations of contributors.
Quarterly relitigation cost approaches zero.
Hallucination defense
When a rule says “don't claim features that aren't shipped,” the session doesn't generate plausible-sounding-but-false copy. The cheapest hallucination to prevent is the one the session never makes.
Code review burden on the lead engineer drops.
Multi-file relevance
One trigger phrase pulls every relevant rule. The session reads the graph, not a single file: ask for a translation and it picks up the i18n, brand, ship-only-real, and competitor rules all at once.
One trigger phrase, all the relevant rules.
Survives compaction
Conversation context windows fill up and reset. The rules don't. The next session opens with the same constraints as the last, and the one before that.
Long sessions don't degrade.
Every rule in our library exists because we observed the failure mode on a real production system (usually PickNDeal) and chose to codify the prevention. The library compounds. So does the behavior change.
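The "multi-file relevance" behavior above, where one trigger phrase pulls in a graph of rules, could be modeled along these lines. It is an illustrative sketch only: the `TRIGGERS` and `LINKS` tables and the way the rules link to each other are invented for the example. The rule names are the ones mentioned in the translation example.

```python
from collections import deque

# HYPOTHETICAL data: which phrase fragments trigger which entry rules,
# and which rules link to which other rules.
TRIGGERS = {"translat": ["i18n"]}
LINKS = {
    "i18n": ["brand", "ship-only-real"],
    "brand": ["competitor"],
    "ship-only-real": [],
    "competitor": ["brand"],  # cycles are fine; each rule is visited once
}


def relevant_rules(request, triggers=TRIGGERS, links=LINKS):
    """Breadth-first walk from every triggered rule across its links."""
    text = request.lower()
    queue = deque(
        rule
        for fragment, rules in triggers.items()
        if fragment in text
        for rule in rules
    )
    seen = set()
    ordered = []
    while queue:
        rule = queue.popleft()
        if rule in seen:
            continue
        seen.add(rule)
        ordered.append(rule)
        queue.extend(links.get(rule, []))
    return ordered


print(relevant_rules("Translate the landing page into French"))
```

With the sample tables, a translation request reaches all four rules even though only one of them is triggered directly.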

PickNDeal: building an AI-native multi-supplier marketplace
A B2B + D2C food marketplace where AI agents handle supplier offers, group orders, and operations end-to-end. MCP-native by design. Buyers, suppliers, and their existing CRMs all use the same API surface.
Read the case study
Live production: pickndeal.app

PayoutKit: extracting a product agentically
A hardened Stripe Connect module, extracted from PickNDeal using the agentic engineering method. The technique applies to any reusable functionality you want to lift out of a running operational system.
Read the case study
The engineering rule library: how a methodology compounds across client engagements
The asset that survives every reset, contributor change, and fork. File shape, index pattern, session-start enforcement loop. Worked examples from PickNDeal (50+ rules) and the cross-project layer (29 rules that travel between codebases).
Three production failure modes from one schema migration
Drizzle generates queries against the schema your code was deployed with, not the schema in production. Three 500s in a row taught us why schema-touch detection has to gate the deploy.
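One way schema-touch detection could gate a deploy is sketched below. The case study's actual mechanism is not described here, so treat this purely as an assumption-laden illustration: the path patterns, the `schema_touches` and `deploy_allowed` helpers, and the `migrations_applied` flag are all hypothetical.

```python
from fnmatch import fnmatch

# ASSUMED repository layout; adjust to wherever the schema and migrations live.
SCHEMA_PATTERNS = ["drizzle/*.sql", "src/db/schema.ts", "src/db/schema/*.ts"]


def schema_touches(changed_files, patterns=SCHEMA_PATTERNS):
    """Return the changed files that match any schema pattern."""
    return [f for f in changed_files if any(fnmatch(f, p) for p in patterns)]


def deploy_allowed(changed_files, migrations_applied):
    """Block the deploy when the schema changed but production has not been migrated."""
    touched = schema_touches(changed_files)
    if touched and not migrations_applied:
        return False, touched
    return True, touched


ok, touched = deploy_allowed(["src/db/schema.ts", "README.md"], migrations_applied=False)
print(ok, touched)
```

In a pipeline, `changed_files` would typically come from `git diff --name-only` between the deployed commit and the candidate, so the gate runs before the new code ever issues a query against a schema production does not have yet.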