This is Part 2 of my series on agentic enforcement. Part 1 covered why rules files are theater and how hook-based enforcement at action boundaries changes that. This post goes deep into the actual machinery: the Python scripts, YAML configs, and adjudication logic that make it all work.
Where We Left Off
In Part 1 I described the three-layer architecture: hooks fire at action boundaries, validators contain the enforcement logic, and guard YAMLs define the rules as data. I showed that hooks by themselves are just event triggers — smoke detectors without sprinklers.
Now I want to open up each layer and show you what's actually inside.
The Directory Structure (and Why It's Not Arbitrary)
.agents/hooks/
├── executor.py
├── lib/
│   ├── adjudication_engine.py
│   ├── compliance_validators.py
│   ├── security_validators.py
│   ├── quality_validators.py
│   ├── orchestration_handlers.py
│   ├── token_circuit_breaker.py
│   └── ... (12 utility modules)
├── compliance/ (7 YAMLs)
├── security/ (7 YAMLs)
├── quality/ (4 YAMLs)
├── orchestration/ (4 YAMLs)
├── hook-manifest.yaml
└── rollout-config.yaml
The structure mirrors the four enforcement families. Each family has a lib/ Python module and a YAML config directory. The Bash scripts in the root are thin wrappers that handle shell-level plumbing and delegate to executor.py for actual logic.
executor.py: The Router
Every hook across all three platforms calls executor.py with JSON on stdin. The executor's job is straightforward:
- Parse the incoming JSON event
- Look up which validators apply (from hook-manifest.yaml)
- Run those validators, passing the event context
- Collect results (pass/fail/warn + messages)
- Run the adjudication engine to make a final decision
- Write the result as JSON to stdout with the appropriate exit code
The executor itself is maybe 80 lines of Python. It doesn't contain enforcement logic — it's a router.
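The routing loop above can be condensed into a sketch. Everything here is illustrative: the manifest shape, the validator signature, and the result schema are my inventions, not the toolkit's actual code, and the real executor reads the event from stdin and writes JSON to stdout.

```python
import json

# Hypothetical validator stub: the real ones live in the lib/ modules.
def ueah_check(event):
    return {"status": "pass", "message": ""}

# Hypothetical manifest shape: event type -> validator callables.
MANIFEST = {"afterFileEdit": [ueah_check]}

def adjudicate(results):
    # Simplified final decision: any "block" result wins.
    if any(r["status"] == "block" for r in results):
        return {"decision": "block", "exit_code": 2}
    return {"decision": "pass", "exit_code": 0}

def route(raw_event: str) -> dict:
    # Parse the event, look up applicable validators, run them, adjudicate.
    event = json.loads(raw_event)
    validators = MANIFEST.get(event.get("type"), [])
    results = [v(event) for v in validators]
    return adjudicate(results)
```

The point of keeping the router this thin is that adding a guard never touches it: you register the validator in the manifest and the loop picks it up.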
The Validator Modules: What They Actually Check
compliance_validators.py
Handles UEAH attribution and PARA cross-linking. The UEAH check triggers on any afterFileEdit event targeting CHANGELOG.md. It reads the diff, extracts new lines, and runs the regex from changelog-ueah.yaml. If the new content doesn't contain a valid UEAH tag, the validator returns block.
What makes this non-trivial is the edge cases I kept hitting in practice: agents that copy an old UEAH tag instead of generating a fresh one, agents that put the tag in a code block where it doesn't render, and agents that get the date format wrong. Each check exists because I hit that failure in production first.
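The core of the check is a few lines. The tag pattern below is a made-up placeholder — the real regex lives in changelog-ueah.yaml, which isn't reproduced here:

```python
import re

# Placeholder UEAH tag pattern; the actual format comes from changelog-ueah.yaml.
UEAH_RE = re.compile(r"\[UEAH:[A-Z0-9-]+ \d{4}-\d{2}-\d{2}\]")

def check_changelog_edit(new_lines: list[str]) -> dict:
    """Pass/block decision for an afterFileEdit targeting CHANGELOG.md."""
    joined = "\n".join(new_lines)
    if UEAH_RE.search(joined):
        return {"status": "pass"}
    return {"status": "block",
            "message": "CHANGELOG edits must include a valid UEAH tag."}
```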
security_validators.py
The most complex module. CLI command scanning reads cli-command-guard.yaml and applies pattern matching against shell commands. Unicode sanitization checks for zero-width joiners, bidirectional override characters, and homograph characters. Token exposure scanning uses both regex patterns (for known key formats like sk-..., AKIA..., ghp_...) and entropy analysis for random-looking strings in assignment contexts. Prompt injection detection scans for "ignore previous instructions", role-reassignment attempts, and system prompt extraction.
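The entropy half of the token scan might look like this. The prefix patterns echo the key formats named above, but the length cutoff and entropy threshold are illustrative guesses, not the module's actual tuning:

```python
import math
import re

# Known key prefixes (OpenAI-style, AWS access key, GitHub PAT).
KNOWN_KEY_RE = re.compile(
    r"\b(sk-[A-Za-z0-9]{20,}|AKIA[A-Z0-9]{16}|ghp_[A-Za-z0-9]{36})\b"
)

def shannon_entropy(s: str) -> float:
    # Bits per character over the string's own symbol distribution.
    if not s:
        return 0.0
    probs = [s.count(c) / len(s) for c in set(s)]
    return -sum(p * math.log2(p) for p in probs)

def looks_like_secret(value: str) -> bool:
    # Flag known key formats outright; otherwise flag long,
    # high-entropy strings typical of random tokens.
    if KNOWN_KEY_RE.search(value):
        return True
    return len(value) >= 20 and shannon_entropy(value) > 4.0
```

The two-pronged approach matters: regexes catch well-known formats with zero false positives, while entropy catches the long random strings that no prefix list will ever enumerate.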
quality_validators.py
The lightest module but the one my sanity depends on. Sound triggers play distinct audio notifications based on agent and event type. Context drift tracks semantic distance between original task and recent actions. This catches the classic "I asked you to fix the CSS and you're refactoring the database" failure mode.
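A crude stand-in for the drift metric, using word-set overlap in place of whatever semantic distance the real guard computes — Jaccard similarity is my substitution, not the toolkit's method:

```python
def drift_score(task: str, recent_actions: list[str]) -> float:
    """Token-overlap drift: 0.0 = on task, 1.0 = fully off task."""
    task_words = set(task.lower().split())
    action_words = set(" ".join(recent_actions).lower().split())
    if not task_words or not action_words:
        return 0.0
    # Jaccard similarity over word sets, inverted into a distance.
    overlap = len(task_words & action_words) / len(task_words | action_words)
    return 1.0 - overlap
```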
orchestration_handlers.py
Anti-spiral detection tracks action hashes over a sliding window and fires when it sees too many near-duplicates. Handoff validation ensures structured handoffs include required fields: task description, current state, files modified, blockers, and UEAH tag.
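The sliding-window hash idea sketches out like this. The window size and repeat threshold are illustrative defaults, not the toolkit's tuning:

```python
import hashlib
from collections import deque

class SpiralDetector:
    """Flag when too many near-duplicate actions land in a sliding window."""

    def __init__(self, window: int = 10, max_repeats: int = 3):
        self.recent = deque(maxlen=window)  # old hashes fall off automatically
        self.max_repeats = max_repeats

    def record(self, action: str) -> bool:
        # Hash the normalized action so long payloads compare cheaply.
        digest = hashlib.sha256(action.strip().lower().encode()).hexdigest()
        self.recent.append(digest)
        return self.recent.count(digest) >= self.max_repeats
```

The deque's maxlen does the window management for free: once the buffer is full, every append silently evicts the oldest hash.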
The Guard YAMLs: Schema and Conventions
Every guard config follows the same shape:
trigger: <event_type>
target_files: [<glob patterns>]
action: block | warn | monitor
message: "<human-readable>"
# Then family-specific fields:
pattern: "<regex>"
threshold: <number>
allowed_patterns: [...]
blocked_patterns: [...]
The consistency means validators share parsing code. One convention I wish I'd established earlier: every YAML has a message field that produces the exact text the agent sees on failure. Early on I was generating messages in Python code, which meant updating user-facing text required a code change.
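Shared parsing implies a common schema check somewhere. A minimal sketch of validating those common fields, with the field names taken from the shape above (the real shared code isn't shown in this post):

```python
# Common fields every guard YAML carries, per the schema above.
REQUIRED_FIELDS = {"trigger", "action", "message"}
VALID_ACTIONS = {"block", "warn", "monitor"}

def validate_guard(config: dict) -> list[str]:
    """Return a list of schema problems for one guard config (empty = ok)."""
    problems = [f"missing field: {f}"
                for f in sorted(REQUIRED_FIELDS - config.keys())]
    if config.get("action") not in VALID_ACTIONS:
        problems.append(f"invalid action: {config.get('action')!r}")
    return problems
```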
The Adjudication Engine: Gradual Rollout
adjudication_engine.py reads rollout-config.yaml, which maps every guard to its current enforcement level:
guards:
  changelog-ueah:
    level: enforce
    since: 2026-02-10
  para-links:
    level: warn
    since: 2026-02-15
  context-drift:
    level: monitor
    since: 2026-03-01
  anti-spiral:
    level: warn
    since: 2026-02-28
The rollout path: monitor (logs only, watch for false positives) → warn (agent sees message, action proceeds) → enforce (hard block). I've had guards stay in monitor for a month because the false positive rate was too high. Without the adjudication layer, I'd have had to choose between "deploy and break things" or "don't deploy at all."
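The level-to-decision mapping is simple enough to sketch. The return shape here is my invention, not the engine's actual output schema:

```python
def adjudicate_level(level: str, validator_failed: bool) -> dict:
    """Map a guard's rollout level to the final hook decision."""
    if not validator_failed:
        return {"decision": "pass", "show_message": False}
    if level == "monitor":
        # Log only; the agent never sees the message.
        return {"decision": "pass", "show_message": False, "log": True}
    if level == "warn":
        # Agent sees the message, but the action proceeds.
        return {"decision": "pass", "show_message": True}
    # enforce: hard block.
    return {"decision": "block", "show_message": True}
```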
The Federation Pipeline
Rather than maintain three platform-specific configs that inevitably drift, I have compile scripts:
- compile-hook-settings.py — reads hook-manifest.yaml, generates platform configs
- compile-constitution.py — assembles per-platform rule files from shared source
- validate-federation.sh — checks all generated configs are in sync
These run as part of the session-start hook. Every new session confirms all platforms are in sync.
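One way a sync check like validate-federation.sh could work is to hash a canonical serialization of each platform's compiled guard section and compare digests; the real script's method isn't shown in this post, so treat this as a sketch:

```python
import hashlib
import json

def configs_in_sync(platform_configs: dict[str, dict]) -> bool:
    """True if every platform's compiled guard section is identical.

    Canonical JSON (sorted keys) makes the hash independent of key order.
    """
    digests = {
        hashlib.sha256(json.dumps(cfg, sort_keys=True).encode()).hexdigest()
        for cfg in platform_configs.values()
    }
    return len(digests) <= 1
```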
Health Checks
- check-hook-integrity.py — every guard in manifest has a YAML config and a validator
- check-executor-integrity.py — all validator modules import without errors
- check-policy-drift.py — compiled configs match what manifest says they should contain
These run in CI and on-demand. If a health check fails, it means someone (probably me, probably at 2am) edited a validator without updating the manifest.
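The first of those checks reduces to set differences between three sources of truth. A sketch under that assumption — the function name and report format are mine, not check-hook-integrity.py's:

```python
def integrity_problems(manifest_guards: set[str],
                       yaml_guards: set[str],
                       validator_guards: set[str]) -> list[str]:
    """Cross-check manifest, YAML configs, and validators; empty = healthy."""
    problems = []
    for guard in sorted(manifest_guards - yaml_guards):
        problems.append(f"{guard}: in manifest but no YAML config")
    for guard in sorted(manifest_guards - validator_guards):
        problems.append(f"{guard}: in manifest but no validator")
    for guard in sorted(yaml_guards - manifest_guards):
        problems.append(f"{guard}: YAML config not registered in manifest")
    return problems
```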
The Running Inventory
The full stack is a few dozen files: four Python validator modules, about a dozen utility modules, a handful of Bash wrappers, over 20 YAML guard configs, federation compile scripts, and health checks. All stdlib Python; zero external dependencies. Every file registered in Notion with correct file type, path, and cross-database relations.
Let Me Know How You're Handling This
I genuinely want to know: if you're running AI agents in production, how are you handling enforcement?
- Pure rules files and hoping for the best?
- Custom hooks?
- Something I haven't thought of?
Hit me up at johnclick.ai or johnclick.dev.
Part 2 of the Agentic Enforcement series. Part 1: Markdown is Agent Enforcement Theater. Based on T-ADR-038 and the Agentic Developer Toolkit enforcement stack.
John Click is a DevOps / IT Platform Engineer building agentic governance infrastructure for enterprise AI agent deployments.
