Ken Huang Ch 15 — Structured Output and Schema-Constrained Generation
Source: Ken Huang / Agentic AI Substack Raw: raw/rss/2026-05-02-agentic-ai-chapter-15-structured-output-and-schema-constrained-gen.md URL: https://kenhuangus.substack.com/p/chapter-15-structured-output-and Date: 2026-05-02 Tier: 1 — agent harness architecture
TL;DR
Fifth chapter of Ken Huang's Claude-Code-vs-Hermes series (after Ch 8 memory, Ch 12 skills, Ch 13 MCP, Ch 14 routing). Both harnesses solve schema-constrained output through tool-use forcing — wrap the JSON schema as a fake tool, force the model to call it. Claude Code: SyntheticOutputTool with Ajv compile, identity-cached by schema reference (~110ms → ~4ms on 80-call workflows), MAX_STRUCTURED_OUTPUT_RETRIES=5, structured-output retries don't eat the agent's tool budget, child agents have the synthetic tool stripped from their tool list. Hermes: extract_structured() with provider-agnostic tool-choice forcing, JSONL-format trajectory output as infrastructure-level structured output via batch_runner.py parallel multiprocessing.Pool.
Key claims
- Tool-use forcing is the portable structured-output mechanism. Both harnesses converge on it — they reuse the model's already-trained tool-calling reliability rather than depending on provider-specific JSON-mode flags.
- Claude Code's
SyntheticOutputTooldesign details:SYNTHETIC_OUTPUT_TOOL_NAME = 'StructuredOutput'- Ajv compiled once and identity-cached per schema object reference; 80-call workflows go from ~110ms to ~4ms total Ajv overhead
MAX_STRUCTURED_OUTPUT_RETRIESdefaults to 5, env-overridableerror_max_structured_output_retriesis a typed result subtype distinct from "max turns" or "budget exceeded"- Hook (
registerStructuredOutputEnforcement) prevents stop without calling the synthetic tool SYNTHETIC_OUTPUT_TOOL_NAMEis filtered from agent subtools — child agents don't inherit the parent's output contract
- Hermes's structured-output design details:
ModelCapabilities.structured_outputflag gates which mechanism is available per model- Validation feedback as conversation messages (richer self-correction context vs Claude Code's tool-call exception)
batch_runner.pyruns structured extraction across thousands of inputs in parallel viamultiprocessing.Pool;_normalize_tool_stats()enforces fixed schema across JSONL batch entries — infrastructure-level structured output, not just model-level
Why this matters for cere-bro
Continuation of the most detailed public production-harness reference series available. Three things this chapter clarifies that prior chapters did not:
- Structured output is a routing decision, too. Hermes's
ModelCapabilities.structured_outputflag is a per-model feature; routing to a model that lacks it requires a different mechanism. This is a sub-axis of Ch 14 routing. - The retry budget is composed. Structured-output retries are excluded from the regular tool-call budget in Claude Code. The agent's tool budget × structured-output retry budget × Hermes's per-turn fallback chain × Step-level Optimization escalation budget all interact. No paper has written down the joint budget composition.
- Trajectory format as structured output is the unstated infrastructure thesis. Hermes's JSONL trajectory is itself a structured-output contract. This is what makes downstream RL post-training, replay, and evaluation tractable. Trajectory-as-structured-output is the precondition for trajectory-aware routing, which Step-level Optimization (05-02) formalizes one level up.
Connections to prior wiki pages
- Ken Huang Ch 14 — Routing (05-01) — direct continuation. Ch 14 formalized provider/tier routing; Ch 15 formalizes the output-shape constraint orthogonal to routing.
- Ken Huang Ch 13 — MCP (05-01) — MCP servers expose tools whose input schemas are themselves structured-output contracts. Ch 15's mechanism applies one level up: forcing the agent's reply to follow a schema.
- Step-level Optimization (05-02) — the Stuck/Milestone monitors read trajectory format. JSONL trajectory schema enables monitor training.
- Synthetic Computers at Scale (05-01) — synthetic-environment training depends on structured trajectory schemas; Ch 15's batch runner is the engineering analog for parallel data generation at scale.
Research angles
- Schema-aware routing. A router that knows model A supports
response_formatnatively but model B requires tool-use forcing has more options at the dispatch layer. Hermes's flag exists; nobody has built the router that consumes it. - Trajectory-format standardization. ShareGPT-format JSONL is the de facto standard, but no formal schema with tool-call provenance, KV-cache markers, or retry-budget annotations exists. The first standardized trajectory schema would unlock cross-harness replay and benchmarking.
- Joint-budget composition. Tool-budget × output-retry-budget × fallback-chain-depth × step-level-escalation-budget. These are independent today; an agent that exhausts one but has slack in another should rebalance. Operations research problem nobody has formalized.