Why Your Agentic AI Pentester Is Probably Just a Fancy Scanner — Ken Huang
Source: Agentic AI / Ken Huang Substack, 2026-05-04 · Post
Raw: raw/rss/2026-05-04-agentic-ai-why-your-agentic-ai-pentester-is-probably-just-a-fancy.md
Tier: 1 (agent architecture, security tooling)
TL;DR
Ken Huang dissects a Ridge Security benchmark of three agentic pentesters (RidgeGen, Shannon, Strix) on OWASP Juice Shop. All three used the same Gemini 3 Flash model, so the variable under test is system architecture, not model capability. The numbers expose three architectural failure modes:
- Belief state amnesia. Shannon and Strix treat each tool call as independent. RidgeGen maintains persistent belief state — when JWT alg:none is confirmed, the system updates its model of the application's authentication and reprioritizes its testing. Result: a cascade of 12 IDOR findings, mass assignment, vertical privilege escalation, all from one initial JWT bypass.
- Evidence validation as architectural invariant vs best-effort output. RidgeGen produced 55 findings, all evidence-backed: a 0% hallucination rate. Shannon produced 27 findings, of which 17 were unconfirmed (template descriptions of vulnerability classes): a 63% hallucination rate. The architecture either gates output on evidence collection or it does not.
- Semantic reasoning vs syntactic pattern matching. Only RidgeGen found the negative-quantity-basket race condition (the model has to understand the financial transaction model). Pattern-matching tools find SQL injection; semantic tools find business-logic violations.
Token efficiency: Shannon 2138K tokens per finding (with 63% requiring manual validation); RidgeGen 846K tokens per confirmed finding.
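The headline ratios can be rechecked from the reported counts. The sketch below is plain arithmetic on the figures quoted above. The last two values rest on an assumption the summary does not confirm: that Shannon's 2138K figure is per reported finding, across all 27.

```python
# Counts as quoted in the summary above.
shannon_total, shannon_unconfirmed = 27, 17
ridgegen_confirmed = 55

shannon_confirmed = shannon_total - shannon_unconfirmed    # 10
unconfirmed_rate = shannon_unconfirmed / shannon_total     # about 0.63
confirmed_ratio = ridgegen_confirmed / shannon_confirmed   # 5.5

# ASSUMPTION: 2138K tokens is per *reported* Shannon finding (all 27).
shannon_k_per_reported = 2138
ridgegen_k_per_confirmed = 846
shannon_k_per_confirmed = shannon_k_per_reported * shannon_total / shannon_confirmed
cost_ratio = shannon_k_per_confirmed / ridgegen_k_per_confirmed

print(f"{unconfirmed_rate:.0%}")          # 63%
print(confirmed_ratio)                    # 5.5
print(round(shannon_k_per_confirmed))     # 5773
print(round(cost_ratio, 1))               # 6.8
```

Under that assumption, Shannon's cost per confirmed finding is roughly 5.8M tokens, close to seven times RidgeGen's.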
Why it matters
This is the cleanest empirical demonstration the wiki has of the "harness > model" claim. GTA-2 (04-20) named the principle abstractly. The benchmark Huang dissects instantiates it concretely: three architectures, the same model held constant, and a gap of more than 5x in evidence-backed findings (55 for RidgeGen versus 10 confirmed for Shannon).
For Tier 1 routing and agent design, the implications are direct:
- Belief state is the missing routing input. Step-Level Optimization (05-02) routes based on trajectory state (Stuck/Milestone monitors); these monitors require a belief state to evaluate. Most production routers do not maintain it explicitly.
- Evidence-validation-as-invariant is what AHE (05-04) calls a contract: the architecture's output is gated on a structural property, not a probabilistic check. AHE gates harness decisions on benchmark contracts; RidgeGen gates findings on execution evidence. Same architectural primitive.
- Cascading exploitation is the security analog of Step-Level Optimization's escalation: confirm a vulnerability → reprioritize the search. Routing to escalate and routing to expand are dual operations on the same trajectory state.
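The confirm-then-reprioritize loop described above can be sketched in a few lines. This is a minimal illustration, not RidgeGen's implementation, which the post does not publish. The `BeliefState` class, the `FOLLOW_UPS` rule table, the test names, and the priorities are all hypothetical.

```python
from __future__ import annotations

import heapq
from dataclasses import dataclass, field

# Hypothetical propagation rules: confirmed fact -> follow-up tests it unlocks.
FOLLOW_UPS = {
    "jwt_alg_none_accepted": [
        ("idor_sweep", 90),
        ("vertical_privilege_escalation", 85),
        ("mass_assignment", 80),
    ],
}


@dataclass
class BeliefState:
    """Confirmed facts about the target plus a prioritized queue of tests."""
    confirmed: set[str] = field(default_factory=set)
    # Min-heap of (negated priority, test name), so higher priority pops first.
    _queue: list[tuple[int, str]] = field(default_factory=list)

    def schedule(self, test: str, priority: int) -> None:
        heapq.heappush(self._queue, (-priority, test))

    def confirm(self, fact: str) -> None:
        """Record a confirmed fact and enqueue the follow-ups it unlocks."""
        if fact in self.confirmed:
            return
        self.confirmed.add(fact)
        for test, priority in FOLLOW_UPS.get(fact, []):
            self.schedule(test, priority)

    def next_test(self) -> str | None:
        return heapq.heappop(self._queue)[1] if self._queue else None


state = BeliefState()
state.schedule("sql_injection_scan", 50)
state.schedule("jwt_alg_none_check", 60)

order = []
while (test := state.next_test()) is not None:
    order.append(test)
    if test == "jwt_alg_none_check":
        # Simulated success. A stateless tool would log this and move on.
        state.confirm("jwt_alg_none_accepted")

print(order)
```

In this toy run the JWT confirmation pushes three follow-up tests ahead of the generic SQL injection scan, which is the reprioritization a tool with independent tool calls cannot perform.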
Connections
- GTA-2 (2026-04-20) — execution harness dominates model capability. RidgeGen vs Shannon vs Strix at constant model is the cleanest experimental confirmation.
- AHE (2026-05-04) — contract-based decisions. Evidence-validation-as-invariant is the security instantiation.
- Defense Trilemma (2026-05-04) — Layer 2 (Agent Orchestration Layer) failure modes are exactly belief-state amnesia and absence of trust propagation. The trilemma argues no single defense is complete; this article shows that even on the offense side, single-architecture systems miss compound vulnerabilities.
- Step-Level Optimization (2026-05-02) — trajectory-aware routing in computer-use agents. The trust-propagation pattern Huang describes is structurally identical: confirm an event, update belief state, reprioritize. Two domains, one mechanism.
- Ken Huang Ch 14 routing + Ch 15 structured output (2026-05-01/04) — the same author's harness architecture series. This piece is the application of those harness principles to a security domain. The Ch 14 fallback chain and Ch 15 schema-identity caching are infrastructure-level architectural invariants; RidgeGen's belief-state propagation is the application-level architectural invariant.
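The evidence-validation invariant that recurs in these connections can be expressed as a type constraint rather than a post-hoc filter. The sketch below assumes hypothetical `Evidence` and `Finding` types and a placeholder endpoint. It shows one way to make an unevidenced finding unrepresentable, and is not taken from any of the benchmarked tools.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical record of an executed request and the observed response."""
    request: str
    response_excerpt: str


@dataclass(frozen=True)
class Finding:
    title: str
    evidence: tuple[Evidence, ...]

    def __post_init__(self) -> None:
        # The invariant: a Finding without execution evidence cannot exist.
        if not self.evidence:
            raise ValueError(f"finding {self.title!r} has no execution evidence")


def gate(
    candidates: list[tuple[str, list[Evidence]]],
) -> tuple[list[Finding], list[str]]:
    """Split model-proposed candidates into reportable findings and rejects."""
    reported: list[Finding] = []
    rejected: list[str] = []
    for title, evidence in candidates:
        try:
            reported.append(Finding(title, tuple(evidence)))
        except ValueError:
            rejected.append(title)
    return reported, rejected


candidates = [
    # Placeholder request text; not a real Juice Shop endpoint.
    ("JWT alg:none accepted",
     [Evidence("GET /hypothetical/endpoint with unsigned token", "200 OK")]),
    ("Possible SQL injection (template description only)", []),
]
reported, rejected = gate(candidates)
print([f.title for f in reported])
print(rejected)
```

Because the check lives in the constructor, every code path that emits a `Finding` inherits the gate. That is the difference between a structural property and a best-effort check applied at report time.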
Research angle (Tier 1)
- Belief state representation as a measurable harness property. Today every harness either has it or does not, but no public standard exists for "what counts as belief state." A formal definition (data structures, propagation rules, query interface) would let researchers compare harnesses on this dimension directly.
- Trust propagation in non-security agents. The cascading exploitation pattern works in pentesting because vulnerabilities compose. Whether the same propagation pattern transfers to non-security domains (debug-then-test, read-then-edit, plan-then-act) is largely unmeasured.
- Architecture-vs-model decomposition methodology. Ridge Security's same-model-different-architecture methodology is the right experimental design for harness research; it should become standard. Most agent benchmarks today vary both axes simultaneously.
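The same-model-different-architecture design in the last bullet amounts to a simple check on an experiment grid: exactly one axis may vary. The sketch below is illustrative only. The `Run` type and helper names are hypothetical, and the confounded example uses made-up labels.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    architecture: str
    model: str


def varied_axes(runs: list[Run]) -> set[str]:
    """Return the axes that take more than one value across the runs."""
    axes = set()
    if len({r.architecture for r in runs}) > 1:
        axes.add("architecture")
    if len({r.model for r in runs}) > 1:
        axes.add("model")
    return axes


def is_controlled(runs: list[Run]) -> bool:
    """True when exactly one axis varies, so gaps are attributable to it."""
    return len(varied_axes(runs)) == 1


# The benchmark's design: three architectures, one model.
ridge_design = [Run(a, "Gemini 3 Flash") for a in ("RidgeGen", "Shannon", "Strix")]
# A confounded design: both axes change at once (labels are made up).
confounded = [Run("harness-a", "model-x"), Run("harness-b", "model-y")]

print(varied_axes(ridge_design), is_controlled(ridge_design))
print(varied_axes(confounded), is_controlled(confounded))
```

A benchmark that fails this check can still rank systems, but it cannot say whether the harness or the model produced the gap.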
Open questions
- The benchmark is single-run on a single target (OWASP Juice Shop). Variance and generalization to real-world targets are unaddressed.
- Ridge Security, which ran the benchmark, sponsored the post. The disclosure is upfront, but the comparative numbers should be replicated by independent evaluators before they become canonical.
- Whether RidgeGen's architecture transfers without the benchmark's specific affordances (Docker sandbox, isolated network, stable API surface) is open.
- Belief state representation is described conceptually but not in code; the load-bearing technical details (data structure, propagation rules) are not in the public post.