Implement a queue where confirmed PoV-validated patches and *speculative* patches

---

## 6. Threat-Model-Driven LLM Audits (Invariant-First)

When the “needle” is a single invariant violation buried in a huge codebase, long prompts become the haystack. A practical operating point is **minimal scaffolding** + **targeted slice audits** + **verification loops** that force concrete evidence.

Workflow:
- Collect prior disclosures/advisories for the project and distill likely bug classes.
- Build a compact threat model: attacker capabilities, entry points, trust boundaries, and high-risk operations (authz, parsing, deserialization, templating, native bindings).
- Pick a thin slice aligned to one boundary and have the model map attacker-controlled inputs to sensitive sinks, including the exact call chain and guards.
- Derive invariants and search for violations, inconsistent enforcement, and fix-bypass candidates.
- Verify with tests, harnesses, fuzzers, sanitizer builds, static analysis, and grep-based invariant checks. Do not accept “model says it’s vulnerable.”
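
The grep-based invariant check in the last step can be sketched as a small script. This is a minimal illustration, not a real tool: the invariant ("every `handle_write*` function calls `require_admin`"), the guard name, and the sample code are all hypothetical.

```python
import re

# Hypothetical invariant: every function named handle_write* must
# invoke the require_admin() guard somewhere in its body.
GUARD = "require_admin"
HANDLER_RE = re.compile(r"def (handle_write\w*)\(.*?\):\n(.*?)(?=\ndef |\Z)", re.S)

def audit_source(source: str) -> list[str]:
    """Return names of write handlers that never call the guard."""
    violations = []
    for name, body in HANDLER_RE.findall(source):
        if GUARD not in body:
            violations.append(name)
    return violations

sample = '''
def handle_write_config(req):
    require_admin(req)
    save(req.body)

def handle_write_user(req):
    save(req.body)   # BUG: missing guard
'''
print(audit_source(sample))  # -> ['handle_write_user']
```

Hits from a check like this are only leads: each flagged handler still needs a concrete request or test proving the guard is actually bypassable.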

Invariant patterns worth checking (examples):
- **Privilege tier confusion**: a broad predicate (e.g., `isMaster`) is checked but narrower restrictions (e.g., read-only tier) are never enforced in write paths.
- **JWT/JWKS algorithm confusion**: fallback to HS256 when no algorithm is pinned, or trusting attacker-controlled `header.alg` when a JWK lacks an `alg` field.
- **Fail-open validation**: partial config disables checks (e.g., `aud`), or verification state starts as `true` in rotated-secret loops so “no match” still passes.
- **Egress control gaps**: syscall hooks miss `sendto`/`sendmsg`/`sendmmsg`, DNS over TCP parsing only inspects the first message, or domain allowlists do not bind IPs to the originating domain.
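
The fail-open pattern is worth seeing concretely. A minimal sketch of a rotated-secret MAC verifier (secrets and messages are made up) shows how a verification flag initialized to `True` makes "no match" pass:

```python
import hashlib
import hmac

SECRETS = [b"current-secret", b"previous-secret"]  # hypothetical rotation window

def verify_buggy(message: bytes, tag: str) -> bool:
    verified = True                      # BUG: fail-open initial state
    for secret in SECRETS:
        mac = hmac.new(secret, message, hashlib.sha256).hexdigest()
        if hmac.compare_digest(mac, tag):
            break
        # mismatch never flips the flag to False
    return verified                      # True even when nothing matched

def verify_fixed(message: bytes, tag: str) -> bool:
    for secret in SECRETS:
        mac = hmac.new(secret, message, hashlib.sha256).hexdigest()
        if hmac.compare_digest(mac, tag):
            return True
    return False                         # fail closed: no match means reject
```

The diff between the two versions is exactly the kind of invariant violation ("verification must default to reject") a slice audit should surface.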

Prompt frames that bias toward exploitability:
- Assert a vulnerability exists and ask for a concrete PoC request or payload.
- Invert the question (“how would you break this?”).
- Compare against a known-good pattern (“how does this differ from a secure JWT verifier?”).
- Constrain the attacker model explicitly (remote unauthenticated vs. low-priv authenticated).
- Iterate with “what else?” to push past obvious issues.
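
These frames are easy to template so each slice gets the full battery. A hypothetical sketch (frame wording and parameter names are illustrative):

```python
# Exploitability-biased prompt frames; {placeholders} are filled per slice.
FRAMES = [
    "Assume this code is vulnerable. Give a concrete PoC {artifact} that triggers it.",
    "How would you break this {component}?",
    "How does this {component} differ from a known-good {reference_impl}?",
    "Attacker model: {attacker}. What can they reach from the entry points?",
    "What else? List issues beyond the ones already found.",
]

def build_prompts(component: str, attacker: str, artifact: str = "request",
                  reference_impl: str = "secure reference implementation") -> list[str]:
    """Expand every frame for one slice under audit."""
    ctx = dict(component=component, attacker=attacker,
               artifact=artifact, reference_impl=reference_impl)
    return [frame.format(**ctx) for frame in FRAMES]

for prompt in build_prompts("JWT verifier", "remote unauthenticated"):
    print("-", prompt)
```

Running all frames against the same slice and diffing the answers is a cheap way to push the model past its first, most obvious finding.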

Rule of thumb: keep stable scaffolding under `10%` of tokens, spend `60-80%` on focused slice audits, and `20-30%` on verification loops.
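
As arithmetic, the rule of thumb splits a context budget roughly like this (the midpoints chosen below are one reasonable reading of the suggested ranges, not fixed values):

```python
def split_budget(total_tokens: int) -> dict[str, int]:
    """Split a token budget per the rule of thumb above."""
    return {
        "scaffolding": int(total_tokens * 0.10),   # stable system prompt, <=10%
        "slice_audit": int(total_tokens * 0.70),   # midpoint of 60-80%
        "verification": int(total_tokens * 0.20),  # lower end of 20-30%
    }

print(split_budget(100_000))
# -> {'scaffolding': 10000, 'slice_audit': 70000, 'verification': 20000}
```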

Example prompt:
```prompt
SYSTEM: You are a red team operator.
USER: Assume this handler is vulnerable. Give a concrete request that violates the invariant "only admins can call X". Include the call chain, the missing guard, and why the attacker model can reach it.
```

---

## Putting It All Together
An end-to-end CRS (Cyber Reasoning System) may wire the components like this:

(Mermaid `graph TD` diagram collapsed in this diff view.)
## References
* [Trail of Bits – AIxCC finals: Tale of the tape](https://blog.trailofbits.com/2025/08/07/aixcc-finals-tale-of-the-tape/)
* [CTF Radiooo AIxCC finalist interviews](https://www.youtube.com/@ctfradiooo)
* [Needle in the haystack: LLMs for vulnerability research](https://devansh.bearblog.dev/needle-in-the-haystack/)
{{#include ../banners/hacktricks-training.md}}