diff --git a/README.md b/README.md
index bd2a023..6d99bd8 100644
--- a/README.md
+++ b/README.md
@@ -12,6 +12,17 @@ The product vision for this tooling is proposed in [`propose/PRODUCT-VISION.md`]
 > for the assumptions this MCP makes about a Java repo (annotations, DI patterns,
 > service layout, naming) and a per-file map of where to edit the bundle if you
 > can't or don't want to refactor your codebase to match.
+>
+> **Driving this MCP from an agent:**
+> - [`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md) — copy-paste-into-`QWEN.md` /
+>   `CLAUDE.md` block. Forced reasoning preamble, decision tree, full
+>   reference for all 22 tools, ontology glossary (v9), recovery playbook,
+>   slash-style aliases. Engineered for weak / mid models that otherwise
+>   pick the wrong tool.
+> - [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md)
+>   — 7-phase agent-driven verification you run after indexing your real
+>   project. Each item has a copy-paste prompt and calibration data from
+>   `tests/bank-chat-system`.
 
 ## 1. Install
 
diff --git a/docs/AGENT-GUIDE.md b/docs/AGENT-GUIDE.md
new file mode 100644
index 0000000..623e84a
--- /dev/null
+++ b/docs/AGENT-GUIDE.md
@@ -0,0 +1,506 @@
+# Agent Guide — `java-enterprise-codebase-rag` MCP
+
+> **How to use this file.** Copy the block between the `<!-- BEGIN/END
+> user-rag MCP guide -->` markers below into your project's `QWEN.md`,
+> `CLAUDE.md`, `AGENTS.md`, or equivalent. The block is self-contained:
+> all 22 MCP tools, the ontology glossary (v9), a forced reasoning
+> preamble, a decision tree, a recovery playbook, and slash-style prompt
+> aliases. Re-pull it from this repo whenever the ontology version is bumped.
+>
+> Why this exists: weak / mid models pick the wrong tool, pass simple
+> names where FQNs are required, or ask vector search for things the
+> graph already knows exactly. This guide is engineered to keep them on
+> the rails.
+>
+> Calibrated against ontology version **9** (see `java_ontology.py`).
+
+---
+
+<!-- BEGIN user-rag MCP guide -->
+
+## user-rag MCP — agent operating manual
+
+This MCP indexes Java enterprise projects into two stores:
+
+- **LanceDB** — vector + hybrid search over Java/SQL/YAML chunks, scoped
+  by role / capability / module / microservice.
+- **Kuzu graph** — exact symbol graph with edges `EXTENDS`, `IMPLEMENTS`,
+  `INJECTS`, `DECLARES`, `CALLS`, `EXPOSES`, `HTTP_CALLS`, `ASYNC_CALLS`,
+  plus `Route` nodes for inbound endpoints (HTTP, Kafka, Feign, …).
+
+**Use this MCP when** the user asks anything that needs whole-codebase
+context: "who calls X", "what handles route Y", "trace the flow when Z
+happens", "what breaks if I change this", "where is concept C
+implemented", "review this PR diff for blast radius".
+
+**Do NOT use this MCP when** the answer is fully visible in the file the
+user is currently editing, or when the question is about a third-party
+library you can answer from training data. Prefer the cheapest tool that
+answers the question.
+
+### Forced reasoning preamble (every tool call)
+
+Before every MCP tool call, output **two short lines** with this shape:
+
+```
+Q-class: <semantic | exact-symbol | route | call-graph | impact | pr | diagnostic>
+Pick: <tool_name>  Why: <≤8 words>
+```
+
+Then, **before issuing the call**, sanity-check arguments against
+*Argument shapes* below: arrays must be JSON arrays (not stringified),
+method needles must be `pkg.Type#method(SimpleArg1,SimpleArg2)`, and
+path templates must be the normalised servlet form. Most weak-model
+failures here are not wrong-tool-choice but wrong-argument-shape.
+
+Then make the tool call. If the first call returns nothing useful, do
+**not** loop the same tool with random tweaks — go to **Recovery
+playbook** at the bottom of this guide.
+
+### Argument shapes — what the parser actually wants
+
+Three classes of mistakes (JSON typing, method needles, path templates) burn
+the most calls. Read this once, then refer back when a call returns nothing or fails validation.
+
+#### A. JSON, not stringified JSON
+
+FastMCP / Pydantic enforce real JSON types. **Pass arrays as JSON arrays
+and objects as JSON objects — never as a string containing JSON.** This
+is the single most common mistake on weak models because they over-quote
+defensively.
+
+| Param                | ✅ Right                                       | ❌ Wrong (will fail or coerce poorly)                |
+| -------------------- | ----------------------------------------------- | ----------------------------------------------------- |
+| `exclude_roles`      | `["DTO","ENTITY","CONFIG","OTHER"]`              | `"[\"DTO\",\"ENTITY\",\"CONFIG\",\"OTHER\"]"`           |
+| `edge_types`         | `["EXTENDS","IMPLEMENTS"]`                       | `"EXTENDS,IMPLEMENTS"` or `"[EXTENDS,IMPLEMENTS]"`     |
+| `confirm`            | `true`                                          | `"true"`                                              |
+| `limit`              | `20`                                            | `"20"`                                                |
+| `min_confidence`     | `0.9`                                           | `"0.9"`                                               |
+| any optional you don't want | omit the key entirely                    | `null` is OK; empty string `""` is NOT (treated as a real filter that matches nothing) |
+| string enums (`role`, `framework`, `capability`, `kind`) | `"CONTROLLER"`            | `["CONTROLLER"]` (single value, not a list)            |
+
+**One-line rule:** if the schema says `list[str]`, send `["a","b"]`. If
+it says `str`, send `"a"`. Don't wrap arrays in extra quotes "to be
+safe."
+
+#### B. Method needles — FQN + signature, with simple type names
+
+`find_callers` / `find_callees` accept three needle shapes. The signed
+FQN form is the only one that's unambiguous on overloaded methods.
+
+**The FQN format is exactly:**
+
+```
+<package>.<Type>[.<NestedType>]#<methodName>(<SimpleType1>,<SimpleType2>,…)
+```
+
+Key rules:
+
+- **Simple type names only**, no package prefixes inside the parens:
+  `String`, not `java.lang.String`. `List`, not `java.util.List`.
+- **Generics are erased**: `List<String>` → `List`. `Map<String,Long>` → `Map`.
+- **Arrays / varargs**: not formally tested in the fixture; if your search
+  misses, try the simple base type without `[]` first, as in `main(String)` below.
+- **No spaces** between commas and types: `(String,String,String)`.
+- **No-arg method**: trailing `()`.
+- **Constructor**: methodName is `<init>`. Example:
+  `com.foo.Bar#<init>(String,int)`.
+- **Nested type**: dot-separated under the outer type, before the `#`:
+  `com.foo.Outer.Inner#method()`.
+
+**Examples (verbatim from `tests/bank-chat-system`):**
+
+```
+✅ com.bank.chat.assign.ChatAssignApplication#main(String)
+✅ com.bank.chat.assign.config.AssignProperties.ChatCore#setBaseUrl(String)
+✅ com.bank.chat.assign.integration.ChatCoreJoinClient#joinOperator(String,String,String)
+✅ com.bank.chat.assign.service.OperatorSessionService#openSession(String,List)
+✅ com.bank.chat.assign.ChatAssignApplication#<init>()
+```
+
+**The three needle shapes, ranked by precision:**
+
+1. **Method FQN with signature** — unambiguous, exact match. Use
+   whenever you have it.
+2. **Type FQN** (e.g. `com.foo.Bar`) — fans out to ALL declared
+   methods of that type via `DECLARES`. Useful for "who calls anything
+   on this class."
+3. **Simple method name** (e.g. `joinOperator`) — matches every method
+   of that name across the codebase. May return many rows; only use
+   when you don't know the type.
+
+**Overloaded methods — the failure you actually hit.** If a class has
+both `bar()` and `bar(String)` and you pass `Foo#bar()` expecting
+both, you'll only get the no-arg one. To resolve:
+
+- Don't know the signature? **Drop the parens** entirely and use just
+  the simple name (`bar`) — you'll get rows for every overload, then
+  pick the one(s) you want and re-query with full FQN+sig.
+- Or: pass the **type FQN** (`com.foo.Foo`), which fans out via
+  `DECLARES` and includes every overload of every method.
+- Or: call `codebase_search({"query":"Foo bar","auto_hybrid":true,"limit":5})`
+  to recover the exact stored FQN, then retry with that string.
+
+**How to find the FQN you need:**
+
+- From `codebase_search` results: each `CodeChunkHit` carries `fqn`
+  for the enclosing symbol — copy it verbatim.
+- From `list_by_role` / `list_by_annotation` / `find_implementors`:
+  each `SymbolDto` has an `fqn` field for the type. Then run
+  `find_callees({"fqn_or_signature":"<typeFqn>","depth":1})` to list
+  its methods with their signed FQNs.
+- Phantom rows (`?HashMap<>#<init>(0)`, `?RestTemplate#<init>(0)`) are
+  internal placeholders for unindexed external types. **Never pass
+  them as a needle** — they won't match anything.
+
+#### C. Path templates — the normalised servlet form
+
+`get_route_by_path` and `find_route_callers` expect `path_template` in
+the form the graph stores, NOT the raw `@RequestMapping` value:
+
+| Source code annotation               | What to pass            |
+| ------------------------------------ | ----------------------- |
+| `@GetMapping("/users/{id}")`         | `"/users/{id}"`         |
+| `@PostMapping("/users/{id}/avatar")` | `"/users/{id}/avatar"`  |
+| `@RequestMapping("/api")` + method `@GetMapping("/me")` | the **concatenated** template `"/api/me"` |
+| SpEL only: `@GetMapping("${app.endpoint}")` | empty string — use `list_routes` with `path_prefix` instead |
+
+If unsure, run `list_routes({"path_prefix":"/users"})` first and copy
+the `path` field from a result.
+
+### Decision tree — pick the first tool
+
+| User asks…                                                       | First tool                                          | Typical follow-up                              |
+| ---------------------------------------------------------------- | --------------------------------------------------- | ---------------------------------------------- |
+| "How does X work" / "where is concept Y" (natural language)      | `codebase_search`                                   | `find_callers` on the top hit's FQN            |
+| "What happens when <event> in <feature>" (end-to-end behaviour)  | `trace_flow`                                        | `find_callees` on stage-1 symbols              |
+| "Who calls method/class M"                                       | `find_callers` (FQN preferred)                      | Widen with `depth`, narrow with `microservice` |
+| "What does method M call"                                        | `find_callees`                                      | `graph_neighbors` for type wiring              |
+| "Show me the handler for HTTP path /foo/bar"                     | `get_route_by_path` then `find_route_handlers`      | `trace_request_flow`                           |
+| "List all HTTP endpoints / Kafka topics"                         | `list_routes` (filter by `framework`)               | `find_route_handlers` per id                   |
+| "Who calls route /foo/bar"                                       | `find_route_callers`                                | `trace_request_flow`                           |
+| "All controllers / services / repositories in service X"         | `list_by_role`                                      | `list_by_role` + `capability=` filter          |
+| "Everything annotated `@Transactional`"                          | `list_by_annotation`                                | `find_callers` per result                      |
+| "Everything that produces / listens to messages"                 | `list_by_capability` (`MESSAGE_PRODUCER` / `_LISTENER`) | `find_callees`                              |
+| "Who implements this interface"                                  | `find_implementors`                                 | `find_callers` on each impl                    |
+| "Who extends this class"                                         | `find_subclasses`                                   | `impact_analysis`                              |
+| "Where is X injected"                                            | `find_injectors`                                    | `find_callers`                                 |
+| "What breaks if I change this type"                              | `impact_analysis`                                   | `analyze_pr` if there's a diff                 |
+| "Review this PR / diff"                                          | `analyze_pr` (paste the unified diff)               | `find_route_callers` on touched routes         |
+| "Why is path X ignored / not indexed"                            | `diagnose_ignore`                                   | —                                              |
+| "Is the index healthy / what version / how big"                  | `graph_meta`                                        | `list_code_index_tables`                       |
+| "Rebuild the index" (slow, requires confirm)                     | `refresh_code_index`                                | `graph_meta` to verify                         |
+
+**Two rules of thumb:**
+
+1. **Graph beats vector for exact questions.** "Who calls `Foo#bar()`"
+   is a graph question — never use `codebase_search` for that.
+2. **Vector beats graph for fuzzy questions.** "How does authentication
+   work" should start with `codebase_search` (or `trace_flow`); the
+   graph alone won't surface the right entry point.
+
+### Tool reference — all 22 tools
+
+Grouped by purpose. Required arguments are **bold**; common mistakes are
+flagged with ⚠.
+
+#### Search (LanceDB)
+
+##### `codebase_search` — vector / hybrid search over Java / SQL / YAML chunks
+
+- **Args:** **`query`** (string, natural language or identifier).
+  Useful optionals: `table` (`java`|`sql`|`yaml`|`all`, default `java`),
+  `limit` (1-50, default 5), `role`, `exclude_roles`, `capability`,
+  `module`, `microservice`, `package_prefix`, `auto_hybrid` (set true
+  for identifier-ish queries like `DistributionChunkService`),
+  `graph_expand` (BFS through Kuzu after top-k), `context_neighbors`
+  (attach 1-2 adjacent chunks for context).
+- ⚠ For behavioural questions, set
+  `exclude_roles=["DTO","ENTITY","CONFIG","OTHER"]` — DTOs and entities
+  are noisy and rarely the answer.
+- ⚠ `hybrid=true` and `auto_hybrid=true` require a single `table` (not
+  `all`).
+- **Example:** `{"query":"how chat assigns an operator","exclude_roles":["DTO","ENTITY","CONFIG","OTHER"],"limit":8}`
+
+##### `list_code_index_tables` — index health summary
+
+- **Args:** none.
+- Returns LanceDB URI, embedding model, project root, refresh-allowed
+  flag, graph metadata (use `graph_meta` for just the graph side).
+
+#### Symbols (Kuzu graph — type wiring)
+
+##### `find_implementors` — classes implementing an interface
+
+- **Args:** **`name`** (interface simple name or FQN). Optionals:
+  `module`, `microservice`, `capability`, `limit`.
+- ⚠ Pass simple name (`PaymentService`) **or** FQN
+  (`com.acme.PaymentService`) — both work via the simple-name index.
+
+##### `find_subclasses` — classes / interfaces extending a given type
+
+- **Args:** **`name`**. Same optionals as `find_implementors`.
+
+##### `find_injectors` — types that inject (field/ctor/setter/Lombok) a given type
+
+- **Args:** **`name`** (the type **being** injected). Optional
+  `capability` filters the **consumer** (injecting class), not the
+  injected type.
+- Returns edges with `mechanism`, `annotation`, `field_or_param`.
+
+##### `graph_neighbors` — generic bidirectional neighbour expansion
+
+- **Args:** **`name`**, `depth` (1-3, default 1), `direction`
+  (`out`|`in`|`both`, default `both`), `edge_types` (subset of
+  `EXTENDS`, `IMPLEMENTS`, `INJECTS`).
+- Use this when none of the specialised tools fit (e.g. "find
+  everything one hop from `Foo` over implements + extends").
+
+##### `impact_analysis` — reverse closure over INJECTS+IMPLEMENTS+EXTENDS
+
+- **Args:** **`name`**, `depth` (1-4, default 2), `limit` (default 300).
+- Answers "who breaks if I change this type". Also returns
+  `cross_service_callers` for any route the impacted symbol exposes.
+
+#### Routes (inbound entry points)
+
+##### `list_routes` — list `Route` nodes (HTTP, Feign, Kafka, …)
+
+- **Args:** none required. Optionals: `microservice`, `framework`
+  (`spring_mvc`|`webflux`|`feign`|`kafka`|`rabbitmq`|`jms`|`stream`),
+  `path_prefix`, `method`, `limit`.
+- ⚠ Routes with empty `framework` are ones the extractor couldn't
+  classify — usually annotation-only Kafka topic constants. If you
+  expected an HTTP route here, check brownfield overrides.
+
+##### `find_route_handlers` — symbols that EXPOSES a Route id
+
+- **Args:** **`route_id`** (e.g. `r:0a2bdd…`).
+- ⚠ Feign **consumer** routes do NOT emit `EXPOSES` and return empty —
+  use `find_route_callers` instead.
+
+##### `get_route_by_path` — resolve one Route by (microservice, path, method)
+
+- **Args:** **`microservice`**, **`path_template`**, optional `method`.
+- ⚠ `path_template` must be the normalised servlet form: `{` `}` placeholders
+  are kept as `{}` (e.g. `/api/users/{}`). For SpEL-only routes
+  (`${kafka.topic}`) `path_template` is empty — use `list_routes` with
+  `path_prefix` instead.
+
+##### `find_route_callers` — who calls a Route (HTTP_CALLS / ASYNC_CALLS)
+
+- **Args:** either **`route_id`**, OR **`microservice`** +
+  **`path_template`** + optional `method`.
+- Use this for cross-service dependency questions.
+
+##### `trace_request_flow` — inbound + outbound around one entry route
+
+- **Args:** **`entry_route_id`**, optional `max_hops`.
+- Returns: callers (HTTP/ASYNC) → handler → outbound CALLS chain. Best
+  starting point for "what happens when this endpoint is hit".
+
+#### Calls (CALLS edges between methods)
+
+##### `find_callers` — inbound CALLS closure for a method or type
+
+- **Args:** **`fqn_or_signature`**. Three needle shapes (see *Argument shapes §B* for the full format spec):
+  - method FQN with sig (most precise): `com.foo.Bar#baz(String,int)` — simple type names only, no spaces, generics erased
+  - type FQN: `com.foo.Bar` (fans out to all methods via DECLARES)
+  - simple method name: `baz` (matches all overloads everywhere; useful as a recovery step)
+- Optionals: `depth` (1-5, default 1), `limit`, `min_confidence` (e.g.
+  `0.9` to drop low-confidence chained-receiver edges), `exclude_external`
+  (default true — drops JDK / Spring / Lombok callers), `module`,
+  `microservice`.
+- ⚠ For "who really calls this", set `min_confidence=0.9` and
+  `depth=1` first; widen if too narrow.
+
+##### `find_callees` — outbound CALLS closure
+
+- **Args / optionals:** same shape as `find_callers`.
+
+#### Roles & capabilities (multi-tag axes)
+
+##### `list_by_role` — graph symbols with a given role
+
+- **Args:** **`role`** (one of
+  `CONTROLLER|SERVICE|REPOSITORY|COMPONENT|CONFIG|ENTITY|CLIENT|MAPPER|DTO|OTHER`).
+  Optionals: `module`, `microservice`, `capability` (AND-filter), `limit`.
+- ⚠ Use `OTHER` to find things the inference missed — these are
+  brownfield candidates.
+
+##### `list_by_annotation` — symbols whose annotation list contains a simple name
+
+- **Args:** **`annotation`** (simple name, e.g. `Transactional`,
+  `Async`). Optionals: `module`, `microservice`, `capability`, `limit`.
+- ⚠ Pass the **simple** name without `@`.
+
+##### `list_by_capability` — symbols carrying a capability
+
+- **Args:** **`capability`** (one of
+  `MESSAGE_LISTENER|MESSAGE_PRODUCER|HTTP_CLIENT|SCHEDULED_TASK|EXCEPTION_HANDLER`).
+  Optionals: `module`, `microservice`, `limit`.
+
+#### Behavioural / cross-cutting
+
+##### `trace_flow` — end-to-end behavioural trace from a natural-language query
+
+- **Args:** **`query`**. Optionals: `microservice`, `module`,
+  `seed_limit` (default ~5), `stage_limit` (default ~8), `depth`
+  (hops-per-stage), `follow_calls` (default true).
+- Picks seeds via vector search restricted to behavioural roles
+  (CONTROLLER / COMPONENT / SERVICE / CLIENT + MESSAGE_LISTENER /
+  SCHEDULED_TASK), then walks the graph in 3 role-ordered stages
+  (entrypoints → services → integrations). Each result row carries
+  `via: [{edge_type, from_fqn, hop}]` so you know **why** it's there.
+- Use this for "what happens when X" questions instead of chaining 4
+  separate tools.
+
+##### `analyze_pr` — map a unified diff to indexed symbols + risk score
+
+- **Args:** **`diff_unified`** (string, full `git diff` output).
+- Returns: `changed_symbols`, `blast_radius_total`,
+  `cross_service_callers`, `routes_touched`, `risk_score` (0-1),
+  `risk_band`, `notes`. Binary hunks and renames are surfaced in
+  `notes` and skipped for symbol mapping.
+
+#### Index management & diagnostics
+
+##### `graph_meta` — Kuzu metadata: counts, ontology version, build timestamp
+
+- **Args:** none. First tool to run on a fresh index — confirms
+  `ontology_version=9` and surfaces build counts.
+
+##### `diagnose_ignore` — explain why a path is ignored
+
+- **Args:** **`path`** (relative to project root or absolute inside
+  project). Returns the layer that decided
+  (`builtin_default`|`project_root`|`nested`|`gitignore`).
+
+##### `refresh_code_index` — rebuild LanceDB chunks + Kuzu graph (slow)
+
+- **Args:** **`confirm`** (must be `true`). Requires
+  `LANCEDB_MCP_ALLOW_REFRESH=1`.
+- ⚠ Always call `graph_meta` after to verify the rebuild succeeded.
+
+### Ontology glossary (version 9)
+
+Source of truth: `java_ontology.py`. Pass these strings verbatim
+(case-sensitive).
+
+#### Roles (`role` column on type-level Symbol nodes)
+
+`CONTROLLER`, `SERVICE`, `REPOSITORY`, `COMPONENT`, `CONFIG`, `ENTITY`,
+`CLIENT`, `MAPPER`, `DTO`, `OTHER`.
+
+- `CLIENT` covers Feign clients (`@FeignClient`) and brownfield
+  `@CodebaseRole(CLIENT)`. As of ontology 9, plain `RestTemplate`
+  wrappers stay in their natural stereotype role (typically `SERVICE`)
+  unless you explicitly tag them.
+- `OTHER` = the inference didn't recognise the type. Treat as a
+  brownfield candidate.
+
+#### Capabilities (multi-tag, may be empty)
+
+`MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`,
+`EXCEPTION_HANDLER`.
+
+- Capabilities are independent of role — a `@Service` can carry
+  `MESSAGE_PRODUCER` + `MESSAGE_LISTENER` simultaneously.
+- `HTTP_CLIENT` fires for `@FeignClient` types and brownfield
+  `@CodebaseCapability(HTTP_CLIENT)`. RestTemplate-only wrappers do not
+  auto-promote.
+- Capabilities are derived at the **type level**: method-level evidence
+  is aggregated up to the enclosing type.
+
+#### Route framework (on `Route` nodes)
+
+`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`, `jms`, `stream`.
+
+#### Route kind
+
+`http_endpoint`, `http_consumer`, `kafka_topic`, `rabbit_queue`,
+`jms_destination`, `stream_binding`.
+
+- `feign` framework with `http_consumer` kind = a Feign declaration that
+  registers an outbound contract; it does NOT expose an inbound handler
+  and won't appear in `find_route_handlers`.
+
+#### Client kind (on `HTTP_CALLS` / `ASYNC_CALLS` edges)
+
+`feign_method`, `rest_template`, `web_client`, `kafka_send`,
+`stream_bridge_send`.
+
+#### Call match (resolution outcome on cross-service edges)
+
+`cross_service`, `intra_service`, `ambiguous`, `phantom`, `unresolved`.
+
+- `phantom` = the called type is referenced by name but has no Symbol
+  row (external library or unindexed code). Common and not always a
+  bug.
+- `cross_service` = caller and callee are in different microservices
+  and the resolver had enough information to bind them. The goal is to
+  maximise this for legitimate inter-service calls.
+
+### Recovery playbook — when results look wrong
+
+| Symptom                                                                  | Likely cause                                                                                             | Fix                                                                                                   |
+| ------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- |
+| `find_callers`/`find_callees` returns 0 rows                             | Needle doesn't match a stored FQN exactly; the precise shape is `com.foo.Bar#baz(String,int)` (see *Argument shapes §B*) | Run `codebase_search` with the simple name to recover the FQN, then retry                             |
+| `find_callers`/`find_callees` returns fewer rows than expected on an overloaded method | Needle was `Foo#bar()` but the overload you wanted is `Foo#bar(String)`; the resolver only matched the no-arg one | Drop the parens (`bar`) to list all overloads, then re-query with the full FQN+sig of the right one. Or pass the type FQN to fan out via DECLARES. See *Argument shapes §B*. |
+| Tool returns a validation / type error mentioning a list field           | Stringified JSON: `"[\"DTO\"]"` instead of `["DTO"]`                                                       | Pass real JSON arrays. See *Argument shapes §A* table.                                                |
+| `path_template` filter returns nothing                                   | Passed the raw annotation value, but the graph stores the concatenated servlet form                     | Run `list_routes({"path_prefix":"/your/prefix"})` and copy the exact `path` field, then retry         |
+| Tool says "graph unavailable"                                            | Index not built or `LANCEDB_MCP_PROJECT_ROOT` not set                                                    | Run `graph_meta` to confirm; `refresh_code_index({"confirm":true})` if needed                         |
+| Expected route is missing from `list_routes`                             | Framework not recognised by built-in extractor                                                           | Add `@CodebaseRoute(framework=…, kind=…, path=…, method=…)` per README §3b, then `refresh_code_index` |
+| `list_by_role` shows a `*Controller` class as `OTHER`                    | Non-Spring web stack (JAX-RS, custom)                                                                    | Add `@CodebaseRole(CodebaseRoleKind.CONTROLLER)` per README §3a, or `role_overrides.fqn` in YAML      |
+| `cross_service_calls_total = 0` but you know there are inter-service calls | Resolution mode is `brownfield_only` and call sites have no brownfield tag, OR target services unindexed | Switch to `cross_service_resolution: auto` in YAML, or tag with `@CodebaseClient`                     |
+| `codebase_search` returns DTOs / config classes instead of behaviour     | Default ranking; no role filter                                                                          | Add `exclude_roles=["DTO","ENTITY","CONFIG","OTHER"]`                                                 |
+| Identifier search returns junk                                           | Pure vector lookup is fuzzy on identifiers                                                               | Set `auto_hybrid=true` (FTS + vector RRF)                                                             |
+| Same query returns different results across runs                         | None — graph build is deterministic                                                                      | If you actually see this, file a bug with `graph_meta` `built_at` from both runs                     |
+
+If two consecutive recovery attempts on the same intent fail, **stop
+and report** the failure to the user with the tool name, the args you
+tried, and what you got back. Do not loop further.
+
+### Slash-style aliases (prompt templates, not real commands)
+
+Paste these into your prompt to nudge a weak model. They are just
+shorthand for the right tool + args.
+
+- `/who-calls <fqn-with-sig>` → `find_callers({"fqn_or_signature":"<fqn>","depth":1,"min_confidence":0.9})`. **Pass the full signed FQN** (e.g. `com.foo.Bar#baz(String,int)`) — see *Argument shapes §B* for format. If you only have the simple name, query that first and re-issue with the exact FQN.
+- `/calls-from <fqn-with-sig>` → `find_callees({"fqn_or_signature":"<fqn>","depth":1})`. The same FQN-with-signature rule applies: a simple name matches all overloads but does not let you target one.
+- `/route <method> <path> [microservice]` → `list_routes({"path_prefix":"<path>","method":"<method>","microservice":"<ms>"})`
+- `/handler <route_id>` → `find_route_handlers({"route_id":"<route_id>"})`
+- `/who-hits <microservice> <path>` → `find_route_callers({"microservice":"<ms>","path_template":"<path>"})`
+- `/why-no-route <fqn>` → 1) `list_by_role({"role":"OTHER"})` to confirm the type wasn't classified, 2) `list_by_annotation` for any custom annotation, 3) suggest brownfield `@CodebaseRoute`
+- `/role-of <name>` → `find_implementors({"name":"<name>"})` if it's an interface; `list_by_role({"role":"…"})` to scan
+- `/impact <fqn>` → `impact_analysis({"name":"<fqn>","depth":2})`
+- `/cross-service <fqn>` → 1) `impact_analysis`, 2) inspect `cross_service_callers`, 3) `find_route_callers` per route
+- `/flow <natural language>` → `trace_flow({"query":"<nl>","seed_limit":5,"stage_limit":8})`
+- `/diff-risk <unified diff>` → `analyze_pr({"diff_unified":"<diff>"})`
+- `/health` → `graph_meta()` then `list_code_index_tables()`
+
+### Canonical workflow for "explain feature X"
+
+1. `trace_flow({"query":"<X>","seed_limit":5})` — get the role-ordered chain.
+2. For each stage symbol whose hop is interesting: `find_callees` (depth 1) to fan out, `find_callers` (depth 1) to fan in.
+3. If a `Route` shows up in stage 0: `trace_request_flow({"entry_route_id":"<id>"})` for the full inbound + outbound picture.
+4. If anything looks wrong, run **Recovery playbook** before re-querying.
+
+<!-- END user-rag MCP guide -->
+
+---
+
+## Maintenance notes (for the repo, not the agent)
+
+- Bump the **ontology version** sentence just above the BEGIN marker (and the
+  glossary heading) whenever `ONTOLOGY_VERSION` changes in `kuzu_queries.py`.
+- When a new MCP tool is added in `server.py`, add it to (a) the
+  decision tree, (b) the tool reference, (c) a slash alias if the use
+  case is common.
+- The forced-reasoning preamble adds ~30 tokens per tool call. That's
+  an intentional cost for substantially better tool selection on weak
+  models. Remove it if you're driving with Opus / GPT-5 / Sonnet 4.6
+  and don't need the scaffolding.
+- For the per-tool `Skills/` split (one file per tool / per workflow),
+  see the follow-up plan once usage patterns shake out from real
+  enterprise project use.
diff --git a/docs/MANUAL-VERIFICATION-CHECKLIST.md b/docs/MANUAL-VERIFICATION-CHECKLIST.md
new file mode 100644
index 0000000..7087f32
--- /dev/null
+++ b/docs/MANUAL-VERIFICATION-CHECKLIST.md
@@ -0,0 +1,636 @@
+# Manual Verification Checklist — `java-enterprise-codebase-rag`
+
+Use this **after** you've read `README.md` + `CODEBASE_REQUIREMENTS.md`,
+applied any brownfield annotations, and built the index against your
+real project. The checklist drives an MCP-aware agent (Qwen Code,
+Claude Code, Cursor, …) through 7 phases of progressively deeper
+verification.
+
+Each item has:
+
+- ☐ a checkbox
+- a **Verification prompt** — paste verbatim into your agent
+- **Expected (calibration)** — what the same prompt produces on
+  `tests/bank-chat-system` (the in-repo fixture, ontology v9). If your
+  numbers diverge wildly from the calibration column, that's a signal,
+  not a verdict: your project is simply bigger or smaller. What matters
+  is the **shape** (proportions, error rates, presence of expected
+  edges).
+- **If failing → fix** — concrete next step
+
+Calibration was captured against `tests/bank-chat-system` on
+`master @ d62b48c` (post PR-H1, ontology version 9): 84 files, 92
+types, 474 members, 0 parse errors, 17 routes, 793 calls, 2 HTTP_CALLS,
+5 ASYNC_CALLS, microservices = `chat-core` + `chat-assign`.
+
+---
+
+## Pre-flight — build the graph and prepare the agent
+
+Run **once** before working through the phases:
+
+```bash
+# 1. Build the graph against your project (verbose, deterministic)
+rm -rf /tmp/verify_kuzu
+python build_ast_graph.py \
+  --source-root /path/to/your/project \
+  --kuzu-path /tmp/verify_kuzu --verbose 2>&1 | tee /tmp/verify_build.log
+
+# 2. Read the summary lines (last ~10 lines of the log)
+tail -12 /tmp/verify_build.log
+
+# 3. Point the MCP server at the new graph + run it from the agent of choice
+export LANCEDB_MCP_PROJECT_ROOT=/path/to/your/project
+export LANCEDB_MCP_KUZU_PATH=/tmp/verify_kuzu
+# … then start your MCP client (Qwen Code / Claude Code) so it sees this MCP
+```
+
+> **Quick read of the build log.** `[pass3]` reports call-resolution
+> health. `[pass4]` is route extraction (`routes_resolved_pct` is the
+> headline number: a pure Spring MVC service typically hits 95-100;
+> Kafka-heavy services drop to 70-90 because some topics are `${…}`
+> property placeholders). `[pass6]` shows cross-service match results.
+
+**Calibration on `tests/bank-chat-system`:**
+
+```
+[pass1] parsed 84 files in 0.24s: 92 types, 474 members, 0 parse errors, 0 skipped
+[pass2] emitted 10 EXTENDS, 14 IMPLEMENTS, 71 INJECTS, 8 phantoms in 0.00s
+[pass3] Call resolution: 800 sites, 77 chained phantoms (9.6%), 294 unresolved callee (36.8%), 138 phantom receiver (17.2%), …
+[pass4] Route extraction: emitted=11, exposes=11, skipped_unresolved=0, routes_resolved_pct=81.8, by_framework={'spring_mvc': 9, 'kafka': 2}
+[pass5] HTTP_CALLS: 2 edges, ASYNC_CALLS: 5 edges
+[pass6] http_match={'phantom': 2}, async_match={'intra_service': 1, 'phantom': 4}, cross_service_calls_total=0
+```
+
+> Note: `cross_service_calls_total=0` on the fixture is **expected** —
+> the fixture is intra-service-heavy. On a real multi-service project
+> this should be > 0 (otherwise see Phase 5).
+
+---
+
+## Phase 1 — Index health (4 items)
+
+### 1.1 ☐ Ontology version is 9
+
+**Verification prompt:**
+
+> Call `graph_meta()`. Report `ontology_version`, `built_at`,
+> `source_root`, and `parse_errors`. Does `ontology_version` equal `9`?
+
+**Expected (calibration):** `ontology_version: 9`,
+`source_root: /home/user/workspace/user-rag/tests/bank-chat-system`,
+`parse_errors: 0`.
+
+**If failing → fix:** an older ontology means you're running a stale
+wheel or an old graph file. Re-pull the repo, confirm the commit with
+`git rev-parse HEAD`, then rebuild from scratch with
+`rm -rf /tmp/verify_kuzu && python build_ast_graph.py …`.
+
+### 1.2 ☐ Parse error rate is acceptable
+
+**Verification prompt:**
+
+> Call `graph_meta()`. Look at `counts.files` and `parse_errors`. Compute
+> `parse_errors / files * 100`. If above 1%, name the most likely
+> culprit by inspecting the build log (`/tmp/verify_build.log`) for
+> `[parse-error]` lines.
+
+**Expected (calibration):** `0 / 84 = 0%`.
+
+**If failing → fix:** > 5% means tree-sitter is choking, usually on
+non-UTF-8 files or generated sources you forgot to ignore. Add them to
+`.gitignore` or to the project's `lancedb_mcp_ignore`, then run
+`diagnose_ignore({"path":"src/main/generated"})` to confirm the rule
+took effect.
+
+### 1.3 ☐ Symbol counts match the project's rough scale
+
+**Verification prompt:**
+
+> Call `graph_meta()`. Report `counts.types`, `counts.members`,
+> `counts.injects`. For a back-of-envelope sanity check, run
+> `find src -name '*.java' | wc -l` outside the agent to count source
+> files and compare: types should be ~1 per non-trivial file.
+
+**Expected (calibration):** 92 types from 84 files (= 1.10 types/file —
+nested classes account for the slight overshoot), 474 members, 71
+injects.
+
+**If failing → fix:** types ≪ files usually means tree-sitter parser
+errors swallowed type declarations. Cross-check Phase 1.2.
+
+### 1.4 ☐ LanceDB tables exist and are readable
+
+**Verification prompt:**
+
+> Call `list_code_index_tables()`. Report `lancedb_uri`,
+> `embedding_model`, the list of tables, and `refresh_enabled`. Then
+> run `codebase_search({"query":"main","table":"java","limit":1})`.
+> Did it return at least 1 hit?
+
+**Expected (calibration):** tables include `java`, `sql`, `yaml`; the
+search returns ≥1 chunk.
+
+**If failing → fix:** missing tables → run
+`refresh_code_index({"confirm":true})` (slow, requires
+`LANCEDB_MCP_ALLOW_REFRESH=1`). Empty results from `codebase_search` →
+the embedding model didn't load; check `SBERT_MODEL` env and disk
+space.
+
+### Red flags for Phase 1
+
+- `parse_errors / files > 5%` → ignore rules wrong
+- `routes = 0` and you have controllers → see Phase 3
+- `injects = 0` and you have any DI → built-in inference broken,
+  rebuild
+
+---
+
+## Phase 2 — Roles & capabilities (5 items)
+
+### 2.1 ☐ Controllers are recognised
+
+**Verification prompt:**
+
+> Call `list_by_role({"role":"CONTROLLER","limit":200})`. Then call
+> `codebase_search({"query":"controller","table":"java","limit":50,
+> "exclude_roles":["CONTROLLER"]})`. From the second result list,
+> identify any class whose simple name ends in `Controller` /
+> `Resource` / `Endpoint`. Report each as a candidate brownfield
+> override.
+
+**Expected (calibration):** 5 CONTROLLERs (`ChatIngressController`,
+`JoinOperatorController`, `DevAssignmentController`,
+`ChatManagementController`, `OperatorManagementController`). Zero
+`*Controller` classes appear in the second list.
+
+**If failing → fix:** for each candidate not classified, add either
+`@CodebaseRole(CodebaseRoleKind.CONTROLLER)` (README §3a) or a
+`role_overrides.fqn` entry in `.lancedb-mcp.yml`. Rebuild.
+
+### 2.2 ☐ Services and repositories are recognised
+
+**Verification prompt:**
+
+> Call `list_by_role({"role":"SERVICE","limit":200})` and
+> `list_by_role({"role":"REPOSITORY","limit":200})`. Spot-check 3
+> service results: read each via `codebase_search` to confirm they
+> contain business logic (not DTOs). Then call
+> `list_by_role({"role":"OTHER","limit":100})` and report any class
+> whose simple name ends in `Service`, `Repository`, `Dao`, or `Repo`.
+
+**Expected (calibration):** 7 SERVICEs (incl. `ChatManagementService`,
+`DistributionChunkService`, `OperatorSessionService`); REPOSITORYs
+exist in real Spring projects but the fixture has 0 due to in-memory
+stubs. No `*Service` / `*Repository` classes in OTHER.
+
+**If failing → fix:** brownfield override per 2.1.
+
+### 2.3 ☐ Feign clients carry CLIENT + HTTP_CLIENT
+
+**Verification prompt:**
+
+> Call `list_by_role({"role":"CLIENT","capability":"HTTP_CLIENT","limit":50})`.
+> Then call `list_by_annotation({"annotation":"FeignClient","limit":50})`.
+> Every `@FeignClient`-annotated type should appear in the first list.
+> Report any divergence.
+
+**Expected (calibration):** the fixture has Feign-style call sites but
+0 `@FeignClient` classes (it uses RestTemplate); on real projects,
+counts should match exactly.
+
+**If failing → fix:** as of ontology 9 (PR-H1), `@FeignClient` →
+`role=CLIENT` + `capability=HTTP_CLIENT`. If you see drift, run
+`graph_meta` and confirm `ontology_version=9`. If it is 9 and the
+lists still diverge, re-index; the graph may be stale.
+
+### 2.4 ☐ Message listeners and producers are detected
+
+**Verification prompt:**
+
+> Call `list_by_capability({"capability":"MESSAGE_LISTENER","limit":50})`
+> and `list_by_capability({"capability":"MESSAGE_PRODUCER","limit":50})`.
+> Then `list_by_annotation({"annotation":"KafkaListener","limit":50})`
+> and confirm all results from the annotation query also appear in the
+> capability query. Repeat for `RabbitListener`, `JmsListener`, and
+> `EventListener` if your project uses them.
+
+**Expected (calibration):** 2 listeners (`DistributionTriggerListener`,
+`ChatKafkaListener`) and 2 producers (`DistributionTriggerPublisher`,
+`FollowUpKafkaPublisher`).
+
+**If failing → fix:** custom listener annotations → meta-annotation
+walk should pick them up automatically (Layer A). If not, add to
+`role_overrides.annotations` in `.lancedb-mcp.yml` (README §"Brownfield
+overrides").
+
+### 2.5 ☐ OTHER role is small relative to type count
+
+**Verification prompt:**
+
+> Call `list_by_role({"role":"OTHER","limit":500})` and report the
+> count. Compute `OTHER / total_types` from `graph_meta().counts.types`.
+> What fraction of OTHER are obviously utility classes (exceptions,
+> records, internal helpers) vs candidates the inference should have
+> handled?
+
+**Expected (calibration):** 43 OTHER out of 92 types (47%) — fixture
+has many record DTOs and helper classes. On a real project this should
+be < 30% if you're well-annotated; 30-50% suggests you need a few
+brownfield overrides.
+
+**If failing → fix:** > 60% OTHER almost always means a non-Spring
+stack the inference doesn't know — add `role_overrides.annotations` for
+your custom stereotypes.
+
+### Red flags for Phase 2
+
+- `*Controller` classes in `OTHER` → JAX-RS or custom web framework
+  not annotated
+- Feign clients without `HTTP_CLIENT` capability → ontology drift,
+  rebuild
+- `MESSAGE_LISTENER` count = 0 in a Kafka-heavy project → meta-walk
+  failed to find your annotation
+
+---
+
+## Phase 3 — Routes (4 items)
+
+### 3.1 ☐ Route count and framework distribution
+
+**Verification prompt:**
+
+> Call `graph_meta()` and report `routes_total`, `routes_by_framework`,
+> `routes_resolved_pct`, `routes_from_brownfield_pct`. Then
+> `list_routes({"limit":500})` to see them. Does the framework mix
+> match what you'd expect (e.g. mostly `spring_mvc` for an HTTP
+> service)?
+
+**Expected (calibration):** `routes_total=17`,
+`routes_by_framework={spring_mvc: 9, kafka: 2}` (the remaining 6 are
+extracted but unframework'd Kafka topic constants),
+`routes_resolved_pct=81.8`, `routes_from_brownfield_pct=0.0`.
+
+**If failing → fix:** `routes_resolved_pct < 60` on a Spring project
+means many `@RequestMapping` paths are `${…}` placeholders (acceptable)
+or your handler types weren't classified as CONTROLLER (Phase 2.1).
+
+### 3.2 ☐ Every controller exposes ≥1 route
+
+**Verification prompt:**
+
+> Call `list_by_role({"role":"CONTROLLER","limit":200})`. For each
+> result FQN, call `find_callees({"fqn_or_signature":"<fqn>","depth":1,
+> "limit":5})` to confirm it has methods. Then call
+> `list_routes({"limit":500})` and verify each controller appears
+> at least once in the routes' handler set (run
+> `find_route_handlers` on a sample of route ids).
+
+**Expected (calibration):** all 5 controllers in the fixture expose
+at least one HTTP route (9 routes total / 5 controllers).
+
+**If failing → fix:** if a controller has no route, the framework
+isn't recognised on its methods. Add `@CodebaseRoute` per README §3b.
+
+### 3.3 ☐ HTTP routes have non-empty path AND method
+
+**Verification prompt:**
+
+> Call `list_routes({"framework":"spring_mvc","limit":200})`. Report
+> any route where `path` is empty or `method` is empty. (Empty `path`
+> with `framework=spring_mvc` usually means `@RequestMapping` with no
+> path — programmatic routing — which is rare and worth investigating.)
+
+**Expected (calibration):** all 9 spring_mvc routes have non-empty
+`path` and `method`.
+
+**If failing → fix:** unresolvable SpEL paths are normal in some
+`@RequestMapping` forms — accept them. But if a route has
+`framework=spring_mvc` and no path, it's likely a route you should
+override with `@CodebaseRoute`.
+
+### 3.4 ☐ Kafka topics are correct (topics, brokers, kinds)
+
+**Verification prompt:**
+
+> Call `list_routes({"framework":"kafka","limit":200})`. For each
+> result, confirm: `kind=kafka_topic` and `topic` is non-empty.
+> Cross-reference against your project's `application.yml` /
+> `application.properties` Kafka topic names.
+
+**Expected (calibration):** 2 kafka routes
+(`ChatTopics.INCOMING`, `${assign.kafka.distribution-topic}`). The
+6 unframework'd Kafka rows in 3.1 are SpEL constants the extractor
+couldn't resolve — they show up but with empty framework.
+
+**If failing → fix:** for unresolved topics whose literal name you DO
+know, use a brownfield route override:
+`@CodebaseRoute(framework=kafka, kind=kafka_topic, topic="my.topic")`
+on the listener method (README §3b).
+
+### Red flags for Phase 3
+
+- `routes_total = 0` → no controllers were classified or framework not
+  recognised
+- HTTP routes with empty `method` → annotation extractor didn't see
+  `@GetMapping` / `@PostMapping`
+- `routes_from_brownfield_pct` jumped after a refactor → you broke a
+  built-in extraction; check that ontology version is still 9
+
+---
+
+## Phase 4 — Call graph (3 items)
+
+### 4.1 ☐ Pick a known method, verify `find_callers` matches IDE
+
+**Verification prompt:**
+
+> Pick one method in your project that you know has 3-5 callers (for
+> example a service method called by 1-2 controllers and 1-2 other
+> services). State its FQN+signature.
+> Call `find_callers({"fqn_or_signature":"<fqn>#<method>(<args>)","depth":1,"min_confidence":0.9,"limit":50})`.
+> Open your IDE, run "Find Usages" on the same method, and compare:
+> for each IDE caller, does it appear in the MCP result? List
+> mismatches.
+
+**Expected (calibration):** any service method like
+`com.bank.chat.assign.service.DistributionService#assignNext()` should
+have 1-3 callers. Whether your IDE matches MCP exactly depends on:
+reflection (won't show in MCP), generated code (depends on indexing
+config), and JDK external code (filtered by `exclude_external`).
+
+**If failing → fix:** if MCP misses callers your IDE finds, lower
+`min_confidence` to `0.0` and retry. If they're still missing, the call
+site was resolved as `phantom`; check whether generic or
+reflection-heavy code dominates that call path.
+
+### 4.2 ☐ End-to-end chain reproduces via `find_callees`
+
+**Verification prompt:**
+
+> Pick one HTTP entry point. Call `list_routes({"framework":"spring_mvc","limit":1})`,
+> grab the route id, then `find_route_handlers({"route_id":"<id>"})`
+> to get the handler FQN. Then `find_callees` on the handler with
+> `depth=2`. Does the chain reach a service method (depth 1) and then
+> a repository / external call (depth 2)?
+
+**Expected (calibration):** `JoinOperatorController#joinOperator` →
+`ChatOrchestrationService#…` → repository / Kafka publisher.
+
+**If failing → fix:** if depth-2 returns nothing, your service classes
+might be classified as OTHER (back to Phase 2.2). Or `min_confidence`
+is filtering legit edges — try `min_confidence=0.0`.
+
+### 4.3 ☐ Phantom rate is acceptable
+
+**Verification prompt:**
+
+> Look at the `[pass3]` line in `/tmp/verify_build.log`. Report
+> `chained_phantoms %`, `unresolved_callee %`, `phantom_receiver %`.
+
+**Expected (calibration):** chained phantoms 9.6%, unresolved callee
+36.8%, phantom receiver 17.2%. The fixture has many cross-service
+references that legitimately resolve to phantoms (other-service types
+that aren't in the same indexing root).
+
+**If failing → fix:** > 50% unresolved on a single-service index
+likely means you didn't include the project's library jars or generated
+sources path. > 30% chained phantoms can mean overly fluent APIs the
+resolver can't follow; this is usually an acceptable known limitation.
+
+### Red flags for Phase 4
+
+- `find_callers` returns 0 with `min_confidence=0.0` → wrong needle
+  shape (use FQN+sig, not simple name)
+- depth-2 closure returns nothing on a real chain → roles wrong (Phase
+  2)
+
+---
+
+## Phase 5 — Cross-service edges (3 items)
+
+### 5.1 ☐ HTTP_CALLS edges exist and resolve correctly
+
+**Verification prompt:**
+
+> Call `graph_meta()` and report `http_calls_total`,
+> `http_calls_by_strategy`, `http_calls_match_breakdown`. Then pick
+> a known cross-service HTTP call site (e.g. a Feign interface method
+> on service A whose target is service B). Call
+> `find_route_callers({"microservice":"<B>","path_template":"<path>"})`
+> and confirm A appears as a caller with `match=cross_service`.
+
+**Expected (calibration):** `http_calls_total=2`,
+`http_calls_match_breakdown={phantom: 2}` (no cross-service in the
+fixture). On a real multi-service project, expect `cross_service > 0`.
+
+**If failing → fix:** if you expected `cross_service` but got
+`phantom`, the target service isn't in the same indexing root, OR the
+`@FeignClient` URL doesn't resolve to a known service. Tag with
+`@CodebaseClient(clientKind="feign_method", targetService="<name>",
+path="…")` (README §3c).
+
+### 5.2 ☐ ASYNC_CALLS edges connect producer → topic → listener
+
+**Verification prompt:**
+
+> Call `graph_meta()` and report `async_calls_total`,
+> `async_calls_by_strategy`, `async_calls_match_breakdown`. Pick a
+> known Kafka topic. Call
+> `find_route_callers({"microservice":"<consumer-service>","path_template":""})`
+> with the route id of the consumer route. Confirm the producer
+> appears as a caller.
+
+**Expected (calibration):** `async_calls_total=5`,
+`async_calls_match_breakdown={intra_service: 1, phantom: 4}`. On real
+projects with multi-service Kafka, expect `cross_service` matches.
+
+**If failing → fix:** mostly `phantom` on real cross-service async
+calls means the consumer side doesn't have a `Route` node for the
+topic. Either the listener isn't classified (Phase 2.4) or the topic
+literal couldn't be resolved (Phase 3.4).
+
+### 5.3 ☐ `cross_service_resolution` flag flips behaviour as expected
+
+**Verification prompt:**
+
+> Pick one cross-service call site that resolved to `cross_service` in
+> the default `auto` mode. Edit `.lancedb-mcp.yml` to add
+> `cross_service_resolution: brownfield_only`. Rebuild
+> (`refresh_code_index({"confirm":true})`) and re-run the same
+> `find_route_callers` query. The edge that was previously
+> `cross_service` should now be `unresolved` (unless your call site is
+> brownfield-tagged). Confirm.
+
+**Expected (calibration):** N/A — fixture has 0 cross-service edges.
+Use this on your real project as a smoke test of the flag.
+
+**If failing → fix:** the flag has no effect → you didn't actually
+rebuild after editing the YAML. `graph_meta().built_at` should be a
+fresh timestamp.
+
+### Red flags for Phase 5
+
+- `cross_service_calls_total = 0` on a multi-service project →
+  resolver couldn't bind any caller to its target. Check that all
+  services are under one indexing root, and check microservice
+  detection (top-level dirs under `LANCEDB_MCP_PROJECT_ROOT`).
+
+---
+
+## Phase 6 — Semantic search (2 items)
+
+### 6.1 ☐ Concept query returns relevant chunks
+
+**Verification prompt:**
+
+> Pick a behavioural concept that exists in your code (e.g.
+> "operator assignment", "session lifecycle", "retry on Kafka send").
+> Call `codebase_search({"query":"<concept>","limit":8,
+> "exclude_roles":["DTO","ENTITY","CONFIG","OTHER"],
+> "context_neighbors":1})`. The top 3 hits should be in files you'd
+> naturally point at for that concept.
+
+**Expected (calibration):** `query="how chat assigns on operator"` →
+top hits include `DistributionService`, `OperatorSessionService`,
+`JoinOperatorController` (the assignment chain).
+
+**If failing → fix:** top hits are DTOs / configs → you forgot
+`exclude_roles`. Top hits are unrelated → embeddings are off (check
+`SBERT_MODEL` and that `refresh_code_index` actually ran on the
+current code).
+
+### 6.2 ☐ Identifier query benefits from `auto_hybrid`
+
+**Verification prompt:**
+
+> Pick a class your project defines (e.g. `DistributionChunkService`).
+> Run `codebase_search` for its name twice: with `auto_hybrid=false`
+> (default) and with `auto_hybrid=true`. Report the top 3 hits from each.
+
+**Expected (calibration):** without `auto_hybrid`, the exact-name file
+is still returned but ranked lower; with `auto_hybrid=true`, the
+FTS+vector RRF pushes it to position 1.
+
+**If failing → fix:** `auto_hybrid` has no effect → you are probably
+querying `table=all` (it requires a single table). Stick to `table=java`.
+
+### Red flags for Phase 6
+
+- Chunk count from `codebase_search` is 0 on a known-good query →
+  LanceDB tables empty or wrong embedding model
+- `graph_expand=true` returns more results than `=false` but they're
+  noise → expand depth too aggressive, set to 1
+
+---
+
+## Phase 7 — Brownfield overrides actually applied (3 items)
+
+> Run this phase **only after** you've explicitly added at least one
+> brownfield annotation (or YAML override) to a real type in your
+> project. Otherwise skip — there's nothing to verify.
+
+### 7.1 ☐ `@CodebaseRole` on a class flips the role
+
+**Verification prompt:**
+
+> Pick one class where you added `@CodebaseRole(CodebaseRoleKind.X)`.
+> State the FQN and the X you set. Call
+> `list_by_role({"role":"X","limit":500})` and confirm the FQN appears
+> in the results. Then call `find_implementors({"name":"<simple-name>"})`
+> (or `codebase_search` if it's a concrete class) to confirm the
+> annotation was picked up.
+
+**Expected (calibration):** N/A — fixture has no brownfield class
+annotations applied. After you add one and rebuild, this verification
+should pass.
+
+**If failing → fix:** the class doesn't appear → either the
+annotation wasn't matched by simple name (typo? wrong package?), or
+the graph wasn't rebuilt. `graph_meta` will show
+`routes_from_brownfield_pct > 0` once any brownfield override is active.
+
+### 7.2 ☐ `@CodebaseRoute` on a method registers a route
+
+**Verification prompt:**
+
+> Pick one method where you added
+> `@CodebaseRoute(framework=…, kind=…, path="…", method="…")`. State
+> the path/method. Call
+> `get_route_by_path({"microservice":"<your-service>","path_template":"<path>","method":"<method>"})`.
+> The route should resolve. Then `find_route_handlers({"route_id":"<id>"})`
+> — your method's enclosing type should appear.
+
+**Expected (calibration):** N/A — fixture has 0 brownfield routes.
+After you add one, `graph_meta().routes_from_brownfield_pct > 0`.
+
+**If failing → fix:** route doesn't resolve → check that the
+`@CodebaseRoute` annotation has the **correct enum values** (see
+README §3b — `framework` and `kind` enums are case-sensitive
+lowercase). Verify `path_template` matches the normalised servlet form
+(e.g. `/users/{id}` → `/users/{}`).
+
+### 7.3 ☐ `@CodebaseClient` on a method creates an outbound HTTP_CALLS edge
+
+**Verification prompt:**
+
+> Pick one method where you added
+> `@CodebaseClient(clientKind="rest_template", targetService="<svc>", path="…", method="…")`.
+> Call `find_callees({"fqn_or_signature":"<your-method-fqn>","depth":1,"limit":20})`.
+> An outbound edge to a Route node (the target service's endpoint)
+> should appear. Then call `graph_meta()` and report
+> `http_clients_from_brownfield_pct` (should be > 0).
+
+**Expected (calibration):** N/A — fixture has 0 brownfield clients.
+
+**If failing → fix:** edge doesn't appear → the most common cause is
+that the target service / path has no `Route` node yet (the target
+side has to be indexed too). Verify with
+`get_route_by_path({"microservice":"<svc>","path_template":"<path>"})`;
+if it returns nothing, index the target service alongside the caller.
+
+### Red flags for Phase 7
+
+- `routes_from_brownfield_pct = 0` after adding `@CodebaseRoute` →
+  graph wasn't rebuilt, or annotation didn't parse (typo in enum
+  value)
+- Brownfield override "tightens" but doesn't override → this is
+  **intended behaviour** (partial overrides are non-destructive — see
+  README §"Caller-side brownfield overrides")
+
+---
+
+## After completing all phases
+
+If everything is green:
+
+- Save your `.lancedb-mcp.yml` and any `@Codebase*` annotations to
+  source control. They're now part of your project's brownfield
+  contract.
+- Pin the ontology version (9) somewhere in your README so future devs
+  know what shape of graph this MCP produces.
+- Run `graph_meta` weekly (or after big refactors) and diff the
+  `counts` block — surprise drops are the leading indicator of broken
+  indexing.
+
+If something is red and the "→ fix" doesn't help:
+
+- Capture `graph_meta()` output, `/tmp/verify_build.log` last 30
+  lines, and the failing prompt. File an issue against the repo with
+  those three artefacts; they're enough to diagnose 90% of cases.
+
+---
+
+## Appendix — calibration source
+
+All calibration numbers in this checklist come from
+`tests/bank-chat-system` indexed with `master @ d62b48c` (post PR-H1
+merge, ontology version 9). Reproduce with:
+
+```bash
+cd /path/to/java-enterprise-codebase-rag
+rm -rf /tmp/calib_kuzu
+python build_ast_graph.py \
+  --source-root tests/bank-chat-system \
+  --kuzu-path /tmp/calib_kuzu --verbose
+```
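The rates that items 1.2 and 4.3 read off the build log can also be computed mechanically. A minimal Python sketch, assuming the `[pass1]` and `[pass3]` lines keep the exact shape shown in the calibration block; the function names and regexes are illustrative and are not part of this repo:

```python
from __future__ import annotations

import re

# Hypothetical helpers. The patterns assume the log shape printed in the
# checklist's calibration block and may need adjusting for other builds.
PASS1_RE = re.compile(r"\[pass1\] parsed (\d+) files.*?(\d+) parse errors")
PASS3_SITES_RE = re.compile(r"\[pass3\] Call resolution: (\d+) sites")
PASS3_BUCKET_RE = re.compile(r"(\d+) ([a-z ]+?) \(\d+(?:\.\d+)?%\)")


def parse_error_rate(log_text: str) -> float:
    """Return parse_errors / files * 100, taken from the [pass1] line."""
    match = PASS1_RE.search(log_text)
    if match is None:
        raise ValueError("no [pass1] line found")
    files, errors = int(match.group(1)), int(match.group(2))
    return 100.0 * errors / files if files else 0.0


def pass3_rates(log_text: str) -> dict[str, float]:
    """Recompute each [pass3] bucket as a percentage of all call sites."""
    line = next((l for l in log_text.splitlines() if "[pass3]" in l), None)
    if line is None:
        raise ValueError("no [pass3] line found")
    sites_match = PASS3_SITES_RE.search(line)
    if sites_match is None:
        raise ValueError("[pass3] line has no site count")
    sites = int(sites_match.group(1))
    rates: dict[str, float] = {}
    for count, name in PASS3_BUCKET_RE.findall(line):
        key = name.strip().replace(" ", "_")  # e.g. "unresolved_callee"
        rates[key] = 100.0 * int(count) / sites if sites else 0.0
    return rates


if __name__ == "__main__":
    with open("/tmp/verify_build.log", encoding="utf-8") as handle:
        text = handle.read()
    print("parse error rate:", parse_error_rate(text))
    for bucket, pct in pass3_rates(text).items():
        print(f"{bucket}: {pct:.2f}%")
```

On the fixture's log lines this yields a 0% parse-error rate and an unresolved-callee rate of 294 / 800 = 36.75%, which the log prints as 36.8%. Recomputing from the raw counts rather than trusting the printed percentages keeps the check independent of how the build rounds.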