diff --git a/README.md b/README.md index bd2a023..6d99bd8 100644 --- a/README.md +++ b/README.md @@ -12,6 +12,17 @@ The product vision for this tooling is proposed in [`propose/PRODUCT-VISION.md`] > for the assumptions this MCP makes about a Java repo (annotations, DI patterns, > service layout, naming) and a per-file map of where to edit the bundle if you > can't or don't want to refactor your codebase to match. +> +> **Driving this MCP from an agent:** +> - [`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md) — copy-paste-into-`QWEN.md` / +> `CLAUDE.md` block. Forced reasoning preamble, decision tree, full +> reference for all 22 tools, ontology glossary (v9), recovery playbook, +> slash-style aliases. Engineered for weak / mid models that otherwise +> pick the wrong tool. +> - [`docs/MANUAL-VERIFICATION-CHECKLIST.md`](./docs/MANUAL-VERIFICATION-CHECKLIST.md) +> — 7-phase agent-driven verification you run after indexing your real +> project. Each item has a copy-paste prompt and calibration data from +> `tests/bank-chat-system`. ## 1. Install diff --git a/docs/AGENT-GUIDE.md b/docs/AGENT-GUIDE.md new file mode 100644 index 0000000..623e84a --- /dev/null +++ b/docs/AGENT-GUIDE.md @@ -0,0 +1,506 @@ +# Agent Guide — `java-enterprise-codebase-rag` MCP + +> **How to use this file.** Copy the block between the `` markers below into your project's `QWEN.md`, +> `CLAUDE.md`, `AGENTS.md`, or equivalent. The block is self-contained: +> all 22 MCP tools, the ontology glossary (v9), a forced reasoning +> preamble, a decision tree, a recovery playbook, and slash-style prompt +> aliases. Update by re-pulling from this repo when the ontology bumps. +> +> Why this exists: weak / mid models pick the wrong tool, pass simple +> names where FQNs are required, or ask vector search for things the +> graph already knows exactly. This guide is engineered to keep them on +> the rails. +> +> Calibrated against ontology version **9** (see `java_ontology.py`). 
+ +--- + + + +## user-rag MCP — agent operating manual + +This MCP indexes Java enterprise projects into two stores: + +- **LanceDB** — vector + hybrid search over Java/SQL/YAML chunks, scoped + by role / capability / module / microservice. +- **Kuzu graph** — exact symbol graph with edges `EXTENDS`, `IMPLEMENTS`, + `INJECTS`, `DECLARES`, `CALLS`, `EXPOSES`, `HTTP_CALLS`, `ASYNC_CALLS`, + plus `Route` nodes for inbound endpoints (HTTP, Kafka, Feign, …). + +**Use this MCP when** the user asks anything that needs whole-codebase +context: "who calls X", "what handles route Y", "trace the flow when Z +happens", "what breaks if I change this", "where is concept C +implemented", "review this PR diff for blast radius". + +**Do NOT use this MCP when** the answer is fully visible in the file the +user is currently editing, or when the question is about a third-party +library you can answer from training data. Prefer the cheapest tool that +answers the question. + +### Forced reasoning preamble (every tool call) + +Before every MCP tool call, output **one short line** with this shape: + +``` +Q-class: +Pick: Why: <≤8 words> +``` + +Then, **before issuing the call**, sanity-check arguments against +*Argument shapes* below: arrays must be JSON arrays (not stringified), +method needles must be `pkg.Type#method(SimpleArg1,SimpleArg2)`, and +path templates must be the normalised servlet form. Most weak-model +failures here are not wrong-tool-choice but wrong-argument-shape. + +Then make the tool call. If the first call returns nothing useful, do +**not** loop the same tool with random tweaks — go to **Recovery +playbook** at the bottom of this guide. + +### Argument shapes — what the parser actually wants + +Two classes of mistakes burn the most calls. Read this once, then refer +back when a call returns nothing or fails validation. + +#### A. JSON, not stringified JSON + +FastMCP / Pydantic enforce real JSON types. 
**Pass arrays as JSON arrays +and objects as JSON objects — never as a string containing JSON.** This +is the single most common mistake on weak models because they over-quote +defensively. + +| Param | ✅ Right | ❌ Wrong (will fail or coerce poorly) | +| -------------------- | ----------------------------------------------- | ----------------------------------------------------- | +| `exclude_roles` | `["DTO","ENTITY","CONFIG","OTHER"]` | `"[\"DTO\",\"ENTITY\",\"CONFIG\",\"OTHER\"]"` | +| `edge_types` | `["EXTENDS","IMPLEMENTS"]` | `"EXTENDS,IMPLEMENTS"` or `"[EXTENDS,IMPLEMENTS]"` | +| `confirm` | `true` | `"true"` | +| `limit` | `20` | `"20"` | +| `min_confidence` | `0.9` | `"0.9"` | +| any optional you don't want | omit the key entirely | `null` is OK; empty string `""` is NOT (treated as a real filter that matches nothing) | +| string enums (`role`, `framework`, `capability`, `kind`) | `"CONTROLLER"` | `["CONTROLLER"]` (single value, not a list) | + +**One-line rule:** if the schema says `list[str]`, send `["a","b"]`. If +it says `str`, send `"a"`. Don't wrap arrays in extra quotes "to be +safe." + +#### B. Method needles — FQN + signature, with simple type names + +`find_callers` / `find_callees` accept three needle shapes. The signed +FQN form is the only one that's unambiguous on overloaded methods. + +**The FQN format is exactly:** + +``` +.[.]#(,,…) +``` + +Key rules: + +- **Simple type names only**, no package prefixes inside the parens: + `String`, not `java.lang.String`. `List`, not `java.util.List`. +- **Generics are erased**: `List` → `List`. `Map` → `Map`. +- **Arrays / varargs**: not formally tested in fixture; if your + search misses, try the simple base type without `[]` first. +- **No spaces** between commas and types: `(String,String,String)`. +- **No-arg method**: trailing `()`. +- **Constructor**: methodName is ``. Example: + `com.foo.Bar#(String,int)`. 
+- **Nested type**: dot-separated under the outer type, before the `#`: + `com.foo.Outer.Inner#method()`. + +**Examples (verbatim from `tests/bank-chat-system`):** + +``` +✅ com.bank.chat.assign.ChatAssignApplication#main(String) +✅ com.bank.chat.assign.config.AssignProperties.ChatCore#setBaseUrl(String) +✅ com.bank.chat.assign.integration.ChatCoreJoinClient#joinOperator(String,String,String) +✅ com.bank.chat.assign.service.OperatorSessionService#openSession(String,List) +✅ com.bank.chat.assign.ChatAssignApplication#() +``` + +**The three needle shapes, ranked by precision:** + +1. **Method FQN with signature** — unambiguous, exact match. Use + whenever you have it. +2. **Type FQN** (e.g. `com.foo.Bar`) — fans out to ALL declared + methods of that type via `DECLARES`. Useful for "who calls anything + on this class." +3. **Simple method name** (e.g. `joinOperator`) — matches every method + of that name across the codebase. May return many rows; only use + when you don't know the type. + +**Overloaded methods — the failure you actually hit.** If a class has +both `bar()` and `bar(String)` and you pass `Foo#bar()` expecting +both, you'll only get the no-arg one. To resolve: + +- Don't know the signature? **Drop the parens** entirely and use just + the simple name (`bar`) — you'll get rows for every overload, then + pick the one(s) you want and re-query with full FQN+sig. +- Or: pass the **type FQN** (`com.foo.Foo`) which fans out via + `DECLARES` and includes every method of every overload. +- Or: call `codebase_search({"query":"Foo bar","auto_hybrid":true,"limit":5})` + to recover the exact stored FQN, then retry with that string. + +**How to find the FQN you need:** + +- From `codebase_search` results: each `CodeChunkHit` carries `fqn` + for the enclosing symbol — copy it verbatim. +- From `list_by_role` / `list_by_annotation` / `find_implementors`: + each `SymbolDto` has an `fqn` field for the type. 
Then run + `find_callees({"fqn_or_signature":"","depth":1})` to list + its methods with their signed FQNs. +- Phantom rows (`?HashMap<>#(0)`, `?RestTemplate#(0)`) are + internal placeholders for unindexed external types. **Never pass + them as a needle** — they won't match anything. + +#### C. Path templates — the normalised servlet form + +`get_route_by_path` and `find_route_callers` expect `path_template` in +the form the graph stores, NOT the raw `@RequestMapping` value: + +| Source code annotation | What to pass | +| ------------------------------------ | ----------------------- | +| `@GetMapping("/users/{id}")` | `"/users/{id}"` | +| `@PostMapping("/users/{id}/avatar")` | `"/users/{id}/avatar"` | +| `@RequestMapping("/api")` + method `@GetMapping("/me")` | the **concatenated** template `"/api/me"` | +| SpEL only: `@GetMapping("${app.endpoint}")` | empty string — use `list_routes` with `path_prefix` instead | + +If unsure, run `list_routes({"path_prefix":"/users"})` first and copy +the `path` field from a result. 
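
The concatenation rule in the table above can be sketched in Python, assuming you assemble `path_template` by hand from annotation values. `join_path_template` is a hypothetical helper, not part of this MCP, and it does not claim to reproduce the indexer's own normalisation; the `path` field returned by `list_routes` remains the authoritative value.

```python
def join_path_template(class_prefix: str, method_path: str) -> str:
    """Join a class-level mapping prefix with a method-level mapping path.

    Hypothetical helper for illustration only; it mirrors the table above.
    """
    parts = [class_prefix, method_path]
    # SpEL-only values such as "${app.endpoint}" have no usable template,
    # so the caller should fall back to list_routes with path_prefix.
    if any("${" in part for part in parts):
        return ""
    segments = [part.strip("/") for part in parts if part.strip("/")]
    return "/" + "/".join(segments)


print(join_path_template("/api", "/me"))             # /api/me
print(join_path_template("", "/users/{id}/avatar"))  # /users/{id}/avatar
```

An empty return value is the cue to call `list_routes` with `path_prefix` instead of `get_route_by_path`.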
+ +### Decision tree — pick the first tool + +| User asks… | First tool | Typical follow-up | +| ---------------------------------------------------------------- | --------------------------------------------------- | ---------------------------------------------- | +| "How does X work" / "where is concept Y" (natural language) | `codebase_search` | `find_callers` on the top hit's FQN | +| "What happens when in " (end-to-end behaviour) | `trace_flow` | `find_callees` on stage-1 symbols | +| "Who calls method/class M" | `find_callers` (FQN preferred) | Widen with `depth`, narrow with `microservice` | +| "What does method M call" | `find_callees` | `graph_neighbors` for type wiring | +| "Show me the handler for HTTP path /foo/bar" | `get_route_by_path` then `find_route_handlers` | `trace_request_flow` | +| "List all HTTP endpoints / Kafka topics" | `list_routes` (filter by `framework`) | `find_route_handlers` per id | +| "Who calls route /foo/bar" | `find_route_callers` | `trace_request_flow` | +| "All controllers / services / repositories in service X" | `list_by_role` | `list_by_role` + `capability=` filter | +| "Everything annotated `@Transactional`" | `list_by_annotation` | `find_callers` per result | +| "Everything that produces / listens to messages" | `list_by_capability` (`MESSAGE_PRODUCER` / `_LISTENER`) | `find_callees` | +| "Who implements this interface" | `find_implementors` | `find_callers` on each impl | +| "Who extends this class" | `find_subclasses` | `impact_analysis` | +| "Where is X injected" | `find_injectors` | `find_callers` | +| "What breaks if I change this type" | `impact_analysis` | `analyze_pr` if there's a diff | +| "Review this PR / diff" | `analyze_pr` (paste the unified diff) | `find_route_callers` on touched routes | +| "Why is path X ignored / not indexed" | `diagnose_ignore` | — | +| "Is the index healthy / what version / how big" | `graph_meta` | `list_code_index_tables` | +| "Rebuild the index" (slow, requires confirm) | 
`refresh_code_index` | `graph_meta` to verify | + +**Two rules of thumb:** + +1. **Graph beats vector for exact questions.** "Who calls `Foo#bar()`" + is a graph question — never use `codebase_search` for that. +2. **Vector beats graph for fuzzy questions.** "How does authentication + work" should start with `codebase_search` (or `trace_flow`); the + graph alone won't surface the right entry point. + +### Tool reference — all 22 tools + +Grouped by purpose. Required arguments are **bold**; common mistakes are +flagged with ⚠. + +#### Search (LanceDB) + +##### `codebase_search` — vector / hybrid search over Java / SQL / YAML chunks + +- **Args:** **`query`** (string, natural language or identifier). + Useful optionals: `table` (`java`|`sql`|`yaml`|`all`, default `java`), + `limit` (1-50, default 5), `role`, `exclude_roles`, `capability`, + `module`, `microservice`, `package_prefix`, `auto_hybrid` (set true + for identifier-ish queries like `DistributionChunkService`), + `graph_expand` (BFS through Kuzu after top-k), `context_neighbors` + (attach 1-2 adjacent chunks for context). +- ⚠ For behavioural questions, set + `exclude_roles=["DTO","ENTITY","CONFIG","OTHER"]` — DTOs and entities + are noisy and rarely the answer. +- ⚠ `hybrid=true` and `auto_hybrid=true` require a single `table` (not + `all`). +- **Example:** `{"query":"how chat assigns on operator","exclude_roles":["DTO","ENTITY","CONFIG","OTHER"],"limit":8}` + +##### `list_code_index_tables` — index health summary + +- **Args:** none. +- Returns LanceDB URI, embedding model, project root, refresh-allowed + flag, graph metadata (use `graph_meta` for just the graph side). + +#### Symbols (Kuzu graph — type wiring) + +##### `find_implementors` — classes implementing an interface + +- **Args:** **`name`** (interface simple name or FQN). Optionals: + `module`, `microservice`, `capability`, `limit`. 
+- ⚠ Pass simple name (`PaymentService`) **or** FQN + (`com.acme.PaymentService`) — both work via the simple-name index. + +##### `find_subclasses` — classes / interfaces extending a given type + +- **Args:** **`name`**. Same optionals as `find_implementors`. + +##### `find_injectors` — types that inject (field/ctor/setter/Lombok) a given type + +- **Args:** **`name`** (the type **being** injected). Optional + `capability` filters the **consumer** (injecting class), not the + injected type. +- Returns edges with `mechanism`, `annotation`, `field_or_param`. + +##### `graph_neighbors` — generic bidirectional neighbour expansion + +- **Args:** **`name`**, `depth` (1-3, default 1), `direction` + (`out`|`in`|`both`, default `both`), `edge_types` (subset of + `EXTENDS`, `IMPLEMENTS`, `INJECTS`). +- Use this when none of the specialised tools fit (e.g. "find + everything one hop from `Foo` over implements + extends"). + +##### `impact_analysis` — reverse closure over INJECTS+IMPLEMENTS+EXTENDS + +- **Args:** **`name`**, `depth` (1-4, default 2), `limit` (default 300). +- Answers "who breaks if I change this type". Also returns + `cross_service_callers` for any route the impacted symbol exposes. + +#### Routes (inbound entry points) + +##### `list_routes` — list `Route` nodes (HTTP, Feign, Kafka, …) + +- **Args:** none required. Optionals: `microservice`, `framework` + (`spring_mvc`|`webflux`|`feign`|`kafka`|`rabbitmq`|`jms`|`stream`), + `path_prefix`, `method`, `limit`. +- ⚠ Routes with empty `framework` are ones the extractor couldn't + classify — usually annotation-only Kafka topic constants. If you + expected an HTTP route here, check brownfield overrides. + +##### `find_route_handlers` — symbols that EXPOSES a Route id + +- **Args:** **`route_id`** (e.g. `r:0a2bdd…`). +- ⚠ Feign **consumer** routes do NOT emit `EXPOSES` and return empty — + use `find_route_callers` instead. 
+ +##### `get_route_by_path` — resolve one Route by (microservice, path, method) + +- **Args:** **`microservice`**, **`path_template`**, optional `method`. +- ⚠ `path_template` must be the normalised servlet form: `{` `}` placeholders + are kept as `{}` (e.g. `/api/users/{}`). For SpEL-only routes + (`${kafka.topic}`) `path_template` is empty — use `list_routes` with + `path_prefix` instead. + +##### `find_route_callers` — who calls a Route (HTTP_CALLS / ASYNC_CALLS) + +- **Args:** either **`route_id`**, OR **`microservice`** + + **`path_template`** + optional `method`. +- Use this for cross-service dependency questions. + +##### `trace_request_flow` — inbound + outbound around one entry route + +- **Args:** **`entry_route_id`**, optional `max_hops`. +- Returns: callers (HTTP/ASYNC) → handler → outbound CALLS chain. Best + starting point for "what happens when this endpoint is hit". + +#### Calls (CALLS edges between methods) + +##### `find_callers` — inbound CALLS closure for a method or type + +- **Args:** **`fqn_or_signature`**. Three needle shapes (see *Argument shapes §B* for the full format spec): + - method FQN with sig (most precise): `com.foo.Bar#baz(String,int)` — simple type names only, no spaces, generics erased + - type FQN: `com.foo.Bar` (fans out to all methods via DECLARES) + - simple method name: `baz` (matches all overloads everywhere; useful as a recovery step) +- Optionals: `depth` (1-5, default 1), `limit`, `min_confidence` (e.g. + `0.9` to drop low-confidence chained-receiver edges), `exclude_external` + (default true — drops JDK / Spring / Lombok callers), `module`, + `microservice`. +- ⚠ For "who really calls this", set `min_confidence=0.9` and + `depth=1` first; widen if too narrow. + +##### `find_callees` — outbound CALLS closure + +- **Args / optionals:** same shape as `find_callers`. 
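
A minimal Python sketch of the signed-FQN rules from *Argument shapes §B*, assuming you start from fully qualified parameter types (for example, copied from an IDE). The helper names are hypothetical and not part of this MCP; arrays and varargs are passed through untouched because the guide marks them as not formally tested.

```python
def erase_generics(type_name: str) -> str:
    """Drop everything inside angle brackets, including nested generics."""
    depth = 0
    kept = []
    for ch in type_name:
        if ch == "<":
            depth += 1
        elif ch == ">":
            depth -= 1
        elif depth == 0:
            kept.append(ch)
    return "".join(kept)


def simple_type(type_name: str) -> str:
    """Erase generics, then strip the package prefix."""
    return erase_generics(type_name).strip().rsplit(".", 1)[-1]


def method_needle(type_fqn: str, method: str, arg_types: list[str]) -> str:
    """Build the needle: type FQN, '#', method name, comma-joined simple types."""
    args = ",".join(simple_type(a) for a in arg_types)  # no spaces after commas
    return f"{type_fqn}#{method}({args})"


needle = method_needle(
    "com.bank.chat.assign.service.OperatorSessionService",
    "openSession",
    ["java.lang.String", "java.util.List<java.lang.String>"],
)
print(needle)
```

The printed needle matches the `openSession(String,List)` example from the fixture, and a no-arg method yields the trailing `()` the format requires.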
+ +#### Roles & capabilities (multi-tag axes) + +##### `list_by_role` — graph symbols with a given role + +- **Args:** **`role`** (one of + `CONTROLLER|SERVICE|REPOSITORY|COMPONENT|CONFIG|ENTITY|CLIENT|MAPPER|OTHER`). + Optionals: `module`, `microservice`, `capability` (AND-filter), `limit`. +- ⚠ Use `OTHER` to find things the inference missed — these are + brownfield candidates. + +##### `list_by_annotation` — symbols whose annotation list contains a simple name + +- **Args:** **`annotation`** (simple name, e.g. `Transactional`, + `Async`). Optionals: `module`, `microservice`, `capability`, `limit`. +- ⚠ Pass the **simple** name without `@`. + +##### `list_by_capability` — symbols carrying a capability + +- **Args:** **`capability`** (one of + `MESSAGE_LISTENER|MESSAGE_PRODUCER|HTTP_CLIENT|SCHEDULED_TASK|EXCEPTION_HANDLER`). + Optionals: `module`, `microservice`, `limit`. + +#### Behavioural / cross-cutting + +##### `trace_flow` — end-to-end behavioural trace from a natural-language query + +- **Args:** **`query`**. Optionals: `microservice`, `module`, + `seed_limit` (default ~5), `stage_limit` (default ~8), `depth` + (hops-per-stage), `follow_calls` (default true). +- Picks seeds via vector search restricted to behavioural roles + (CONTROLLER / COMPONENT / SERVICE / CLIENT + MESSAGE_LISTENER / + SCHEDULED_TASK), then walks the graph in 3 role-ordered stages + (entrypoints → services → integrations). Each result row carries + `via: [{edge_type, from_fqn, hop}]` so you know **why** it's there. +- Use this for "what happens when X" questions instead of chaining 4 + separate tools. + +##### `analyze_pr` — map a unified diff to indexed symbols + risk score + +- **Args:** **`diff_unified`** (string, full `git diff` output). +- Returns: `changed_symbols`, `blast_radius_total`, + `cross_service_callers`, `routes_touched`, `risk_score` (0-1), + `risk_band`, `notes`. Binary hunks and renames are surfaced in + `notes` and skipped for symbol mapping. 
+ +#### Index management & diagnostics + +##### `graph_meta` — Kuzu metadata: counts, ontology version, build timestamp + +- **Args:** none. First tool to run on a fresh index — confirms + `ontology_version=9` and surfaces build counts. + +##### `diagnose_ignore` — explain why a path is ignored + +- **Args:** **`path`** (relative to project root or absolute inside + project). Returns the layer that decided + (`builtin_default`|`project_root`|`nested`|`gitignore`). + +##### `refresh_code_index` — rebuild LanceDB chunks + Kuzu graph (slow) + +- **Args:** **`confirm`** (must be `true`). Requires + `LANCEDB_MCP_ALLOW_REFRESH=1`. +- ⚠ Always call `graph_meta` after to verify the rebuild succeeded. + +### Ontology glossary (version 9) + +Source of truth: `java_ontology.py`. Pass these strings verbatim +(case-sensitive). + +#### Roles (`role` column on type-level Symbol nodes) + +`CONTROLLER`, `SERVICE`, `REPOSITORY`, `COMPONENT`, `CONFIG`, `ENTITY`, +`CLIENT`, `MAPPER`, `DTO`, `OTHER`. + +- `CLIENT` covers Feign clients (`@FeignClient`) and brownfield + `@CodebaseRole(CLIENT)`. As of ontology 9, plain `RestTemplate` + wrappers stay in their natural stereotype role (typically `SERVICE`) + unless you explicitly tag them. +- `OTHER` = the inference didn't recognise the type. Treat as a + brownfield candidate. + +#### Capabilities (multi-tag, may be empty) + +`MESSAGE_LISTENER`, `MESSAGE_PRODUCER`, `HTTP_CLIENT`, `SCHEDULED_TASK`, +`EXCEPTION_HANDLER`. + +- Capabilities are independent of role — a `@Service` can carry + `MESSAGE_PRODUCER` + `MESSAGE_LISTENER` simultaneously. +- `HTTP_CLIENT` fires for `@FeignClient` types and brownfield + `@CodebaseCapability(HTTP_CLIENT)`. RestTemplate-only wrappers do not + auto-promote. +- Capabilities are derived at the **type level**: method-level evidence + is aggregated up to the enclosing type. + +#### Route framework (on `Route` nodes) + +`spring_mvc`, `webflux`, `feign`, `kafka`, `rabbitmq`, `jms`, `stream`. 
+ +#### Route kind + +`http_endpoint`, `http_consumer`, `kafka_topic`, `rabbit_queue`, +`jms_destination`, `stream_binding`. + +- `feign` framework with `http_consumer` kind = a Feign declaration + registers an outbound contract; it does NOT expose an inbound handler + and won't appear in `find_route_handlers`. + +#### Client kind (on `HTTP_CALLS` / `ASYNC_CALLS` edges) + +`feign_method`, `rest_template`, `web_client`, `kafka_send`, +`stream_bridge_send`. + +#### Call match (resolution outcome on cross-service edges) + +`cross_service`, `intra_service`, `ambiguous`, `phantom`, `unresolved`. + +- `phantom` = the called type is referenced by name but has no Symbol + row (external library or unindexed code). Common and not always a + bug. +- `cross_service` = caller and callee are in different microservices + and the resolver had enough information to bind them. Goal is to + maximise this for legitimate inter-service calls. + +### Recovery playbook — when results look wrong + +| Symptom | Likely cause | Fix | +| ------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------- | ----------------------------------------------------------------------------------------------------- | +| `find_callers`/`find_callees` returns 0 rows | Wrong needle shape: pass FQN with sig (`com.foo.Bar#baz(String,int)`), not just `baz` | Run `codebase_search` with the simple name to recover the FQN, then retry | +| `find_callers`/`find_callees` returns LESS than expected on an overloaded method | Needle was `Foo#bar()` but the overload you wanted is `Foo#bar(String)` — the resolver only matched the no-arg one | Drop the parens (`bar`) to list all overloads, then re-query with the full FQN+sig of the right one. Or pass the type FQN to fan out via DECLARES. See *Argument shapes §B*. 
| +| Tool returns a validation / type error mentioning a list field | Stringified JSON: `"[\"DTO\"]"` instead of `["DTO"]` | Pass real JSON arrays. See *Argument shapes §A* table. | +| `path_template` filter returns nothing | Passed the raw annotation value, but the graph stores the concatenated servlet form | Run `list_routes({"path_prefix":"/your/prefix"})` and copy the exact `path` field, then retry | +| Tool says "graph unavailable" | Index not built or `LANCEDB_MCP_PROJECT_ROOT` not set | Run `graph_meta` to confirm; `refresh_code_index({"confirm":true})` if needed | +| Expected route is missing from `list_routes` | Framework not recognised by built-in extractor | Add `@CodebaseRoute(framework=…, kind=…, path=…, method=…)` per README §3b, then `refresh_code_index` | +| `list_by_role` shows a `*Controller` class as `OTHER` | Non-Spring web stack (JAX-RS, custom) | Add `@CodebaseRole(CodebaseRoleKind.CONTROLLER)` per README §3a, or `role_overrides.fqn` in YAML | +| `cross_service_calls_total = 0` but you know there are inter-service calls | Resolution mode is `brownfield_only` and call sites have no brownfield tag, OR target services unindexed | Switch to `cross_service_resolution: auto` in YAML, or tag with `@CodebaseClient` | +| `codebase_search` returns DTOs / config classes instead of behaviour | Default ranking; no role filter | Add `exclude_roles=["DTO","ENTITY","CONFIG","OTHER"]` | +| Identifier search returns junk | Pure vector lookup is fuzzy on identifiers | Set `auto_hybrid=true` (FTS + vector RRF) | +| Same query returns different results across runs | None — graph build is deterministic | If you actually see this, file a bug with `graph_meta` `built_at` from both runs | + +If two consecutive recovery attempts on the same intent fail, **stop +and report** the failure to the user with the tool name, the args you +tried, and what you got back. Do not loop further. 
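
The stop rule above can be sketched as a small retry budget in Python. This is an illustration under assumptions: `call_tool` is a hypothetical callable standing in for whatever your MCP client exposes, and a result is treated as "failed" simply when it returns no rows.

```python
def run_with_recovery(call_tool, first, recoveries, max_recoveries=2):
    """Try the first call, then at most `max_recoveries` recovery attempts.

    `first` and each entry of `recoveries` are (tool_name, args) pairs.
    `call_tool(tool_name, args)` is a hypothetical client function that
    returns a list of rows.
    """
    tried = []
    for tool, args in [first, *recoveries[:max_recoveries]]:
        rows = call_tool(tool, args)
        # Keep tool name, args and row count so the failure can be reported.
        tried.append({"tool": tool, "args": args, "rows": len(rows)})
        if rows:
            return {"status": "ok", "rows": rows, "tried": tried}
    return {"status": "stop_and_report", "rows": [], "tried": tried}


def fake_client(tool, args):
    """Stub client: only codebase_search finds anything."""
    return [{"fqn": "com.foo.Bar#baz(String,int)"}] if tool == "codebase_search" else []


result = run_with_recovery(
    fake_client,
    ("find_callers", {"fqn_or_signature": "baz"}),
    [("codebase_search", {"query": "Bar baz", "auto_hybrid": True, "limit": 5})],
)
print(result["status"], len(result["tried"]))
```

When the status is `stop_and_report`, the `tried` list already carries the tool names, arguments, and row counts that the rule asks you to show the user.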
+ +### Slash-style aliases (prompt templates, not real commands) + +Paste these into your prompt to nudge a weak model. They are just +shorthand for the right tool + args. + +- `/who-calls ` → `find_callers({"fqn_or_signature":"","depth":1,"min_confidence":0.9})`. **Pass the full signed FQN** (e.g. `com.foo.Bar#baz(String,int)`) — see *Argument shapes §B* for format. If you only have the simple name, query that first and re-issue with the exact FQN. +- `/calls-from ` → `find_callees({"fqn_or_signature":"","depth":1})`. Same FQN-with-signature rule — simple name will match all overloads but not let you target one. +- `/route [microservice]` → `list_routes({"path_prefix":"","method":"","microservice":""})` +- `/handler ` → `find_route_handlers({"route_id":""})` +- `/who-hits ` → `find_route_callers({"microservice":"","path_template":""})` +- `/why-no-route ` → 1) `list_by_role({"role":"OTHER"})` to confirm the type wasn't classified, 2) `list_by_annotation` for any custom annotation, 3) suggest brownfield `@CodebaseRoute` +- `/role-of ` → `find_implementors({"name":""})` if it's an interface; `list_by_role({"role":"…"})` to scan +- `/impact ` → `impact_analysis({"name":"","depth":2})` +- `/cross-service ` → 1) `impact_analysis`, 2) inspect `cross_service_callers`, 3) `find_route_callers` per route +- `/flow ` → `trace_flow({"query":"","seed_limit":5,"stage_limit":8})` +- `/diff-risk ` → `analyze_pr({"diff_unified":""})` +- `/health` → `graph_meta()` then `list_code_index_tables()` + +### One-liner: the canonical workflow for "explain feature X" + +1. `trace_flow({"query":"","seed_limit":5})` — get the role-ordered chain. +2. For each stage symbol whose hop is interesting: `find_callees` (depth 1) to fan out, `find_callers` (depth 1) to fan in. +3. If a `Route` shows up in stage 0: `trace_request_flow({"entry_route_id":""})` for the full inbound + outbound picture. +4. If anything looks wrong, run **Recovery playbook** before re-querying. 
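
Because the slash-style aliases are prompt templates rather than real commands, a client-side wrapper has to expand them itself. A minimal Python sketch, covering only a few single-argument aliases and assuming the alias and its argument are separated by the first space; `expand_alias` is a hypothetical helper, not something the MCP server provides.

```python
# Alias -> function producing (tool_name, args), copied from the list above.
ALIASES = {
    "/who-calls": lambda a: ("find_callers",
                             {"fqn_or_signature": a, "depth": 1, "min_confidence": 0.9}),
    "/calls-from": lambda a: ("find_callees", {"fqn_or_signature": a, "depth": 1}),
    "/handler": lambda a: ("find_route_handlers", {"route_id": a}),
    "/impact": lambda a: ("impact_analysis", {"name": a, "depth": 2}),
    "/flow": lambda a: ("trace_flow", {"query": a, "seed_limit": 5, "stage_limit": 8}),
    "/diff-risk": lambda a: ("analyze_pr", {"diff_unified": a}),
}


def expand_alias(prompt: str):
    """Split '/alias argument text' and return (tool_name, args)."""
    alias, _, arg = prompt.strip().partition(" ")
    if alias not in ALIASES:
        raise ValueError(f"unknown alias: {alias}")
    return ALIASES[alias](arg.strip())


tool, args = expand_alias("/who-calls com.foo.Bar#baz(String,int)")
print(tool, args)
```

The numeric defaults stay real JSON numbers (`1`, `0.9`) rather than strings, in line with *Argument shapes §A*.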
+ + + +--- + +## Maintenance notes (for the repo, not the agent) + +- Bump the **ontology version** sentence at the top of the BEGIN block + whenever `ONTOLOGY_VERSION` changes in `kuzu_queries.py`. +- When a new MCP tool is added in `server.py`, add it to (a) the + decision tree, (b) the tool reference, (c) a slash alias if the use + case is common. +- The forced-reasoning preamble adds ~30 tokens per tool call. That's + an intentional cost for substantially better tool selection on weak + models. Remove it if you're driving with Opus / GPT-5 / Sonnet 4.6 + and don't need the scaffolding. +- For the per-tool `Skills/` split (one file per tool / per workflow), + see the follow-up plan once usage patterns shake out from real + enterprise project use. diff --git a/docs/MANUAL-VERIFICATION-CHECKLIST.md b/docs/MANUAL-VERIFICATION-CHECKLIST.md new file mode 100644 index 0000000..7087f32 --- /dev/null +++ b/docs/MANUAL-VERIFICATION-CHECKLIST.md @@ -0,0 +1,636 @@ +# Manual Verification Checklist — `java-enterprise-codebase-rag` + +Use this **after** you've read `README.md` + `CODEBASE_REQUIREMENTS.md`, +applied any brownfield annotations, and built the index against your +real project. The checklist drives an MCP-aware agent (Qwen Code, +Claude Code, Cursor, …) through 7 phases of progressively deeper +verification. + +Each item has: + +- ☐ a checkbox +- a **Verification prompt** — paste verbatim into your agent +- **Expected (calibration)** — what the same prompt produces on + `tests/bank-chat-system` (the in-repo fixture, ontology v9). If your + numbers diverge wildly from the calibration values, that's a signal, + not a verdict — your project may simply be bigger or smaller; what + matters is the **shape** (proportions, error rates, presence of + expected edges). 
+- **If failing → fix** — concrete next step + +Calibration was captured against `tests/bank-chat-system` on +`master @ d62b48c` (post PR-H1, ontology version 9): 84 files, 92 +types, 474 members, 0 parse errors, 17 routes, 793 calls, 2 HTTP_CALLS, +5 ASYNC_CALLS, microservices = `chat-core` + `chat-assign`. + +--- + +## Pre-flight — build the graph and prepare the agent + +Run **once** before working through the phases: + +```bash +# 1. Build the graph against your project (verbose, deterministic) +rm -rf /tmp/verify_kuzu +python build_ast_graph.py \ + --source-root /path/to/your/project \ + --kuzu-path /tmp/verify_kuzu --verbose 2>&1 | tee /tmp/verify_build.log + +# 2. Read the summary lines (last ~10 lines of the log) +tail -12 /tmp/verify_build.log + +# 3. Point the MCP server at the new graph + run it from the agent of choice +export LANCEDB_MCP_PROJECT_ROOT=/path/to/your/project +export LANCEDB_MCP_KUZU_PATH=/tmp/verify_kuzu +# … then start your MCP client (Qwen Code / Claude Code) so it sees this MCP +``` + +> **Quick read of the build log.** The `[pass3]` line tells you +> call-resolution health. The `[pass4]` line is route extraction +> (`routes_resolved_pct` is the headline number — a fully Spring MVC +> service hits 95-100; Kafka-heavy services drop to 70-90 because some +> topics are SpEL `${…}`). `[pass6]` shows cross-service match results. 
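
To pull the headline number out of the build log without reading it by eye, a small Python sketch along these lines may help. It assumes the `[pass4]` line keeps the `routes_resolved_pct=<number>` shape shown in the fixture calibration; `routes_resolved_pct` is a hypothetical helper name, not part of the build script.

```python
import re

# Matches e.g. "routes_resolved_pct=81.8" inside the [pass4] summary line.
PASS4_PCT = re.compile(r"routes_resolved_pct=([0-9]+(?:\.[0-9]+)?)")


def routes_resolved_pct(log_text: str):
    """Return the route resolution percentage, or None if no [pass4] line has it."""
    for line in log_text.splitlines():
        if line.startswith("[pass4]"):
            match = PASS4_PCT.search(line)
            if match:
                return float(match.group(1))
    return None


# Sample line taken from the fixture calibration output.
sample = (
    "[pass4] Route extraction: emitted=11, exposes=11, skipped_unresolved=0, "
    "routes_resolved_pct=81.8, by_framework={'spring_mvc': 9, 'kafka': 2}"
)
print(routes_resolved_pct(sample))  # 81.8
```

On a real run you would pass the contents of `/tmp/verify_build.log` instead of the sample string.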
+ +**Calibration on `tests/bank-chat-system`:** + +``` +[pass1] parsed 84 files in 0.24s: 92 types, 474 members, 0 parse errors, 0 skipped +[pass2] emitted 10 EXTENDS, 14 IMPLEMENTS, 71 INJECTS, 8 phantoms in 0.00s +[pass3] Call resolution: 800 sites, 77 chained phantoms (9.6%), 294 unresolved callee (36.8%), 138 phantom receiver (17.2%), … +[pass4] Route extraction: emitted=11, exposes=11, skipped_unresolved=0, routes_resolved_pct=81.8, by_framework={'spring_mvc': 9, 'kafka': 2} +[pass5] HTTP_CALLS: 2 edges, ASYNC_CALLS: 5 edges +[pass6] http_match={'phantom': 2}, async_match={'intra_service': 1, 'phantom': 4}, cross_service_calls_total=0 +``` + +> Note: `cross_service_calls_total=0` on the fixture is **expected** — +> the fixture is intra-service-heavy. On a real multi-service project +> this should be > 0 (otherwise see Phase 5). + +--- + +## Phase 1 — Index health (4 items) + +### 1.1 ☐ Ontology version is 9 + +**Verification prompt:** + +> Call `graph_meta()`. Report `ontology_version`, `built_at`, +> `source_root`, and `parse_errors`. Does `ontology_version` equal `9`? + +**Expected (calibration):** `ontology_version: 9`, +`source_root: /home/user/workspace/user-rag/tests/bank-chat-system`, +`parse_errors: 0`. + +**If failing → fix:** older ontology means you're running a stale wheel +or an old graph file. Re-pull the repo, `git rev-parse HEAD`, then +rebuild from scratch with `rm -rf /tmp/verify_kuzu && python +build_ast_graph.py …`. + +### 1.2 ☐ Parse error rate is acceptable + +**Verification prompt:** + +> Call `graph_meta()`. Look at `counts.files` and `parse_errors`. Compute +> `parse_errors / files * 100`. If above 1%, name the most likely +> culprit by inspecting the build log (`/tmp/verify_build.log`) for +> `[parse-error]` lines. + +**Expected (calibration):** `0 / 84 = 0%`. + +**If failing → fix:** > 5% means tree-sitter is choking — usually +non-UTF-8 files or generated sources you forgot to ignore. 
Add to +`.gitignore` or to the project's `lancedb_mcp_ignore`. Re-run +`diagnose_ignore({"path":"src/main/generated"})` to confirm the rule +took effect. + +### 1.3 ☐ Symbol counts match the project's rough scale + +**Verification prompt:** + +> Call `graph_meta()`. Report `counts.types`, `counts.members`, +> `counts.injects`. For a back-of-envelope sanity check, run +> `find src -name '*.java' | wc -l` outside the agent to count source +> files and compare: types should be ~1 per non-trivial file. + +**Expected (calibration):** 92 types from 84 files (= 1.10 types/file — +nested classes account for the slight overshoot), 474 members, 71 +injects. + +**If failing → fix:** types ≪ files usually means tree-sitter parser +errors swallowed type declarations. Cross-check Phase 1.2. + +### 1.4 ☐ LanceDB tables exist and are readable + +**Verification prompt:** + +> Call `list_code_index_tables()`. Report `lancedb_uri`, +> `embedding_model`, the list of tables, and `refresh_enabled`. Then +> run `codebase_search({"query":"main","table":"java","limit":1})`. +> Did it return at least 1 hit? + +**Expected (calibration):** tables include `java`, `sql`, `yaml`; the +search returns ≥1 chunk. + +**If failing → fix:** missing tables → run +`refresh_code_index({"confirm":true})` (slow, requires +`LANCEDB_MCP_ALLOW_REFRESH=1`). Empty results from `codebase_search` → +the embedding model didn't load; check `SBERT_MODEL` env and disk +space. + +### Red flags for Phase 1 + +- `parse_errors / files > 5%` → ignore rules wrong +- `routes = 0` and you have controllers → see Phase 3 +- `injects = 0` and you have any DI → built-in inference broken, + rebuild + +--- + +## Phase 2 — Roles & capabilities (5 items) + +### 2.1 ☐ Controllers are recognised + +**Verification prompt:** + +> Call `list_by_role({"role":"CONTROLLER","limit":200})`. Then call +> `codebase_search({"query":"controller","table":"java","limit":50, +> "exclude_roles":["CONTROLLER"]})`. 
From the second result list, +> identify any class whose simple name ends in `Controller` / +> `Resource` / `Endpoint`. Report each as a candidate brownfield +> override. + +**Expected (calibration):** 5 CONTROLLERs (`ChatIngressController`, +`JoinOperatorController`, `DevAssignmentController`, +`ChatManagementController`, `OperatorManagementController`). Zero +`*Controller` classes appear in the second list. + +**If failing → fix:** for each candidate not classified, add either +`@CodebaseRole(CodebaseRoleKind.CONTROLLER)` (README §3a) or a +`role_overrides.fqn` entry in `.lancedb-mcp.yml`. Rebuild. + +### 2.2 ☐ Services and repositories are recognised + +**Verification prompt:** + +> Call `list_by_role({"role":"SERVICE","limit":200})` and +> `list_by_role({"role":"REPOSITORY","limit":200})`. Spot-check 3 +> service results: read each via `codebase_search` to confirm they +> contain business logic (not DTOs). Then call +> `list_by_role({"role":"OTHER","limit":100})` and report any class +> whose simple name ends in `Service`, `Repository`, `Dao`, or `Repo`. + +**Expected (calibration):** 7 SERVICEs (incl. `ChatManagementService`, +`DistributionChunkService`, `OperatorSessionService`); REPOSITORYs +exist in real Spring projects but the fixture has 0 due to in-memory +stubs. No `*Service` / `*Repository` classes in OTHER. + +**If failing → fix:** brownfield override per 2.1. + +### 2.3 ☐ Feign clients carry CLIENT + HTTP_CLIENT + +**Verification prompt:** + +> Call `list_by_role({"role":"CLIENT","capability":"HTTP_CLIENT","limit":50})`. +> Then call `list_by_annotation({"annotation":"FeignClient","limit":50})`. +> Every `@FeignClient`-annotated type should appear in the first list. +> Report any divergence. + +**Expected (calibration):** the fixture has Feign-style call sites but +0 `@FeignClient` classes (it uses RestTemplate); on real projects, +counts should match exactly. 
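The divergence check in 2.3 boils down to a set difference. A minimal sketch, assuming you have pasted the FQNs from the two tool results into plain Python lists; the function name and the sample FQNs are hypothetical, and nothing here assumes the actual response shape of `list_by_role` or `list_by_annotation`:

```python
def feign_divergence(client_fqns, annotated_fqns):
    """Compare the CLIENT + HTTP_CLIENT list with the @FeignClient list."""
    clients = set(client_fqns)
    annotated = set(annotated_fqns)
    return {
        # Annotated with @FeignClient but not classified: the drift 2.3 looks for.
        "missing_capability": sorted(annotated - clients),
        # Classified without @FeignClient: reported too, since 2.3 expects
        # the two counts to match.
        "extra_clients": sorted(clients - annotated),
    }


# Hypothetical FQNs, for illustration only.
report = feign_divergence(
    ["com.example.billing.BillingClient"],
    ["com.example.billing.BillingClient", "com.example.users.UserClient"],
)
print(report)
```

An empty `missing_capability` list is the passing state for this item.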
+ +**If failing → fix:** as of ontology 9 (PR-H1), `@FeignClient` → +`role=CLIENT` + `capability=HTTP_CLIENT`. If you see drift, run +`graph_meta` and confirm `ontology_version=9`. If yes and still +broken, re-index — may be a stale graph. + +### 2.4 ☐ Message listeners and producers are detected + +**Verification prompt:** + +> Call `list_by_capability({"capability":"MESSAGE_LISTENER","limit":50})` +> and `list_by_capability({"capability":"MESSAGE_PRODUCER","limit":50})`. +> Then `list_by_annotation({"annotation":"KafkaListener","limit":50})` +> and confirm all results from the annotation query also appear in the +> capability query. Repeat for `RabbitListener`, `JmsListener`, and +> `EventListener` if your project uses them. + +**Expected (calibration):** 2 listeners (`DistributionTriggerListener`, +`ChatKafkaListener`) and 2 producers (`DistributionTriggerPublisher`, +`FollowUpKafkaPublisher`). + +**If failing → fix:** custom listener annotations → meta-annotation +walk should pick them up automatically (Layer A). If not, add to +`role_overrides.annotations` in `.lancedb-mcp.yml` (README §"Brownfield +overrides"). + +### 2.5 ☐ OTHER role is small relative to type count + +**Verification prompt:** + +> Call `list_by_role({"role":"OTHER","limit":500})` and report the +> count. Compute `OTHER / total_types` from `graph_meta().counts.types`. +> What fraction of OTHER are obviously utility classes (exceptions, +> records, internal helpers) vs candidates the inference should have +> handled? + +**Expected (calibration):** 43 OTHER out of 92 types (47%) — fixture +has many record DTOs and helper classes. On a real project this should +be < 30% if you're well-annotated; 30-50% suggests you need a few +brownfield overrides. + +**If failing → fix:** > 60% OTHER almost always means a non-Spring +stack the inference doesn't know — add `role_overrides.annotations` for +your custom stereotypes. 
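The arithmetic behind 2.5 can be sketched in a few lines. This is only an illustration of the thresholds quoted above, assuming you read the two counts off the tool results by hand; the function name is hypothetical, and the label for the 50-60% band is an assumption because the checklist gives no guidance for it:

```python
def other_ratio_verdict(other_count, total_types):
    """Return (rounded percentage, verdict) for the OTHER role share in 2.5."""
    if total_types <= 0:
        raise ValueError("total_types must be positive")
    pct = other_count / total_types * 100
    if pct < 30:
        verdict = "ok: well-annotated"
    elif pct <= 50:
        verdict = "consider a few brownfield overrides"
    elif pct <= 60:
        # Not covered by the checklist; this label is an assumption.
        verdict = "borderline"
    else:
        verdict = "likely an unrecognised stack: add role_overrides.annotations"
    return round(pct), verdict


# Calibration fixture: 43 OTHER out of 92 types.
print(other_ratio_verdict(43, 92))
```

On the fixture's numbers this gives 47%, which lands in the "few brownfield overrides" band, consistent with the calibration note about record DTOs and helper classes.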
+ +### Red flags for Phase 2 + +- `*Controller` classes in `OTHER` → JAX-RS or custom web framework + not annotated +- Feign clients without `HTTP_CLIENT` capability → ontology drift, + rebuild +- `MESSAGE_LISTENER` count = 0 in a Kafka-heavy project → meta-walk + failed to find your annotation + +--- + +## Phase 3 — Routes (4 items) + +### 3.1 ☐ Route count and framework distribution + +**Verification prompt:** + +> Call `graph_meta()` and report `routes_total`, `routes_by_framework`, +> `routes_resolved_pct`, `routes_from_brownfield_pct`. Then +> `list_routes({"limit":500})` to see them. Does the framework mix +> match what you'd expect (e.g. mostly `spring_mvc` for an HTTP +> service)? + +**Expected (calibration):** `routes_total=17`, +`routes_by_framework={spring_mvc: 9, kafka: 2}` (the remaining 6 are +extracted but unframework'd Kafka topic constants), +`routes_resolved_pct=81.8`, `routes_from_brownfield_pct=0.0`. + +**If failing → fix:** `routes_resolved_pct < 60` on a Spring project +means many `@RequestMapping` paths are SpEL/`${…}` (acceptable) or +your handler types weren't classified as CONTROLLER (Phase 2.1). + +### 3.2 ☐ Every controller exposes ≥1 route + +**Verification prompt:** + +> Call `list_by_role({"role":"CONTROLLER","limit":200})`. For each +> result FQN, call `find_callees({"fqn_or_signature":"","depth":1, +> "limit":5})` to confirm it has methods. Then call +> `list_routes({"limit":500})` and verify each controller appears +> at least once in the routes' handler set (run +> `find_route_handlers` on a sample of route ids). + +**Expected (calibration):** all 5 controllers in the fixture expose +at least one HTTP route (9 routes total / 5 controllers). + +**If failing → fix:** if a controller has no route, the framework +isn't recognised on its methods. Add `@CodebaseRoute` per README §3b. + +### 3.3 ☐ HTTP routes have non-empty path AND method + +**Verification prompt:** + +> Call `list_routes({"framework":"spring_mvc","limit":200})`. 
Report +> any route where `path` is empty or `method` is empty. (Empty `path` +> with `framework=spring_mvc` usually means `@RequestMapping` with no +> path — programmatic routing — which is rare and worth investigating.) + +**Expected (calibration):** all 9 spring_mvc routes have non-empty +`path` and `method`. + +**If failing → fix:** unresolvable SpEL paths are normal in some +`@RequestMapping` forms — accept them. But if a route has +`framework=spring_mvc` and no path, it's likely a route you should +override with `@CodebaseRoute`. + +### 3.4 ☐ Kafka topics are correct (topics, brokers, kinds) + +**Verification prompt:** + +> Call `list_routes({"framework":"kafka","limit":200})`. For each +> result, confirm: `kind=kafka_topic` and `topic` is non-empty. Cross- +> reference against your project's `application.yml` / +> `application.properties` Kafka topic names. + +**Expected (calibration):** 2 kafka routes +(`ChatTopics.INCOMING`, `${assign.kafka.distribution-topic}`). The +6 unframework'd Kafka rows in 3.1 are SpEL constants the extractor +couldn't resolve — they show up but with empty framework. + +**If failing → fix:** for unresolved topics that you DO know the +literal name of, use brownfield route override: +`@CodebaseRoute(framework=kafka, kind=kafka_topic, topic="my.topic")` +on the listener method (README §3b). + +### Red flags for Phase 3 + +- `routes_total = 0` → no controllers were classified or framework not + recognised +- HTTP routes with empty `method` → annotation extractor didn't see + `@GetMapping` / `@PostMapping` +- `routes_from_brownfield_pct` jumped after a refactor → you broke a + built-in extraction; check that ontology version is still 9 + +--- + +## Phase 4 — Call graph (3 items) + +### 4.1 ☐ Pick a known method, verify `find_callers` matches IDE + +**Verification prompt:** + +> Pick one method in your project that you know has 3-5 callers (for +> example a service method called by 1-2 controllers and 1-2 other +> services). 
State its FQN+signature. +> Call `find_callers({"fqn_or_signature":"#()","depth":1,"min_confidence":0.9,"limit":50})`. +> Open your IDE, run "Find Usages" on the same method, and compare: +> for each IDE caller, does it appear in the MCP result? List +> mismatches. + +**Expected (calibration):** any service method like +`com.bank.chat.assign.service.DistributionService#assignNext()` should +have 1-3 callers. Whether your IDE matches MCP exactly depends on: +reflection (won't show in MCP), generated code (depends on indexing +config), and JDK external code (filtered by `exclude_external`). + +**If failing → fix:** if MCP misses callers your IDE finds, lower +`min_confidence` to `0.0` and retry. If still missing, the call site +was resolved as `phantom` — check your generic / reflection-heavy code +isn't dominating. + +### 4.2 ☐ End-to-end chain reproduces via `find_callees` + +**Verification prompt:** + +> Pick one HTTP entry point. Call `list_routes({"framework":"spring_mvc","limit":1})`, +> grab the route id, then `find_route_handlers({"route_id":""})` +> to get the handler FQN. Then `find_callees` on the handler with +> `depth=2`. Does the chain reach a service method (depth 1) and then +> a repository / external call (depth 2)? + +**Expected (calibration):** `JoinOperatorController#joinOperator` → +`ChatOrchestrationService#…` → repository / Kafka publisher. + +**If failing → fix:** if depth-2 returns nothing, your service classes +might be classified as OTHER (back to Phase 2.2). Or `min_confidence` +is filtering legit edges — try `min_confidence=0.0`. + +### 4.3 ☐ Phantom rate is acceptable + +**Verification prompt:** + +> Look at the `[pass3]` line in `/tmp/verify_build.log`. Report +> `chained_phantoms %`, `unresolved_callee %`, `phantom_receiver %`. + +**Expected (calibration):** chained phantoms 9.6%, unresolved callee +36.8%, phantom receiver 17.2%. 
The fixture has many cross-service +references that legitimately resolve to phantoms (other-service types +that aren't in the same indexing root). + +**If failing → fix:** > 50% unresolved on a single-service indexing +likely means you didn't include the project's library jars or generated +sources path. > 30% chained phantoms can mean overly fluent APIs the +resolver can't follow — usually accept as a known limitation. + +### Red flags for Phase 4 + +- `find_callers` returns 0 with `min_confidence=0.0` → wrong needle + shape (use FQN+sig, not simple name) +- depth-2 closure returns nothing on a real chain → roles wrong (Phase + 2) + +--- + +## Phase 5 — Cross-service edges (3 items) + +### 5.1 ☐ HTTP_CALLS edges exist and resolve correctly + +**Verification prompt:** + +> Call `graph_meta()` and report `http_calls_total`, +> `http_calls_by_strategy`, `http_calls_match_breakdown`. Then pick +> a known cross-service HTTP call site (e.g. a Feign interface method +> on service A whose target is service B). Call +> `find_route_callers({"microservice":"","path_template":""})` +> and confirm A appears as a caller with `match=cross_service`. + +**Expected (calibration):** `http_calls_total=2`, +`http_calls_match_breakdown={phantom: 2}` (no cross-service in the +fixture). On a real multi-service project, expect `cross_service > 0`. + +**If failing → fix:** if you expected `cross_service` but got +`phantom`, the target service isn't in the same indexing root, OR the +`@FeignClient` URL doesn't resolve to a known service. Tag with +`@CodebaseClient(clientKind="feign_method", targetService="", +path="…")` (README §3c). + +### 5.2 ☐ ASYNC_CALLS edges connect producer → topic → listener + +**Verification prompt:** + +> Call `graph_meta()` and report `async_calls_total`, +> `async_calls_by_strategy`, `async_calls_match_breakdown`. Pick a +> known Kafka topic. Call +> `find_route_callers({"microservice":"","path_template":""})` +> with the route id of the consumer route. 
Confirm the producer +> appears as a caller. + +**Expected (calibration):** `async_calls_total=5`, +`async_calls_match_breakdown={intra_service: 1, phantom: 4}`. On real +projects with multi-service Kafka, expect `cross_service` matches. + +**If failing → fix:** mostly `phantom` on real cross-service async +calls means the consumer side doesn't have a `Route` node for the +topic. Either the listener isn't classified (Phase 2.4) or the topic +literal couldn't be resolved (Phase 3.4). + +### 5.3 ☐ `cross_service_resolution` flag flips behaviour as expected + +**Verification prompt:** + +> Pick one cross-service call site that resolved to `cross_service` in +> the default `auto` mode. Edit `.lancedb-mcp.yml` to add +> `cross_service_resolution: brownfield_only`. Rebuild +> (`refresh_code_index({"confirm":true})`) and re-run the same +> `find_route_callers` query. The previously-cross_service edge +> should now be `unresolved` (unless your call site is brownfield- +> tagged). Confirm. + +**Expected (calibration):** N/A — fixture has 0 cross-service edges. +Use this on your real project as a smoke test of the flag. + +**If failing → fix:** the flag has no effect → you didn't actually +rebuild after editing the YAML. `graph_meta().built_at` should be a +fresh timestamp. + +### Red flags for Phase 5 + +- `cross_service_calls_total = 0` on a multi-service project → + resolver couldn't bind any caller to its target. Check that all + services are under one indexing root, and check microservice + detection (top-level dirs under `LANCEDB_MCP_PROJECT_ROOT`). + +--- + +## Phase 6 — Semantic search (2 items) + +### 6.1 ☐ Concept query returns relevant chunks + +**Verification prompt:** + +> Pick a behavioural concept that exists in your code (e.g. +> "operator assignment", "session lifecycle", "retry on Kafka send"). +> Call `codebase_search({"query":"","limit":8, +> "exclude_roles":["DTO","ENTITY","CONFIG","OTHER"], +> "context_neighbors":1})`.
The top 3 hits should be in files you'd +> naturally point at for that concept. + +**Expected (calibration):** `query="how chat assigns on operator"` → +top hits include `DistributionService`, `OperatorSessionService`, +`JoinOperatorController` (the assignment chain). + +**If failing → fix:** top hits are DTOs / configs → you forgot +`exclude_roles`. Top hits are unrelated → embeddings are off (check +`SBERT_MODEL` and that `refresh_code_index` actually ran on the +current code). + +### 6.2 ☐ Identifier query benefits from `auto_hybrid` + +**Verification prompt:** + +> Pick a class your project defines (e.g. `DistributionChunkService`). +> Run two queries: with `auto_hybrid=false` (default) and with +> `auto_hybrid=true`. Report the top 3 hits from each. + +**Expected (calibration):** without auto_hybrid, top results are still +relevant but ranked lower; with auto_hybrid=true, the FTS+vector RRF +pushes the exact-name file to position 1. + +**If failing → fix:** auto_hybrid has no effect → `table=all` (it +requires a single table). Stick to `table=java`. + +### Red flags for Phase 6 + +- Chunk count from `codebase_search` is 0 on a known-good query → + LanceDB tables empty or wrong embedding model +- `graph_expand=true` returns more results than `=false` but they're + noise → expand depth too aggressive, set to 1 + +--- + +## Phase 7 — Brownfield overrides actually applied (3 items) + +> Run this phase **only after** you've explicitly added at least one +> brownfield annotation (or YAML override) to a real type in your +> project. Otherwise skip — there's nothing to verify. + +### 7.1 ☐ `@CodebaseRole` on a class flips the role + +**Verification prompt:** + +> Pick one class where you added `@CodebaseRole(CodebaseRoleKind.X)`. +> State the FQN and the X you set. Call +> `list_by_role({"role":"X","limit":500})` and confirm the FQN appears +> in the results. 
Then call `find_implementors({"name":""})` +> (or `codebase_search` if it's a concrete class) to confirm the +> annotation was picked up. + +**Expected (calibration):** N/A — fixture has no brownfield class +annotations applied. After you add one and rebuild, this verification +should pass. + +**If failing → fix:** the class doesn't appear → either the +annotation wasn't matched by simple name (typo? wrong package?), or +the graph wasn't rebuilt. `graph_meta` will show +`routes_from_brownfield_pct > 0` once any brownfield override is active. + +### 7.2 ☐ `@CodebaseRoute` on a method registers a route + +**Verification prompt:** + +> Pick one method where you added +> `@CodebaseRoute(framework=…, kind=…, path="…", method="…")`. State +> the path/method. Call +> `get_route_by_path({"microservice":"","path_template":"","method":""})`. +> The route should resolve. Then `find_route_handlers({"route_id":""})` +> — your method's enclosing type should appear. + +**Expected (calibration):** N/A — fixture has 0 brownfield routes. +After you add one, `graph_meta().routes_from_brownfield_pct > 0`. + +**If failing → fix:** route doesn't resolve → check that the +`@CodebaseRoute` annotation has the **correct enum values** (see +README §3b — `framework` and `kind` enums are case-sensitive +lowercase). Verify `path_template` matches the normalised servlet form +(e.g. `/users/{id}` → `/users/{}`). + +### 7.3 ☐ `@CodebaseClient` on a method creates an outbound HTTP_CALLS edge + +**Verification prompt:** + +> Pick one method where you added +> `@CodebaseClient(clientKind="rest_template", targetService="", path="…", method="…")`. +> Call `find_callees({"fqn_or_signature":"","depth":1,"limit":20})`. +> An outbound edge to a Route node (the target service's endpoint) +> should appear. Then call `graph_meta()` and report +> `http_clients_from_brownfield_pct` (should be > 0). + +**Expected (calibration):** N/A — fixture has 0 brownfield clients.
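The `get_route_by_path` calls in this phase take a `path_template` in the normalised form that 7.2 illustrates with `/users/{id}` → `/users/{}`. A minimal sketch of that single rule, useful for preparing the argument by hand; the function name is hypothetical, and the indexer's own normaliser may cover cases this does not (nested braces in regex constraints, trailing slashes):

```python
import re

# Matches one {placeholder} that contains no nested braces or slashes.
_PLACEHOLDER = re.compile(r"\{[^{}/]*\}")


def normalise_path_template(path):
    """Replace each named path variable with an empty {} placeholder."""
    return _PLACEHOLDER.sub("{}", path)


print(normalise_path_template("/users/{id}"))
print(normalise_path_template("/users/{id}/orders/{orderId}"))
```

If the lookup still returns nothing with the normalised template, the causes listed under each item's fix apply.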
+ +**If failing → fix:** edge doesn't appear → the most common cause is that the +target service / path doesn't have a `Route` node yet (the consumer +side has to be indexed too). Verify with +`get_route_by_path({"microservice":"","path_template":""})` — +if it returns nothing, index the target service alongside. + +### Red flags for Phase 7 + +- `routes_from_brownfield_pct = 0` after adding `@CodebaseRoute` → + the graph wasn't rebuilt, or the annotation didn't parse (typo in enum + value) +- Brownfield override "tightens" but doesn't override → this is + **intended behaviour** (partial overrides are non-destructive — see + README §"Caller-side brownfield overrides") + +--- + +## After completing all phases + +If everything is green: + +- Save your `.lancedb-mcp.yml` and any `@Codebase*` annotations to + source control. They're now part of your project's brownfield + contract. +- Pin the ontology version (9) somewhere in your README so future devs + know what shape of graph this MCP produces. +- Run `graph_meta` weekly (or after big refactors) and diff the + `counts` block — surprise drops are the leading indicator of broken + indexing. + +If something is red and the "→ fix" doesn't help: + +- Capture `graph_meta()` output, the last 30 lines of + `/tmp/verify_build.log`, and the failing prompt. File an issue against the repo with + those three artefacts; they're enough to diagnose 90% of cases. + +--- + +## Appendix — calibration source + +All calibration numbers in this checklist come from +`tests/bank-chat-system` indexed with `master @ d62b48c` (post PR-H1 +merge, ontology version 9). Reproduce with: + +```bash +cd /path/to/java-enterprise-codebase-rag +rm -rf /tmp/calib_kuzu +python build_ast_graph.py \ + --source-root tests/bank-chat-system \ + --kuzu-path /tmp/calib_kuzu --verbose +```
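After a rebuild like the one above, the `[pass3]` rates that check 4.3 asks for can be read out of the build log with a short script. A minimal sketch, assuming the log line keeps the shape of the calibration sample in this checklist; the function names are hypothetical and not part of the repo:

```python
import re

# Sample copied from the calibration output in this checklist
# (trailing fields after the third rate are omitted).
SAMPLE = (
    "[pass3] Call resolution: 800 sites, 77 chained phantoms (9.6%), "
    "294 unresolved callee (36.8%), 138 phantom receiver (17.2%)"
)

# "<count> <label> (<percent>%)", e.g. "77 chained phantoms (9.6%)".
_RATE = re.compile(r"(\d+) ([a-z ]+?) \((\d+(?:\.\d+)?)%\)")


def pass3_rates(log_text):
    """Return {label: percent} from the first [pass3] line, or {} if absent."""
    for line in log_text.splitlines():
        if "[pass3]" in line:
            return {label: float(pct) for _, label, pct in _RATE.findall(line)}
    return {}


def pass3_warnings(rates):
    """Apply the two thresholds named in Phase 4.3."""
    warnings = []
    if rates.get("unresolved callee", 0.0) > 50:
        warnings.append("unresolved callee > 50%: check library jars / generated sources")
    if rates.get("chained phantoms", 0.0) > 30:
        warnings.append("chained phantoms > 30%: fluent APIs the resolver can't follow")
    return warnings


rates = pass3_rates(SAMPLE)
print(rates)
print(pass3_warnings(rates))
```

On the calibration sample both thresholds pass, so the warning list is empty. To run it on your own build, pass the contents of the log file to `pass3_rates` instead of `SAMPLE`.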