Copy the block between <!-- BEGIN and <!-- END into your project's AGENTS.md, CLAUDE.md, or equivalent. It is self-contained: five MCP tools, shared NodeFilter, edge taxonomy, tool-selection rules, and recovery moves.
Tools: search, find, describe, neighbors, resolve.
Node kinds: Symbol (types and methods), Route (HTTP and messaging entry points), Client (outbound HTTP call sites), Producer (outbound async call sites).
Indexed content: Java production sources plus SQL and YAML (use search table: java, sql, yaml, or all).
Ontology: 15 — if results look structurally wrong or empty across tools, the index may be missing, stale, or built with a different ontology_version; you cannot re-index via MCP — ask the operator to rebuild.
Responses: On success, search, find, describe, neighbors, and resolve may include two top-level fields: hints_structured (≤5 suggested next-tool calls) and advisories (≤5 pure informational strings). Each hints_structured entry has tool, args, actionable, label, and reason. actionable=true means you can call the tool directly with args; actionable=false means partial/advisory — fill missing values or use as guidance. reason explains why the hint was emitted. advisories carry context education (fuzzy strategy warnings, role collision explanations, etc.) with no tool call suggestion. For search/find, echoed limit/offset. Hints are advisory; ignore them when success is false.
Use this MCP when you need whole-codebase structure: callers/callees, route handlers, HTTP/async seams, clients/producers, or fuzzy entry points for a concept.
Do not use this MCP when the answer is already in the open file, or for third-party library trivia from training data alone. Prefer the smallest call that answers the question.
- Test files, build files, CI/deploy — read those files directly in the repo.
- Reflection and dynamic dispatch —
CALLSis static analysis only; the resolved set is a lower bound. - Proof of absence — an empty result may mean the project was not indexed, the wrong
table, or a filter that matches nothing. - Git history — use
git log/git blamefor "who changed" / "when".
When MCP disagrees with the open file, the file wins; treat the mismatch as a likely stale or incomplete index.
If a method has any of these (including plural containers @CodebaseHttpRoutes, @CodebaseAsyncRoutes, @CodebaseHttpClients, @CodebaseProducers), that annotation is the only source for the facets it declares — framework inference on the same method is not merged for that axis:
| Annotation | Declares | Framework rows bypassed (examples) |
|---|---|---|
@CodebaseHttpRoute |
inbound HTTP path / verb | Spring MVC/WebFlux mapping annotations |
@CodebaseAsyncRoute |
inbound async topic / route | @KafkaListener, @RabbitListener, … |
@CodebaseHttpClient |
outbound HTTP client call site | @FeignClient method mappings, RestTemplate-style inference |
@CodebaseProducer |
outbound async producer call site | KafkaTemplate / StreamBridge producer inference |
Trust the indexed brownfield row over a framework-only reading of the source.
- Locate —
resolvefor identifier-shaped strings;searchfor natural language or code fragments;findfor structuredNodeFilterdiscovery. - Inspect —
describe(id)for the full record andedge_summary(per-labelin/outcounts). - Walk —
neighborsin a loop with explicitdirectionandedge_types. Multi-hop traces are your reasoning, not a separate tool.
Before each MCP call, output one short line:
Q-class: <semantic | structured | inspect | walk>
Pick: <search|find|describe|neighbors|resolve> Why: <≤8 words>
Then use real JSON shapes (see below). If the call fails or returns nothing useful, use the Recovery playbook — do not thrash.
Use these strings verbatim in neighbors(..., edge_types=[...]).
Stored edges (one hop):
| Group | Edge types | Semantics |
|---|---|---|
| Type wiring | EXTENDS, IMPLEMENTS, INJECTS |
in = who depends on this type; out = what this type depends on |
| Containment | DECLARES, DECLARES_CLIENT, DECLARES_PRODUCER |
in = owner; out = owned member, client, or producer |
| Method overrides | OVERRIDES |
Subtype method → supertype declaration (same signature, one IMPLEMENTS/EXTENDS hop) |
| Method calls | CALLS |
in = callers; out = callees (method Symbol → method Symbol only) |
| Service boundary | EXPOSES |
method Symbol → Route (handler exposes route) |
| Cross-service | HTTP_CALLS, ASYNC_CALLS |
HTTP_CALLS: Client → Route; ASYNC_CALLS: Producer → Route |
Composed edges — type Symbol origin (direction="out" only):
| Edge type | Meaning |
|---|---|
DECLARES.DECLARES_CLIENT |
Members' HTTP clients in one hop |
DECLARES.DECLARES_PRODUCER |
Members' async producers in one hop |
DECLARES.EXPOSES |
Members' exposed routes in one hop |
Composed edges — non-static method Symbol origin (direction="out" only):
| Edge type | Meaning |
|---|---|
OVERRIDDEN_BY |
Concrete overrider methods (stored [:OVERRIDES] dispatch hop) |
OVERRIDDEN_BY.DECLARES_CLIENT |
Clients declared on overriders (via_id = overrider method) |
OVERRIDDEN_BY.DECLARES_PRODUCER |
Producers on overriders |
OVERRIDDEN_BY.EXPOSES |
Routes exposed by overriders |
Stored vs virtual direction (base override axis): neighbors(decl_id, "out", ["OVERRIDDEN_BY"]) returns the same overrider method ids as neighbors(decl_id, "in", ["OVERRIDES"]) on the same declaration method. Prefer the dot-key when describe.edge_summary advertises OVERRIDDEN_BY.
Do not mix DECLARES.* and OVERRIDDEN_BY.* in one edge_types list on a single origin id — the handler rejects the whole request (only one axis applies per node).
describe edge_summary counts for OVERRIDDEN_BY* use the same stored [:OVERRIDES] dispatch hop as neighbors (ontology 13+ graphs with materialized override edges). Rebuild the index if counts look wrong or dot-key walks return fewer rows than advertised.
Pagination: default neighbors limit=25 slices the merged flat + composed edge list. When edge_summary shows a large out count for a composed key, raise limit (and use offset) or issue separate calls per key.
| Param | Right | Wrong |
|---|---|---|
edge_types |
["CALLS"] |
"CALLS" or "[\"CALLS\"]" |
exclude_roles |
["DTO","OTHER"] |
stringified array |
filter |
{"role":"CONTROLLER"} |
nested string JSON |
ids (batch) |
["sym:…","sym:…"] |
comma-joined string |
Omit keys you do not need. Empty string "" is often a real filter that matches nothing.
| Kind | Prefixes |
|---|---|
| Symbol | sym: |
| Route | route: or r: |
| Client | client: or c: |
| Producer | producer: or p: |
Use exact ids from search.symbol_id, find, describe, or neighbors.other.id.
<package>.<Type>[.<NestedType>]#<methodName>(<SimpleType1>,<SimpleType2>,…)
Simple types in parentheses; generics erased (List<String> → List). No spaces after commas. No-arg: (). Constructor: #<init>(…).
direction:"in"or"out"(no default).edge_types: non-empty list from the taxonomy above.
Optional filter applies to each other endpoint; populated fields must match that neighbor's kind (strict frame).
Batching: multiple ids expand first; limit/offset slice the merged edge list — raise limit when batching.
Mixed flat + composed edge_types: flat edges are listed before composed edges, then pagination applies. A small limit with e.g. ["DECLARES","DECLARES.DECLARES_CLIENT"] may return only member Symbols and no Clients — use the dot-key alone to list terminals.
For find, filter is required — {} means no predicates (all nodes of that kind, subject to pagination).
| Keys | Applies to |
|---|---|
microservice, module, source_layer |
All kinds (source_layer mainly client / producer) |
role, exclude_roles, annotation, capability, fqn_prefix, symbol_kind, symbol_kinds |
symbol |
http_method, path_prefix, framework |
route |
client_kind, target_service, target_path_prefix, http_method |
client |
producer_kind, topic_prefix |
producer |
http_method filters HTTP verbs on routes (declared method) and on clients (outbound call method). Not applicable to symbol rows.
Strict frame: one populated field → one stored attribute for that kind. Unknown keys or inapplicable populated fields → success=false with a teaching message. No wildcards in fqn_prefix, path_prefix, or target_path_prefix (* / ? rejected) — use search(query=…) for ranked text instead. search.query is opaque text, not a DSL.
Input: FQN or suffix, sym:/route:/client:/producer: id, METHOD /path, route path template, client target_service, target_service + path prefix, or producer topic.
hint_kind: optional symbol | route | client | producer. When omitted, generators run across all four kinds (narrow with hint_kind when you know the kind).
status |
Action |
|---|---|
one |
describe(id=node.id) |
many |
pick from candidates (reason, score, NodeRef), then describe |
none |
fall back to search(query=…) for NL/fuzzy discovery |
Prefer resolve → describe(id=…) over describe(fqn=…) when an FQN may collide (describe(fqn=…) returns the first row).
microservice — service where the node lives. target_service (clients only) — remote service being called. source_layer (clients/producers) — which extraction layer produced the row (builtin, layer_a_meta, layer_b_ann, layer_c_source, layer_b_fqn, …). role (symbols only) — architectural stereotype (CONTROLLER, SERVICE, …).
| User asks… | First step | Typical follow-up |
|---|---|---|
| Identifier-shaped string | resolve (+ optional hint_kind) |
describe → neighbors |
| Fuzzy / NL "where is X" | search |
describe → neighbors |
| All controllers in service S | find(kind="symbol", filter={"microservice":"S","role":"CONTROLLER"}) |
neighbors CALLS / EXPOSES |
| Interfaces in service S | find(..., filter={"microservice":"S","symbol_kind":"interface"}) |
neighbors / describe |
| HTTP / messaging entry points | find(kind="route", filter={…}) |
describe |
| Outbound HTTP clients | find(kind="client", filter={…}) |
neighbors(..., "out", ["HTTP_CALLS"]) from client id |
| Outbound async producers | find(kind="producer", filter={…}) |
neighbors(..., "out", ["ASYNC_CALLS"]) from producer id |
| Who calls method M? | id via resolve / find / search |
neighbors(ids, "in", ["CALLS"]) |
| What does M call? | same | neighbors(ids, "out", ["CALLS"]) |
| Who hits this route? | route id | neighbors(ids, "in", ["HTTP_CALLS","ASYNC_CALLS","EXPOSES"]) |
| Handler for route | route id | neighbors(ids, "in", ["EXPOSES"]) |
| Who implements interface T? | type symbol id | neighbors(ids, "in", ["IMPLEMENTS"]) |
| Who injects type T? | type symbol id | neighbors(ids, "in", ["INJECTS"]) |
| Impact / "what breaks if I change X"? | no magic tool | loop neighbors in with CALLS, INJECTS, … until bounded |
Rules of thumb:
- Structure beats vector for exact questions — use
resolve/find+neighbors, notsearch, for "who calls …". - Vector beats structure for fuzzy discovery —
searchfirst, then pivot todescribe/neighbors.
Ranked chunk retrieval. Args: query, table (java|sql|yaml|all, default java), hybrid (bool), limit (default 5), offset, path_contains, optional filter (symbol-applicable NodeFilter only).
Exact listing for one kind. Args: kind (symbol|route|client|producer), filter (required object), limit (default 25), offset. Returns NodeRef rows (id, kind, fqn, microservice, module, role on symbols, symbol_kind on symbols).
Full node + edge_summary. Args: id (any kind) or fqn (symbol only; id wins).
- Stored keys — counts for edges that exist in the graph.
- Type symbols (
class,interface,enum,record,annotation) may add composed keysDECLARES.DECLARES_CLIENT,DECLARES.DECLARES_PRODUCER,DECLARES.EXPOSES— navigable vianeighborswith those dot-keys (outonly). - Method symbols may add virtual keys
OVERRIDDEN_BY,OVERRIDDEN_BY.DECLARES_*,OVERRIDDEN_BY.EXPOSES(navigable vianeighborson non-static method origins,outonly), plus anOVERRIDESrow merging stored[:OVERRIDES]incident counts with the rollup dispatch-up count (maxper direction). Rollup and dot-key traversal both use stored[:OVERRIDES]for the dispatch hop. Static methods and constructors do not get override-axis keys.
Composed counts are edge rows, not distinct methods; count > 0 means "there is something to walk".
Identifier lookup; three statuses above. Args: identifier, optional hint_kind.
One hop. Args: ids (string or array), direction, edge_types, limit (default 25), offset, optional filter on the other node, optional edge_filter (edge_types must be exactly ['CALLS'] — no composed dot-keys or second stored label; fail-loud otherwise).
Multiple origin ids: each id loads the full CALLS stream (or generic hop) in list order; offset/limit apply to the concatenated edge list (ids[0] edges first, then ids[1], …), not global source order across origins — a large first origin can leave no rows for later ids within the same page. High fan-out methods are slow; prefer one id per call or a smaller limit. Hints: TPL_NEIGHBORS_CALLS_HIGH_FANOUT / TPL_NEIGHBORS_CALLS_HAS_UNRESOLVED fire only for a single origin id (multi-origin CALLS skips those nudges).
Returns edges with attrs (confidence, strategy, match, … on cross-service edges) and other node.
Cross-service edges (HTTP_CALLS, ASYNC_CALLS): read attrs.confidence and attrs.match — low confidence or unresolved/phantom/ambiguous means treat as a resolver signal, not ground truth.
CALLS edges: source-ordered (call_site_line, call_site_byte). After ontology 15 PR-3, true receiver-failure sites are not on CALLS — they are UnresolvedCallSite nodes (reason: chained_receiver or phantom_unresolved_receiver; ids use the ucs: prefix, other.kind=unresolved_call_site — not describable via describe(id=…)). UNRESOLVED_AT is graph storage only (not in EDGE_SCHEMA / neighbors edge_types). attrs.resolved=false on remaining CALLS rows means known-receiver-external (JDK/Spring) callees, not receiver failure. include_unresolved=True (CALLS + direction=out only) interleaves unresolved sites with resolved CALLS (row_kind discriminator); mutually exclusive with edge_filter. dedup_calls=True collapses identical (origin, callee) CALLS to one row with call_site_lines. filter + edge_filter together load the ordered CALLS stream then apply callee NodeFilter in Python — expect higher latency on hot methods than edge_filter alone. Optional edge_filter projects before pagination: min_confidence; include_strategies / exclude_strategies (mutually exclusive); callee_declaring_role, callee_declaring_roles, exclude_callee_declaring_roles (["OTHER"] also drops known-external rows). filter.role filters the neighbor method (usually OTHER), not the callee stereotype — use edge_filter.callee_declaring_role for repository/service hops. exclude_external applies to find_callers / find_callees only (FQN-prefix); trim JDK noise on neighbors CALLS via edge_filter. Accessor noise: role excludes help; getter/setter heuristics in propose/completed/AGENT-SKILLS-AND-COMMANDS-PROPOSE.md /mini-map.
Roles (filter.role / exclude_roles): CONTROLLER, SERVICE, REPOSITORY, COMPONENT, CONFIG, ENTITY, CLIENT, MAPPER, DTO, OTHER.
Capabilities (filter.capability): MESSAGE_LISTENER, MESSAGE_PRODUCER, HTTP_CLIENT, SCHEDULED_TASK, EXCEPTION_HANDLER.
Symbol kinds (symbol_kind / symbol_kinds): class, interface, enum, record, annotation, method, constructor.
Route framework (examples on stored routes): spring_mvc, webflux, kafka, rabbitmq, jms, stream, codebase_async_route, …
Client kinds: feign_method, rest_template, web_client.
Producer kinds: kafka_send, stream_bridge_send.
HTTP call attrs.match / async attrs.match: cross_service, intra_service, ambiguous, phantom, unresolved.
| Symptom | Likely cause | Fix |
|---|---|---|
neighbors validation error |
Missing direction or edge_types |
Add both explicitly |
Empty neighbors |
Wrong edge type or direction | Read describe.edge_summary; EXPOSES is Symbol→Route; OVERRIDES is method↔method only; HTTP_CALLS starts from Client ids |
| Cannot find symbol | Wrong id or empty index | resolve / search; try find with fqn_prefix |
find returns too much |
Broad filter | Add microservice, fqn_prefix, path_prefix, topic_prefix, … |
| Route not found | Path mismatch | find(kind="route", filter={"path_prefix":…}) |
Empty search |
Wrong table, no index, or chunk miss |
Try table="all"; find with fqn_prefix; read source files directly |
| Empty results across several tools | Index missing, stale, or wrong project | You cannot rebuild the index via MCP — ask the operator; meanwhile use open files / rg |
| Result vs open file disagree | Stale or partial index | Trust the file; say index may be stale |
| Mixed composed families on one id | DECLARES.* + OVERRIDDEN_BY.* together |
Split calls — type keys need a type id; override keys need a method id |
| Override dot-key on type / DECLARES on method | Wrong Symbol origin for axis | Read describe.edge_summary; use the axis that matches the node kind |
After two failed attempts on the same intent, stop and report tool name, args, and response snippet.
These patterns combine the five tools above. Use the decision tree to pick the right starting tool.
| Intent | Tool chain |
|---|---|
| Natural-language "find X" | search(query=…, limit=8) → describe(top_hit.symbol_id) |
| List controllers in service S | find(kind="symbol", filter={microservice:"S", role:"CONTROLLER"}) |
| List routes in service S | find(kind="route", filter={microservice:"S"}) |
| List clients in service S | find(kind="client", filter={microservice:"S"}, limit=100) |
| List producers in service S | find(kind="producer", filter={microservice:"S"}, limit=100) |
| Who calls method M | resolve → neighbors(ids, "in", ["CALLS"]) |
| What does M call | resolve → neighbors(ids, "out", ["CALLS"]) |
| Handler for route R | neighbors(route_id, "in", ["EXPOSES"]) |
| All inbound to route R | neighbors(route_id, "in", ["HTTP_CALLS","ASYNC_CALLS","EXPOSES"]) |
| Implementors of interface T | neighbors(type_id, "in", ["IMPLEMENTS"]) |
| Where is T injected | neighbors(type_id, "in", ["INJECTS"]) |
| Impact of changing X | resolve → describe → bounded neighbors(in, ["CALLS","INJECTS","IMPLEMENTS","EXTENDS"]) depth ≤2 |
searchwith a short query; pick 1–3 hits with strongsymbol_id/ role fit.describeon the chosen id; readedge_summary.- Walk with
neighborsusing smalledge_typessets (e.g.CALLSout, orEXPOSES/ cross-service edges for boundaries). - Stop when you can answer; do not prefetch unrelated subgraphs.
When MCP behaviour, NodeFilter keys, edge labels, or node kinds change:
- Update this file's copy block and bump the Ontology: line to match
ast_java.ONTOLOGY_VERSION. - Update the five-tool cheat sheet in
README.mdand the "Driving the MCP from an agent" bullet there. - If enrichment semantics changed, add a "Re-index required" callout in
docs/CONFIGURATION.md§3.