diff --git a/AGENTS.md b/AGENTS.md
index 51147b0..b58e7f6 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -31,6 +31,8 @@ for tools that don't read `.cursor/rules/`.
   `plans/completed/CURSOR-PROMPTS-TIER1B.md`. The two CURSOR-PROMPTS files are kept as
   reference templates for future per-PR Cursor work.
 - Older completed: `propose/completed/CALL-GRAPH-PROPOSE.md`,
+  `propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md`,
+  `plans/completed/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md`,
   `propose/completed/MCP-API-V2-REDESIGN-PROPOSE.md` (four-tool MCP +
   `java-codebase-rag` CLI), `plans/completed/PLAN-CALL-GRAPH.md`,
   `plans/completed/PLAN-CAPABILITIES-MODEL.md`,
diff --git a/README.md b/README.md
index c125f2d..e9413a9 100644
--- a/README.md
+++ b/README.md
@@ -263,7 +263,7 @@ Edit `claude_desktop_config.json` (macOS: `~/Library/Application Support/Claude/
 |---|---|---|---|
 | `search` | Locate nodes by NL/code text. | `query: str`, `table: str="java"`, `hybrid: bool=False`, `limit: int=5`, `offset: int=0`, `path_contains: str \| None`, `filter: NodeFilter \| str \| None` | `{"query":"join operator flow","limit":5}` |
 | `find` | Locate nodes by structured filter. | `kind: "symbol"\|"route"\|"client"`, `filter: NodeFilter \| str`, `limit: int=25`, `offset: int=0` | `{"kind":"symbol","filter":{"role":"CONTROLLER"}}` |
-| `describe` | Full record + edge counts for one node. | `id: str` | `{"id":"sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest)"}` |
+| `describe` | Full record + edge counts for one node. For **type** symbols, `edge_summary` may also include composed dot-keys (`DECLARES.DECLARES_CLIENT`, `DECLARES.EXPOSES`); see [`docs/AGENT-GUIDE.md`](./docs/AGENT-GUIDE.md) (`describe`). | `id: str` | `{"id":"sym:com.bank.chat.core.api.ChatController#joinOperator(JoinOperatorRequest)"}` |
 | `neighbors` | One-hop walk. **Required**: `direction` and `edge_types`. | `ids: str \| list[str]`, `direction: "in"\|"out"`, `edge_types: list[str]`, `limit: int=25`, `offset: int=0`, `filter: NodeFilter \| str \| None` | `{"ids":"route:chat-core:POST:/chat/joinOperator","direction":"in","edge_types":["HTTP_CALLS","ASYNC_CALLS"]}` |

 **`NodeFilter` notes:**
diff --git a/docs/AGENT-GUIDE.md b/docs/AGENT-GUIDE.md
index e6a9ea5..ad2e937 100644
--- a/docs/AGENT-GUIDE.md
+++ b/docs/AGENT-GUIDE.md
@@ -66,7 +66,7 @@ When a method carries **`@CodebaseHttpRoute`** or **`@CodebaseHttpClient`** (inc
 **Workflow (GPS model):**

 1. **Locate** — `search` (natural language / fragment) or `find` (structured `NodeFilter`).
-2. **Inspect** — `describe(id)` to see the full record and `edge_summary` (per-edge-type in/out counts).
+2. **Inspect** — `describe(id)` to see the full record and `edge_summary` (per stored edge label `in`/`out` counts, plus optional composed dot-keys for type Symbols — see `describe` below).
 3. **Walk** — `neighbors` in a loop with explicit **`direction`** and **`edge_types`** until you have enough evidence. Multi-hop “trace” and “impact” are **your** reasoning, not a separate tool.

 ### Forced reasoning preamble (every tool call)
@@ -195,9 +195,18 @@ Exact allowed values for roles, capabilities, client kinds, etc. live in `java_o

 #### `describe`

-- **Purpose:** Full node payload + `edge_summary` (counts only: per edge type, `in` / `out`).
+- **Purpose:** Full node payload + `edge_summary`: `in` / `out` counts **per stored graph edge label** (what exists as edges in Kuzu). For **type** Symbols only (`class`, `interface`, `enum`, `record`, `annotation`), the same map may also include **describe-time composed** dot-keys — summaries of member edges, not stored labels — see the next bullets (`DECLARES.DECLARES_CLIENT`, `DECLARES.EXPOSES`); those keys are **not** valid in `neighbors(edge_types=…)`.
 - **Args:** `id` (symbol, route, or client id).

+**Composed `edge_summary` keys (type Symbols).** Keys use dot notation: `.`. Two are emitted today:
+
+- `DECLARES.DECLARES_CLIENT` — the type's methods declare brownfield HTTP clients (count is the number of `Client` rows reached through `DECLARES → DECLARES_CLIENT`). To enumerate them: `neighbors(ids=, direction="out", edge_types=["DECLARES"])` → for each method id, `neighbors(ids=, direction="out", edge_types=["DECLARES_CLIENT"])`.
+- `DECLARES.EXPOSES` — the type's methods expose routes. Same walk shape with `EXPOSES`.
+
+Composed keys are **read-only**: they cannot be passed to `neighbors(edge_types=…)` (the dot is not a valid `EdgeType` literal — the call fails with a Pydantic `ValidationError`). Use them as a hop affordance only.
+
+Note on counting semantics: composed counts measure **edge rows**, not distinct member methods. One method that declares multiple `Client` rows (e.g. a `rest_template` method with several call sites) contributes its full edge count to `DECLARES.DECLARES_CLIENT`. The "does this class have any clients?" predicate is answered by `count > 0`; the count itself is an affordance for how rich the downstream walk will be.
+
 #### `neighbors`

 - **Purpose:** One hop over explicit edge types; returns **edges** with attributes (`confidence`, `strategy`, `match`, …) and the **`other`** node.
diff --git a/kuzu_queries.py b/kuzu_queries.py
index a7210f4..6a4293f 100644
--- a/kuzu_queries.py
+++ b/kuzu_queries.py
@@ -597,6 +597,27 @@ def edge_counts_for(self, node_id: str) -> dict[str, dict[str, int]]:
             if int(dirs.get("in", 0)) > 0 or int(dirs.get("out", 0)) > 0
         }

+    def member_edge_rollup_for(self, type_id: str) -> dict[str, dict[str, int]]:
+        """2-hop DECLARES member edge counts for a type Symbol (describe-time only).
+
+        Keys use dot notation and are not stored graph edge labels.
+        """
+        params = {"id": type_id}
+        rollup: dict[str, dict[str, int]] = {}
+        for key, rel in (
+            ("DECLARES.DECLARES_CLIENT", "DECLARES_CLIENT"),
+            ("DECLARES.EXPOSES", "EXPOSES"),
+        ):
+            rows = self._rows(
+                f"MATCH (t:Symbol {{id: $id}})-[:DECLARES]->(m:Symbol)-[e:{rel}]->() "
+                "RETURN count(e) AS n",
+                params,
+            )
+            n = sum(int(r.get("n") or 0) for r in rows) if rows else 0
+            if n > 0:
+                rollup[key] = {"in": 0, "out": n}
+        return rollup
+
     def _scope_counts(self, column: str) -> dict[str, int]:
         """Generic helper: count resolved type symbols grouped by `column`.

diff --git a/mcp_v2.py b/mcp_v2.py
index 95515d8..36e9919 100644
--- a/mcp_v2.py
+++ b/mcp_v2.py
@@ -15,6 +15,9 @@
 from search_lancedb import TABLES, run_search

 DeclarationSymbolKind = Literal["class", "interface", "enum", "record", "annotation", "method", "constructor"]
+
+# Composed describe-time keys in edge_summary (e.g. DECLARES.DECLARES_CLIENT) are
+# intentionally not EdgeType literals — neighbors(edge_types=...) rejects them.
 EdgeType = Literal[
     "EXTENDS",
     "IMPLEMENTS",
@@ -34,6 +37,10 @@
 _st_lock = threading.Lock()
 _st_model: SentenceTransformer | None = None

+_TYPE_SYMBOL_KINDS_FOR_EDGE_ROLLUP = frozenset(
+    {"class", "interface", "enum", "record", "annotation"}
+)
+

 def _get_sentence_transformer(model_name: str, device: str | None) -> SentenceTransformer:
     global _st_model
@@ -115,7 +122,16 @@ class NodeRecord(BaseModel):
     kind: Literal["symbol", "route", "client"]
     fqn: str
     data: dict[str, Any] = Field(default_factory=dict)
-    edge_summary: dict[str, dict[str, int]] | None = None
+    edge_summary: dict[str, dict[str, int]] | None = Field(
+        default=None,
+        description=(
+            "Per graph edge label, in/out incident counts. For type Symbols (class, interface, "
+            "enum, record, annotation), may also include composed dot-keys "
+            "`DECLARES.DECLARES_CLIENT` and `DECLARES.EXPOSES`: 2-hop summaries "
+            "(DECLARES to member, then that edge) — edge-row counts, not EdgeType literals; "
+            "do not pass them to neighbors(edge_types=…)."
+        ),
+    )


 class Edge(BaseModel):
@@ -315,8 +331,13 @@ def _load_node_record(graph: KuzuGraph, node_id: str, kind: Literal["symbol", "r
     return rows[0]


-def _edge_summary_for_node(graph: KuzuGraph, node_id: str) -> dict[str, dict[str, int]]:
-    return graph.edge_counts_for(node_id)
+def _edge_summary_for_node(
+    graph: KuzuGraph, node_id: str, *, kind: str, row: dict[str, Any]
+) -> dict[str, dict[str, int]]:
+    summary = dict(graph.edge_counts_for(node_id))
+    if kind == "symbol" and str(row.get("kind") or "") in _TYPE_SYMBOL_KINDS_FOR_EDGE_ROLLUP:
+        summary.update(graph.member_edge_rollup_for(node_id))
+    return summary


 def _node_matches_filter(kind: Literal["symbol", "route", "client"], row: dict[str, Any], f: NodeFilter | None) -> bool:
@@ -478,7 +499,7 @@ def describe_v2(id: str, graph: KuzuGraph | None = None) -> DescribeOutput:
     if row is None:
         return DescribeOutput(success=False, message=f"No node found for `{id}`")
     ref = _node_ref_from_row(kind, row)
-    edge_summary = _edge_summary_for_node(g, id)
+    edge_summary = _edge_summary_for_node(g, id, kind=kind, row=row)
     return DescribeOutput(
         success=True,
         record=NodeRecord(id=ref.id, kind=kind, fqn=ref.fqn, data=row, edge_summary=edge_summary),
diff --git a/plans/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md b/plans/completed/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md
similarity index 89%
rename from plans/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md
rename to plans/completed/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md
index ddcf19b..906127a 100644
--- a/plans/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md
+++ b/plans/completed/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md
@@ -1,11 +1,11 @@
 # Plan: describe member edge rollup (`edge_summary` composed keys)

-Status: **active (planning)**. This plan implements
-[`propose/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md`](../propose/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md).
+Status: **complete** (PR-1 landed). Source propose:
+[`propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md`](../../propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md).

 Depends on: **none** for graph or indexer work (read-path only).

-**Coordinate with:** [`propose/DESCRIBE-OVERRIDE-ROLLUP-PROPOSE.md`](../propose/DESCRIBE-OVERRIDE-ROLLUP-PROPOSE.md) if both describe rollups land in the same release window. That propose extends the **same** `_edge_summary_for_node` hook for **method and constructor** symbols with different composed keys. Prefer **one** `kind` + `row` signature change and one composable `_edge_summary_for_node` body with **disjoint** branches (type rollup vs override rollup), or land two PRs in an order where the second PR only adds the method/constructor branch without reshaping the signature again.
+**Coordinate with:** [`propose/DESCRIBE-OVERRIDE-ROLLUP-PROPOSE.md`](../../propose/DESCRIBE-OVERRIDE-ROLLUP-PROPOSE.md) if both describe rollups land in the same release window. That propose extends the **same** `_edge_summary_for_node` hook for **method and constructor** symbols with different composed keys. Prefer **one** `kind` + `row` signature change and one composable `_edge_summary_for_node` body with **disjoint** branches (type rollup vs override rollup), or land two PRs in an order where the second PR only adds the method/constructor branch without reshaping the signature again.

 ## Goal

@@ -100,14 +100,14 @@ If the session graph ever lacks a row for scenario (1) or (4), **do not** relax

 ## Definition of done (PR-1)

-- [ ] `member_edge_rollup_for` exists and returns only positive-count composed keys.
-- [ ] `describe_v2` merges rollup for eligible type symbols only.
-- [ ] `neighbors_v2(..., edge_types=["DECLARES.DECLARES_CLIENT"])` still fails validation (same class of error as today for invalid literals).
-- [ ] Four tests above pass, e.g.
+- [x] `member_edge_rollup_for` exists and returns only positive-count composed keys.
+- [x] `describe_v2` merges rollup for eligible type symbols only.
+- [x] `neighbors_v2(..., edge_types=["DECLARES.DECLARES_CLIENT"])` still fails validation (same class of error as today for invalid literals).
+- [x] Four tests above pass, e.g.
   `.venv/bin/python -m pytest tests/test_mcp_v2_compose.py::test_describe_class_with_brownfield_clients_emits_composed_key tests/test_mcp_v2_compose.py::test_describe_controller_class_emits_composed_exposes tests/test_mcp_v2_compose.py::test_describe_method_symbol_no_composed_keys tests/test_mcp_v2_compose.py::test_describe_pojo_no_composed_keys -v`
   (adjust module path if tests land in `test_mcp_v2.py`), or run full `.venv/bin/python -m pytest tests -v`.
-- [ ] `.venv/bin/ruff check .` clean.
-- [ ] AGENT-GUIDE updated; README updated if the optional bullet is taken.
+- [x] `.venv/bin/ruff check .` clean.
+- [x] AGENT-GUIDE updated; README updated if the optional bullet is taken.

 ## Implementation step list

@@ -143,9 +143,9 @@ If the session graph ever lacks a row for scenario (1) or (4), **do not** relax

 ## Whole-plan done definition

-1. Merged PR satisfies **Definition of done (PR-1)**.
-2. Propose moved to `propose/completed/` when the PR lands (repo convention).
+1. **Definition of done (PR-1)** — satisfied (implementation landed).
+2. Propose archived at [`propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md`](../../propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md).

 ## Tracking

-- `PR-1`: _pending_
+- `PR-1`: **done** (code + docs + tests landed; propose in `propose/completed/`)
diff --git a/propose/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md b/propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md
similarity index 99%
rename from propose/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md
rename to propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md
index e8ced78..aa391ef 100644
--- a/propose/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md
+++ b/propose/completed/DESCRIBE-MEMBER-EDGE-ROLLUP-PROPOSE.md
@@ -1,6 +1,6 @@
 # DESCRIBE-MEMBER-EDGE-ROLLUP — Surface method-level `DECLARES_CLIENT` / `EXPOSES` in the class's `edge_summary`

-**Status**: under review (v2.1)
+**Status**: **completed** — landed as PR-1 (read-path rollup; see [`plans/completed/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md`](../../plans/completed/PLAN-DESCRIBE-MEMBER-EDGE-ROLLUP.md)).
 **Author**: Dmitriy Teriaev + Perplexity Computer
 **Date**: 2026-05-12

diff --git a/server.py b/server.py
index 19cf9c7..b8e2b80 100644
--- a/server.py
+++ b/server.py
@@ -19,7 +19,7 @@
 _COCOINDEX_TARGET = "java_index_flow_lancedb.py:JavaCodeIndexLance"
 _INSTRUCTIONS = (
     "Java codebase graph navigator (LanceDB + Kuzu). "
-    "Tools: search (NL/code locate), find (structured NodeFilter), describe (one node + edge counts), "
+    "Tools: search (NL/code locate), find (structured NodeFilter), describe (one node + edge_summary: stored edge-label counts and optional composed keys for type Symbols), "
     "neighbors (one hop; you MUST pass direction in|out AND edge_types list — no defaults). "
     "NodeFilter `filter` is a JSON object (preferred); a JSON-encoded string is also accepted as a fallback. "
     "Edge labels: EXTENDS, IMPLEMENTS, INJECTS, DECLARES, DECLARES_CLIENT, CALLS, EXPOSES, HTTP_CALLS, ASYNC_CALLS. "
@@ -328,7 +328,14 @@ async def find(
     ) -> mcp_v2.FindOutput:
         return await asyncio.to_thread(mcp_v2.find_v2, kind, filter, limit, offset, None)

-    @mcp.tool(name="describe", description="full record + edge counts for one node")
+    @mcp.tool(
+        name="describe",
+        description=(
+            "full record + edge_summary: in/out per stored edge label; "
+            "type Symbols may add composed keys DECLARES.DECLARES_CLIENT, DECLARES.EXPOSES "
+            "(describe-time 2-hop member summaries; not valid in neighbors edge_types)"
+        ),
+    )
     async def describe(
         id: str = Field(
             description=(
diff --git a/tests/test_mcp_v2.py b/tests/test_mcp_v2.py
index 9cea3b4..3474236 100644
--- a/tests/test_mcp_v2.py
+++ b/tests/test_mcp_v2.py
@@ -368,6 +368,17 @@ def test_neighbors_invalid_edge_type_rejected(kuzu_graph) -> None:
         neighbors_v2(mid, direction="in", edge_types=["calls"], graph=kuzu_graph)


+def test_neighbors_rejects_composed_edge_summary_key(kuzu_graph) -> None:
+    mid = _method_id_with_calls(kuzu_graph, "out")
+    with pytest.raises(ValidationError):
+        neighbors_v2(
+            mid,
+            direction="out",
+            edge_types=["DECLARES.DECLARES_CLIENT"],
+            graph=kuzu_graph,
+        )
+
+
 async def test_find_invalid_kind_rejected(mcp_server) -> None:
     with pytest.raises(ToolError, match="Input should be"):
         await mcp_server.call_tool("find", {"kind": "method", "filter": {}})
diff --git a/tests/test_mcp_v2_compose.py b/tests/test_mcp_v2_compose.py
index cde4c76..f0d5476 100644
--- a/tests/test_mcp_v2_compose.py
+++ b/tests/test_mcp_v2_compose.py
@@ -2,7 +2,12 @@

 from typing import Any

-from mcp_v2 import describe_v2, neighbors_v2, search_v2
+from mcp_v2 import (
+    _TYPE_SYMBOL_KINDS_FOR_EDGE_ROLLUP,
+    describe_v2,
+    neighbors_v2,
+    search_v2,
+)
 from server import _graph_meta_output


@@ -18,6 +23,8 @@
     "INJECTS",
 )

+_ROLLUP_TYPE_KINDS = sorted(_TYPE_SYMBOL_KINDS_FOR_EDGE_ROLLUP)
+

 def _controller_method_with_calls(kuzu_graph) -> tuple[str, str]:
     rows = kuzu_graph._rows(  # noqa: SLF001
@@ -185,3 +192,69 @@ def test_search_describe_neighbors_chain_end_to_end(kuzu_graph, monkeypatch) ->
     neighbors_out = neighbors_v2(top_symbol_id, direction="in", edge_types=["CALLS"], graph=kuzu_graph)
     assert neighbors_out.success is True
     assert neighbors_out.results
+
+
+def test_describe_class_with_brownfield_clients_emits_composed_key(kuzu_graph) -> None:
+    rows = kuzu_graph._rows(  # noqa: SLF001
+        "MATCH (t:Symbol)-[:DECLARES]->(m:Symbol)-[e:DECLARES_CLIENT]->(:Client) "
+        "WHERE t.kind IN $kinds "
+        "RETURN t.id AS id, count(e) AS n ORDER BY n DESC LIMIT 1",
+        {"kinds": _ROLLUP_TYPE_KINDS},
+    )
+    assert rows
+    tid = str(rows[0]["id"])
+    n = int(rows[0]["n"] or 0)
+    assert n >= 1
+    out = describe_v2(tid, graph=kuzu_graph)
+    assert out.success is True
+    assert out.record is not None
+    assert out.record.edge_summary is not None
+    assert out.record.edge_summary["DECLARES.DECLARES_CLIENT"]["out"] == n
+
+
+def test_describe_controller_class_emits_composed_exposes(kuzu_graph) -> None:
+    rows = kuzu_graph._rows(  # noqa: SLF001
+        "MATCH (t:Symbol)-[:DECLARES]->(m:Symbol)-[e:EXPOSES]->(:Route) "
+        "WHERE t.role = 'CONTROLLER' AND t.kind = 'class' "
+        "RETURN t.id AS id, count(e) AS n ORDER BY n DESC LIMIT 1",
+    )
+    assert rows
+    tid = str(rows[0]["id"])
+    n = int(rows[0]["n"] or 0)
+    assert n >= 1
+    out = describe_v2(tid, graph=kuzu_graph)
+    assert out.success is True
+    assert out.record is not None
+    assert out.record.edge_summary is not None
+    assert out.record.edge_summary["DECLARES.EXPOSES"]["out"] == n
+
+
+def test_describe_method_symbol_no_composed_keys(kuzu_graph) -> None:
+    node_id, _ = _controller_method_with_calls(kuzu_graph)
+    out = describe_v2(node_id, graph=kuzu_graph)
+    assert out.success is True
+    assert out.record is not None
+    assert out.record.edge_summary is not None
+    es = out.record.edge_summary
+    assert "DECLARES.DECLARES_CLIENT" not in es
+    assert "DECLARES.EXPOSES" not in es
+
+
+def test_describe_pojo_no_composed_keys(kuzu_graph) -> None:
+    rows = kuzu_graph._rows(  # noqa: SLF001
+        "MATCH (t:Symbol)-[:DECLARES]->(:Symbol) "
+        "WHERE t.kind IN $kinds "
+        "AND NOT EXISTS { MATCH (t)-[:DECLARES]->(m:Symbol)-[:DECLARES_CLIENT]->() } "
+        "AND NOT EXISTS { MATCH (t)-[:DECLARES]->(m:Symbol)-[:EXPOSES]->() } "
+        "RETURN t.id AS id LIMIT 1",
+        {"kinds": _ROLLUP_TYPE_KINDS},
+    )
+    assert rows
+    tid = str(rows[0]["id"])
+    out = describe_v2(tid, graph=kuzu_graph)
+    assert out.success is True
+    assert out.record is not None
+    assert out.record.edge_summary is not None
+    es = out.record.edge_summary
+    assert "DECLARES.DECLARES_CLIENT" not in es
+    assert "DECLARES.EXPOSES" not in es
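
Reviewer note on the counting semantics documented in the AGENT-GUIDE hunk: the sketch below is a toy illustration, not code from this PR. It assumes a hypothetical in-memory edge list (`EDGES`) and hypothetical helpers (`out_neighbors`, `composed_out_count`, `members_with`) standing in for the Kuzu graph and the `neighbors` tool. It shows why a composed key such as `DECLARES.DECLARES_CLIENT` can exceed the number of member methods: it counts edge rows reached through the two-hop walk.

```python
# Hypothetical toy graph: (source id, edge label, target id). All ids are made up.
EDGES = [
    ("sym:PaymentGateway", "DECLARES", "sym:PaymentGateway#charge()"),
    ("sym:PaymentGateway", "DECLARES", "sym:PaymentGateway#refund()"),
    ("sym:PaymentGateway", "DECLARES", "sym:PaymentGateway#toString()"),
    ("sym:PaymentGateway#charge()", "DECLARES_CLIENT", "client:a"),
    ("sym:PaymentGateway#charge()", "DECLARES_CLIENT", "client:b"),
    ("sym:PaymentGateway#refund()", "DECLARES_CLIENT", "client:c"),
]


def out_neighbors(node_id: str, label: str) -> list[str]:
    """One outgoing hop over a single stored edge label (stand-in for `neighbors`)."""
    return [dst for src, lbl, dst in EDGES if src == node_id and lbl == label]


def composed_out_count(type_id: str, member_label: str) -> int:
    """Edge-row count for DECLARES -> member_label, mirroring the composed key."""
    return sum(
        len(out_neighbors(member, member_label))
        for member in out_neighbors(type_id, "DECLARES")
    )


def members_with(type_id: str, member_label: str) -> list[str]:
    """Distinct member methods that carry at least one member_label edge."""
    return [
        member
        for member in out_neighbors(type_id, "DECLARES")
        if out_neighbors(member, member_label)
    ]


edge_rows = composed_out_count("sym:PaymentGateway", "DECLARES_CLIENT")
methods = members_with("sym:PaymentGateway", "DECLARES_CLIENT")
print(edge_rows, len(methods))
```

In this toy graph the composed count is three edge rows spread over two methods, so a caller that needs a per-method answer still has to do the second hop itself.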