Commit ff84fb9

feat: filter frame counters + tool descriptions (PR-FRAME-3) (#133)
* add fail-loud counters and refresh mcp filter tool descriptions

  Co-authored-by: Cursor <cursoragent@cursor.com>

* move mcp filter frame propose and plan to completed

  Land the 3-PR migration docs under propose/completed/ and plans/completed/; point mcp_v2 revisit trigger at the new path.

  Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent 3576769 commit ff84fb9

7 files changed

Lines changed: 140 additions & 34 deletions

docs/AGENT-GUIDE.md

Lines changed: 17 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -139,21 +139,32 @@ Use `search` to recover the stored symbol id / FQN if you only have a simple nam
139139

140140
Omitting them is a validation error. This is intentional: it prevents huge accidental fan-out.
141141

142-
Optional `filter` applies to the **other** endpoint node (same `NodeFilter` keys as `find`; keys irrelevant to that node kind are ignored).
142+
Optional `filter` applies to the **other** endpoint node using the same `NodeFilter` schema as `find`. Populated fields must be applicable to that neighbor's kind; mixed-kind neighborhoods fail loud on the first neighbor row whose kind rejects the filter.
143143

144144
#### E. Shared `NodeFilter` (for `find`, `search.filter`, `neighbors.filter`)
145145

146-
One object shape everywhere. **For `find`, `filter` is required**: use at least one key (e.g. `{"microservice":"chat-core"}`) or `{}` is valid Pydantic but may be expensive at scale; prefer narrowing keys.
146+
One object shape everywhere. **For `find`, `filter` is required**: `{}` is valid (no predicates; returns the full kind up to pagination) but may be expensive at scale; prefer narrowing keys when you can.
147147

148148
| Keys | Applies to |
149149
| ---- | ---------- |
150150
| `microservice`, `module`, `source_layer` | All kinds (`source_layer` mainly **client**: `builtin` / brownfield) |
151-
| `role`, `exclude_roles`, `annotation`, `capability`, `fqn_prefix`, `symbol_kind`, `symbol_kinds` | **symbol** (ignored for route/client) |
151+
| `role`, `exclude_roles`, `annotation`, `capability`, `fqn_prefix`, `symbol_kind`, `symbol_kinds` | **symbol** |
152152
| `http_method`, `path_prefix`, `framework` | **route** |
153153
| `client_kind`, `target_service`, `target_path_prefix`, `http_method` | **client** |
154154

155155
The same `http_method` key filters HTTP verbs on **routes** (server-side declared method) and on **clients** (caller-side method on the outbound call). It is not applicable to **symbol** rows.
156156

157+
### Strict frame contract (`find`, `search.filter`, `neighbors.filter`)
158+
159+
- **One populated field, one stored attribute** for the evaluated kind. Inapplicable fields or `extra` keys are never silently dropped: the tool returns `success=false` with a teaching message (and applicable-field list for cross-kind mistakes).
160+
- **No wildcards** in `fqn_prefix`, `path_prefix`, or `target_path_prefix` (`*` / `?` rejected). Use `search(query=…)` for ranked text discovery instead.
161+
- **`search.query` is not a DSL** — treat it as opaque text scored against the index. Structured predicates belong in `find`.
162+
- **`neighbors` filters neighbor rows by kind** — the first neighbor whose kind rejects the filter fails the whole call (no per-row silent skip).
163+
164+
### Identifier resolution (pre-`resolve`)
165+
166+
For identifier-shaped lookups without a stable graph id or exact symbol FQN, use **`search(query=…)`** for ranked candidates, then **`describe(id=…)`** (or `describe(fqn=…)` when you have an exact FQN) on each promising row until you confirm the right node. A dedicated **`resolve`** tool is planned separately; until it ships, this multi-call pattern is the supported fallback.
167+
157168
**`source_layer` vs `role`:** On **Client** nodes, `source_layer` records which brownfield or built-in layer produced the client declaration (`builtin`, `layer_a_meta`, `layer_b_ann`, `layer_c_source`, `layer_b_fqn`, …). On **Symbol** nodes, `role` is the inferred architectural stereotype (`CONTROLLER`, `SERVICE`, `REPO`, …). They answer different questions; names stay distinct.
158169

159170
**`target_service` vs `microservice`:** `microservice` is the service **where the node lives** (home service / owning module). `target_service` (clients only) is the **remote service being called**. A client in `operator-api` may list `partner-api` as `target_service`.
@@ -190,7 +201,8 @@ Exact allowed values for roles, capabilities, client kinds, etc. live in `java_o
190201
#### `search`
191202

192203
- **Purpose:** Locate chunk hits by NL or code fragment; use `symbol_id` when present to jump into the graph.
193-
- **Args:** `query`, `table` (`java`|`sql`|`yaml`|`all`, default `java`), `hybrid` (bool), `limit` (default 5), `offset`, `path_contains`, optional `filter` (`NodeFilter` — post-filters hits using symbol-oriented fields on the row).
204+
- **Args:** `query`, `table` (`java`|`sql`|`yaml`|`all`, default `java`), `hybrid` (bool), `limit` (default 5), `offset`, `path_contains`, optional `filter` (`NodeFilter` — post-filters hits using **symbol-applicable** fields only).
205+
- **Strict frame:** `query` is opaque ranked text (no structured DSL inside the string). Optional `filter` follows the same strict applicability and wildcard rules as `find` for symbols.
194206
- **Tip:** For behaviour questions, narrow noise with `filter.exclude_roles` or `filter.role` when you know the shape you want.
195207

196208
#### `find`
@@ -202,7 +214,7 @@ Exact allowed values for roles, capabilities, client kinds, etc. live in `java_o
202214
#### `describe`
203215

204216
- **Purpose:** Full node payload + `edge_summary`: `in` / `out` counts **per stored graph edge label** (what exists as edges in Kuzu). For **type** Symbols only (`class`, `interface`, `enum`, `record`, `annotation`), the same map may also include **describe-time composed** dot-keys — summaries of member edges, not stored labels — see the next bullets (`DECLARES.DECLARES_CLIENT`, `DECLARES.EXPOSES`); those keys are **not** valid in `neighbors(edge_types=…)`. For **method** Symbols, the map may include **override-axis** virtual keys (`OVERRIDDEN_BY`, `OVERRIDDEN_BY.DECLARES_CLIENT`, `OVERRIDDEN_BY.EXPOSES`, `OVERRIDES`); see **Override-axis keys (method Symbols)** below — also not `EdgeType` literals.
205-
- **Args:** `id` (symbol, route, or client id).
217+
- **Args:** `id` (symbol, route, or client id) or **`fqn`** (exact symbol FQN when you do not have the graph id). When both are set, `id` wins. For ambiguous identifiers without an exact id/FQN, see **Identifier resolution (pre-`resolve`)** above.
206218

207219
**Composed `edge_summary` keys (type Symbols).** Keys use dot notation: `<parent_relation>.<projected_relation>`. Two are emitted today:
208220
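
The `search` then `describe` fallback from the identifier resolution section can be sketched as a small loop. This is an assumption-heavy illustration: `resolve_by_search`, the `confirm` callback, the stub functions, and the dict shapes are all hypothetical; only the `query`, `limit`, and `id` argument names and the `symbol_id` hit field come from the guide.

```python
from typing import Any, Callable, Optional

Node = dict[str, Any]


def resolve_by_search(
    name: str,
    search: Callable[..., list[Node]],
    describe: Callable[..., Node],
    confirm: Callable[[Node], bool],
    limit: int = 5,
) -> Optional[Node]:
    """Walk ranked hits and describe each candidate until one is confirmed."""
    for hit in search(query=name, limit=limit):
        symbol_id = hit.get("symbol_id")
        if not symbol_id:
            continue  # chunk hit with no graph node attached
        node = describe(id=symbol_id)
        if confirm(node):
            return node
    return None


# In-memory stand-ins for the MCP tools (hypothetical data).
FAKE_NODES = {
    "sym:1": {"id": "sym:1", "fqn": "com.example.chat.ChatService"},
    "sym:2": {"id": "sym:2", "fqn": "com.example.billing.ChatServiceClient"},
}


def fake_search(query: str, limit: int = 5) -> list[Node]:
    hits = [{"symbol_id": None}]
    hits += [{"symbol_id": i} for i, n in FAKE_NODES.items() if query in n["fqn"]]
    return hits[:limit]


def fake_describe(id: str) -> Node:
    return FAKE_NODES[id]


node = resolve_by_search(
    "ChatService", fake_search, fake_describe,
    confirm=lambda n: n["fqn"].endswith(".ChatService"),
)
print(node)
```

The caller supplies `confirm` because only the caller knows what "the right node" means for a given lookup.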

mcp_v2.py

Lines changed: 28 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -8,7 +8,7 @@
88
prefix fields (``fqn_prefix``, ``path_prefix``, ``target_path_prefix``) reject ``*``
99
and ``?`` — see ``_validate_no_wildcards``.
1010
11-
Revisit trigger (``propose/MCP-FILTER-FRAME-PROPOSE.md`` section 3.4.6)
11+
Revisit trigger (``propose/completed/MCP-FILTER-FRAME-PROPOSE.md`` section 3.4.6)
1212
--------------------------------------------------------------
1313
If **three** legitimate issue-tracker workflows appear within **six months** of frame
1414
lock where the strict frame has no clean analog under ``search``, deferred
@@ -19,6 +19,7 @@
1919

2020
import json
2121
import os
22+
import sys
2223
from pathlib import Path
2324
import threading
2425
from typing import Annotated, Any, Literal
@@ -60,6 +61,23 @@
6061

6162
_METHOD_SYMBOL_KINDS_FOR_OVERRIDE_ROLLUP = frozenset({"method"})
6263

64+
_fail_loud_counts: dict[str, int] = {}
65+
_fail_loud_lock = threading.Lock()
66+
67+
68+
def _log_fail_loud(category: str) -> None:
69+
"""Increment process-local fail-loud counter and emit one stderr line (PR-FRAME-3)."""
70+
with _fail_loud_lock:
71+
_fail_loud_counts[category] = _fail_loud_counts.get(category, 0) + 1
72+
n = _fail_loud_counts[category]
73+
print(f"[filter-frame] fail-loud category={category} count={n}", file=sys.stderr, flush=True)
74+
75+
76+
def filter_frame_counters() -> dict[str, int]:
77+
"""Snapshot of fail-loud counts (tests / local diagnostics; not an MCP tool)."""
78+
with _fail_loud_lock:
79+
return dict(_fail_loud_counts)
80+
6381

6482
def _get_sentence_transformer(model_name: str, device: str | None) -> SentenceTransformer:
6583
global _st_model
@@ -529,10 +547,13 @@ def search_v2(
529547
else raw_filter
530548
)
531549
except ValidationError as exc:
550+
_log_fail_loud("unknown_key")
532551
return SearchOutput(success=False, message=_filter_validation_error_message(exc))
533552
if nf and (err := _nodefilter_applicability_error("symbol", nf)):
553+
_log_fail_loud("applicability")
534554
return SearchOutput(success=False, message=err)
535555
if nf and (err := _validate_no_wildcards(nf)):
556+
_log_fail_loud("wildcard")
536557
return SearchOutput(success=False, message=err)
537558
model_name = resolved_sbert_model_for_process_env(SBERT_MODEL)
538559
device = os.environ.get("SBERT_DEVICE") or None
@@ -585,10 +606,13 @@ def find_v2(
585606
try:
586607
nf = NodeFilter.model_validate(raw_filter) if not isinstance(raw_filter, NodeFilter) else raw_filter
587608
except ValidationError as exc:
609+
_log_fail_loud("unknown_key")
588610
return FindOutput(success=False, message=_filter_validation_error_message(exc))
589611
if err := _nodefilter_applicability_error(kind, nf):
612+
_log_fail_loud("applicability")
590613
return FindOutput(success=False, message=err)
591614
if err := _validate_no_wildcards(nf):
615+
_log_fail_loud("wildcard")
592616
return FindOutput(success=False, message=err)
593617
if kind == "symbol":
594618
where, params = _symbol_where_from_filter(nf)
@@ -700,8 +724,10 @@ def neighbors_v2(
700724
else raw_filter
701725
)
702726
except ValidationError as exc:
727+
_log_fail_loud("unknown_key")
703728
return NeighborsOutput(success=False, message=_filter_validation_error_message(exc))
704729
if nf and (err := _validate_no_wildcards(nf)):
730+
_log_fail_loud("wildcard")
705731
return NeighborsOutput(success=False, message=err)
706732
origins = [ids] if isinstance(ids, str) else list(ids)
707733
results: list[Edge] = []
@@ -739,6 +765,7 @@ def neighbors_v2(
739765
if other_rec is None:
740766
continue
741767
if nf and (err := _nodefilter_applicability_error(other_kind, nf)):
768+
_log_fail_loud("applicability")
742769
return NeighborsOutput(success=False, message=err)
743770
if not _node_matches_filter(other_kind, other_rec, nf):
744771
continue
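
The counter helpers added in this file can be exercised on their own. The sketch below copies `_log_fail_loud` and `filter_frame_counters` from the hunk above; the diff view drops indentation, so placing the `print` outside the lock is an assumption. The `hammer` driver and the thread counts are hypothetical and exist only to show that the count stays consistent under concurrent calls.

```python
import sys
import threading

_fail_loud_counts: dict[str, int] = {}
_fail_loud_lock = threading.Lock()


def _log_fail_loud(category: str) -> None:
    """Increment the process-local counter and emit one stderr line."""
    with _fail_loud_lock:
        _fail_loud_counts[category] = _fail_loud_counts.get(category, 0) + 1
        n = _fail_loud_counts[category]
    # Assumption: the print sits outside the lock; the diff does not show indentation.
    print(f"[filter-frame] fail-loud category={category} count={n}", file=sys.stderr, flush=True)


def filter_frame_counters() -> dict[str, int]:
    """Return a copy of the counters so callers cannot mutate shared state."""
    with _fail_loud_lock:
        return dict(_fail_loud_counts)


def hammer(category: str, times: int) -> None:
    # Hypothetical driver: simulates repeated rejected filters.
    for _ in range(times):
        _log_fail_loud(category)


threads = [threading.Thread(target=hammer, args=("wildcard", 25)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

snapshot = filter_frame_counters()
print(snapshot)
```

Because the read-modify-write happens under the lock, four threads of 25 calls each should leave the `wildcard` counter at 100. The counters are process-local, so they reset on restart and are not shared across worker processes.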

plans/CURSOR-PROMPTS-MCP-FILTER-FRAME.md renamed to plans/completed/CURSOR-PROMPTS-MCP-FILTER-FRAME.md

Lines changed: 4 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,9 @@
11
# Cursor task prompts — MCP Filter Frame (PR-FRAME-1 → PR-FRAME-3)
22

3-
Status: **active**. One prompt per PR; each prompt is self-contained.
3+
Status: **completed** — reference template for the landed PR-FRAME-1 → PR-FRAME-3
4+
sequence. Plan:
5+
[`PLAN-MCP-FILTER-FRAME.md`](PLAN-MCP-FILTER-FRAME.md); propose:
6+
[`propose/completed/MCP-FILTER-FRAME-PROPOSE.md`](../../propose/completed/MCP-FILTER-FRAME-PROPOSE.md).
47

58
One prompt per PR. Each is **self-contained**: copy the prompt verbatim
69
into Cursor, attach the files listed in its `@-files` block, and let

plans/PLAN-MCP-FILTER-FRAME.md renamed to plans/completed/PLAN-MCP-FILTER-FRAME.md

Lines changed: 8 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -1,8 +1,10 @@
11
# Plan: MCP Filter Frame — typed query language migration
22

3-
Status: **active (planning)**. This plan implements
4-
[`propose/MCP-FILTER-FRAME-PROPOSE.md`](../propose/MCP-FILTER-FRAME-PROPOSE.md)
5-
as a 3-PR sequence. This file is plan-only and does not implement code.
3+
Status: **completed — shipped via PR-FRAME-1 → PR-FRAME-3** (merged 2026-05).
4+
This plan implemented
5+
[`propose/completed/MCP-FILTER-FRAME-PROPOSE.md`](../../propose/completed/MCP-FILTER-FRAME-PROPOSE.md)
6+
as a 3-PR sequence. Per-PR Cursor prompts:
7+
[`CURSOR-PROMPTS-MCP-FILTER-FRAME.md`](CURSOR-PROMPTS-MCP-FILTER-FRAME.md).
68

79
Depends on: **none** (builds on already-shipped #122: `extra="forbid"` +
810
per-kind applicability validation).
@@ -395,6 +397,6 @@ Landing order: **FRAME-1 → FRAME-2 → FRAME-3**.
395397

396398
# Tracking
397399

398-
- `PR-FRAME-1`: _pending_
399-
- `PR-FRAME-2`: _pending_
400-
- `PR-FRAME-3`: _pending_
400+
- `PR-FRAME-1`: merged
401+
- `PR-FRAME-2`: merged
402+
- `PR-FRAME-3`: merged (#133)

propose/MCP-FILTER-FRAME-PROPOSE.md renamed to propose/completed/MCP-FILTER-FRAME-PROPOSE.md

Lines changed: 7 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,12 @@
11
# MCP Filter Frame — typed query language with one named carve-out
22

3-
**Status**: draft
3+
**Status**: **completed — shipped via PR-FRAME-1 → PR-FRAME-3** (merged 2026-05).
4+
Moved to `propose/completed/` once the 3-PR migration landed. The
5+
implementable plan lives at
6+
[`plans/completed/PLAN-MCP-FILTER-FRAME.md`](../../plans/completed/PLAN-MCP-FILTER-FRAME.md);
7+
per-PR Cursor prompts at
8+
[`plans/completed/CURSOR-PROMPTS-MCP-FILTER-FRAME.md`](../../plans/completed/CURSOR-PROMPTS-MCP-FILTER-FRAME.md).
9+
410
**Author**: Dmitriy Teriaev + Computer
511
**Date**: 2026-05-14
612
**Issue**: #117

server.py

Lines changed: 45 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -329,7 +329,18 @@ async def run_refresh_pipeline(*, quiet: bool = False) -> RefreshIndexOutput:
329329
def create_mcp_server() -> FastMCP:
330330
mcp = FastMCP("java-codebase-rag", instructions=_INSTRUCTIONS)
331331

332-
@mcp.tool(name="search", description="locate nodes by NL/code text")
332+
@mcp.tool(
333+
name="search",
334+
description=(
335+
"Ranked chunk retrieval: `query` is opaque text (natural language or code fragments); "
336+
"results are score-ranked, not boolean-matched. Optional `filter` uses the same NodeFilter "
337+
"schema as `find` but only **symbol-applicable** fields apply (strict frame). Wildcards "
338+
"(`*`, `?`) in prefix fields are rejected—use ranked `query` text instead. There is **no** "
339+
"structured DSL inside `query`; structured predicates belong in `find`. For "
340+
"identifier-shaped lookups without an exact symbol id/FQN, use `search(query=…)` and "
341+
"`describe` on promising candidates until a dedicated `resolve` tool exists."
342+
),
343+
)
333344
async def search(
334345
query: str = Field(description="Search query"),
335346
table: Literal["java", "sql", "yaml", "all"] = Field(
@@ -349,9 +360,8 @@ async def search(
349360
filter: dict[str, Any] | str | None = Field(
350361
default=None,
351362
description=(
352-
"Optional NodeFilter (symbol applicability). Unknown keys and populated non-symbol fields return success=false "
353-
"with a teaching message. "
354-
"Prefer a JSON object; a JSON-encoded string is accepted as a fallback."
363+
"Optional NodeFilter post-filter on symbol-oriented hit rows. Unknown keys or populated fields not "
364+
"applicable to symbols return success=false. Prefer a JSON object; a JSON-encoded string is accepted."
355365
),
356366
),
357367
) -> mcp_v2.SearchOutput:
@@ -367,7 +377,17 @@ async def search(
367377
None,
368378
)
369379

370-
@mcp.tool(name="find", description="locate nodes by structured filter")
380+
@mcp.tool(
381+
name="find",
382+
description=(
383+
"Exact structured listing for one node kind. Per-kind applicable fields: **symbol** — "
384+
"microservice, module, role, exclude_roles, annotation, capability, fqn_prefix, symbol_kind, symbol_kinds; "
385+
"**route** — microservice, module, http_method, path_prefix, framework; **client** — microservice, module, "
386+
"source_layer, client_kind, target_service, target_path_prefix, http_method. "
387+
"Wildcards in prefix fields are rejected. An empty filter (`{}`) or `filter=None` means no predicate (all nodes of "
388+
"that kind; use pagination). Unknown keys or inapplicable populated fields return success=false."
389+
),
390+
)
371391
async def find(
372392
kind: Literal["symbol", "route", "client"] = Field(
373393
description=(
@@ -378,10 +398,8 @@ async def find(
378398
filter: dict[str, Any] | str = Field(
379399
...,
380400
description=(
381-
"Required NodeFilter (shared schema, strict extras). Unknown keys and populated fields not applicable to "
382-
"the selected kind return success=false with a teaching message. Symbol filters also support symbol_kind "
383-
"and symbol_kinds. "
384-
"Prefer a JSON object; a JSON-encoded string is accepted as a fallback."
401+
"Required NodeFilter dict (extra keys forbidden). Fields must be applicable to `kind`. "
402+
"Prefer a JSON object; a JSON-encoded string is accepted."
385403
),
386404
),
387405
limit: int = Field(default=25, ge=1, le=500, description="Max nodes to return"),
@@ -392,12 +410,13 @@ async def find(
392410
@mcp.tool(
393411
name="describe",
394412
description=(
395-
"full record + edge_summary: in/out per stored edge label; "
396-
"type Symbols may add composed keys DECLARES.DECLARES_CLIENT, DECLARES.EXPOSES "
397-
"(describe-time 2-hop member summaries; not valid in neighbors edge_types); "
398-
"method Symbols may add override-axis virtual keys OVERRIDDEN_BY, "
399-
"OVERRIDDEN_BY.DECLARES_CLIENT, OVERRIDDEN_BY.EXPOSES, OVERRIDES (same restriction). "
400-
"Pass id for any node kind, or fqn as an alternative identifier for Symbol nodes only."
413+
"Full node record plus `edge_summary` (in/out counts per stored edge label). Type Symbols may add "
414+
"describe-time composed keys such as DECLARES.DECLARES_CLIENT and DECLARES.EXPOSES; method Symbols may "
415+
"add override-axis virtual keys (OVERRIDDEN_BY, OVERRIDDEN_BY.DECLARES_CLIENT, OVERRIDDEN_BY.EXPOSES, "
416+
"OVERRIDES). Those dot-keys are read-only summaries—not valid `neighbors(edge_types=…)` values. "
417+
"Pass `id` for any kind, or exact `fqn` for Symbol lookup (`id` wins when both are set). "
418+
"For identifier-shaped lookups without an exact id/FQN, use `search(query=…)` then `describe` per candidate "
419+
"until `resolve` ships."
401420
),
402421
)
403422
async def describe(
@@ -416,7 +435,15 @@ async def describe(
416435
) -> mcp_v2.DescribeOutput:
417436
return await asyncio.to_thread(mcp_v2.describe_v2, id, fqn, None)
418437

419-
@mcp.tool(name="neighbors", description="one-hop walk; REQUIRED direction + edge_types")
438+
@mcp.tool(
439+
name="neighbors",
440+
description=(
441+
"One-hop graph walk: **direction** (`in` | `out`) and non-empty **edge_types** are required. "
442+
"Optional `filter` applies to each neighbor endpoint row; populated fields must be applicable to that "
443+
"neighbor's kind—mixed-kind result sets fail on the first inapplicable neighbor (strict frame). "
444+
"Wildcards in prefix fields are rejected. Unknown NodeFilter keys return success=false."
445+
),
446+
)
420447
async def neighbors(
421448
ids: str | list[str] = Field(description="Origin symbol/route/client id, or list for batch"),
422449
direction: Literal["in", "out"] = Field(
@@ -440,10 +467,8 @@ async def neighbors(
440467
filter: dict[str, Any] | str | None = Field(
441468
default=None,
442469
description=(
443-
"Optional NodeFilter applied to the other endpoint of each edge. Unknown keys and populated fields not "
444-
"applicable to an evaluated neighbor kind return success=false with a teaching message. For mixed "
445-
"neighbor kinds, evaluation fails on the first inapplicable row. "
446-
"Prefer a JSON object; a JSON-encoded string is accepted as a fallback."
470+
"Optional NodeFilter on the neighbor node. Same applicability rules as `find` for that node's kind. "
471+
"Prefer a JSON object; a JSON-encoded string is accepted."
447472
),
448473
),
449474
) -> mcp_v2.NeighborsOutput:
