Commit bb5a8df

add cross-service resolution mode flag for pass6 matching (#30)
* add cross-service resolution mode flag for pass6 matching

  Introduce `cross_service_resolution` (auto|brownfield_only), persist it in GraphMeta, and gate pass6 cross-service promotion so brownfield-only mode keeps only fully brownfield-sourced edges.

  Co-authored-by: Cursor <cursoragent@cursor.com>

* moved completed plan

* split suppressed pass6 debug examples by channel

  Keep brownfield-only suppression counts unchanged but log separate first-five examples for HTTP and async to make pass6 debugging clearer.

  Co-authored-by: Cursor <cursoragent@cursor.com>

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
1 parent b5eb607 commit bb5a8df

12 files changed

Lines changed: 435 additions & 17 deletions

.cursor/rules/agent-workflow.mdc

Lines changed: 3 additions & 3 deletions
@@ -66,9 +66,9 @@ When you're given a per-PR task prompt from `plans/CURSOR-PROMPTS-*.md`:
   `java_ontology.py`. Don't sprinkle role / capability / client-kind /
   strategy / match string literals across other modules.
 - Schema changes that affect the Lance index or Kuzu graph need a
-  matching update to the README "Re-index required" callout. Bump
-  `ontology_version` when enrichment semantics change. The current
-  version is **7**.
+  matching update to the README "Re-index required" callout. Bump
+  `ontology_version` when enrichment semantics change. The current
+  version is **8**.
 - Brownfield is a first-class surface: any new auto-detection
   (route, role, capability, http client, async producer) must
   compose with the matching `BrownfieldOverrides` layer. Last writer

.cursor/rules/project-overview.mdc

Lines changed: 3 additions & 2 deletions
@@ -20,8 +20,9 @@ when needed.
 
 - `README.md` — feature surface, env vars, ranking, capabilities,
   tool list, "Re-index required" callouts. The current
-  `ontology_version` is **7** (HTTP_CALLS / ASYNC_CALLS caller
-  edges + brownfield client/producer composition). Earlier
+  `ontology_version` is **8** (HTTP_CALLS / ASYNC_CALLS caller
+  edges + brownfield client/producer composition + cross-service
+  resolution mode on GraphMeta). Earlier
   ontology bumps are described inline in the README's
   callouts list.
 - `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and per-file

AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -8,8 +8,8 @@ for tools that don't read `.cursor/rules/`.
 ## Where to look
 
 - `README.md` — feature surface, env vars, ranking, capabilities,
-  tool list, "Re-index required" callouts. **`ontology_version` is
-  currently 7.**
+  tool list, "Re-index required" callouts. **`ontology_version` is
+  currently 8.**
 - `CODEBASE_REQUIREMENTS.md` — Java-repo assumptions and tuning map.
 - `propose/` and `plans/` (plus their `completed/` subdirs) —
   in-flight scope and the rationale behind current design.

README.md

Lines changed: 19 additions & 1 deletion
@@ -100,7 +100,9 @@ Resolution order for `microservice`:
 > 4. **`ontology_version` 7** adds caller-side edge extraction (`HTTP_CALLS`, `ASYNC_CALLS`) and
 > brownfield caller composition (`http_client_overrides`, `async_producer_overrides`,
 > `@CodebaseClient`, `@CodebaseProducer`) — rebuild Kuzu after upgrading.
-> rebuild the Kuzu graph (`build_ast_graph.py` or `refresh_code_index`).
+> 5. **`ontology_version` 8** adds `GraphMeta.cross_service_resolution` (from
+> `cross_service_resolution` in `.lancedb-mcp.yml`) — rebuild the Kuzu graph
+> (`build_ast_graph.py` or `refresh_code_index`) after upgrading.
 >
 > Any index built before these changes must be rebuilt via
 > `cocoindex update ... --full-reprocess -f` or `refresh_code_index`. Until
@@ -289,6 +291,22 @@ route_overrides:
289291
```
290292
291293
Unknown `framework` / `kind` strings are dropped with a stderr warning.
294+
295+
**Cross-service resolution mode** — optional top-level key in the same file:
296+
297+
```yaml
298+
cross_service_resolution: auto # default when omitted
299+
# cross_service_resolution: brownfield_only
300+
```
+
+With `brownfield_only`, pass 6 does not promote auto-detected call sites to
+`cross_service` matches: only edges where both the caller strategy and every
+matched route’s `source_layer` come from brownfield (`@CodebaseRoute`,
+`@CodebaseClient`, YAML overrides, meta-annotation closure, or FQN maps) stay
+`cross_service`. Everything else that would have been a cross-service match
+becomes `unresolved`. `intra_service`, `phantom`, and `ambiguous` behaviour is
+unchanged. Unknown values log a warning and behave like `auto`.
+
 Resolution order for each method mirrors role brownfield: built-in extraction,
 then annotation map, then meta-annotation closure (same `collect_annotation_meta_chain`
 index as roles — see `plans/completed/PLAN-BROWNFIELD-ROLE-OVERRIDES-design-fixes.md`),
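
The `brownfield_only` rule described in the README text above can be illustrated with a small standalone sketch. The `Route` dataclass, the `resolve_outcome` helper, and the use of `"builtin"` as a caller strategy are illustrative assumptions, not part of the commit; the layer names are copied from `_BROWNFIELD_LAYERS` in `build_ast_graph.py`, where the actual gating happens inside `pass6_match_edges`.

```python
from dataclasses import dataclass

# Layer names copied from `_BROWNFIELD_LAYERS` in this commit.
BROWNFIELD_LAYERS = frozenset({
    "layer_c_source", "layer_b_ann", "layer_b_fqn", "layer_a_meta",
})


@dataclass
class Route:
    """Hypothetical stand-in for a matched route row."""
    source_layer: str = "builtin"


def resolve_outcome(mode: str, outcome: str, call_strategy: str,
                    candidates: list[Route]) -> str:
    """Downgrade auto-detected cross-service matches in brownfield_only mode."""
    if mode != "brownfield_only" or outcome != "cross_service":
        # intra_service, phantom and ambiguous outcomes pass through untouched.
        return outcome
    caller_ok = call_strategy in BROWNFIELD_LAYERS
    routes_ok = bool(candidates) and all(
        c.source_layer in BROWNFIELD_LAYERS for c in candidates
    )
    return "cross_service" if (caller_ok and routes_ok) else "unresolved"


declared = [Route("layer_b_ann")]   # route declared through a brownfield layer
detected = [Route("builtin")]       # route found by built-in extraction

print(resolve_outcome("brownfield_only", "cross_service", "layer_c_source", declared))
print(resolve_outcome("brownfield_only", "cross_service", "layer_c_source", detected))
print(resolve_outcome("auto", "cross_service", "builtin", detected))
```

Only the first call keeps its `cross_service` outcome under `brownfield_only`; the second is downgraded because one matched route is auto-detected, and the third shows that `auto` mode leaves the outcome alone.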

ast_java.py

Lines changed: 3 additions & 2 deletions
@@ -70,8 +70,9 @@
     "EqualsAndHashCode", "ToString",
 })
 
-# Phase 5: HTTP_CALLS + ASYNC_CALLS (B2b); bumps whenever extraction / enrichment semantics change.
-ONTOLOGY_VERSION = 7
+# Phase 5: HTTP_CALLS + ASYNC_CALLS (B2b); Phase 6: cross-service resolution mode on GraphMeta.
+# Bumps whenever extraction / enrichment semantics change.
+ONTOLOGY_VERSION = 8
 
 ROLE_ANNOTATIONS: dict[str, str] = {
     # Spring Web

build_ast_graph.py

Lines changed: 84 additions & 3 deletions
@@ -51,6 +51,7 @@
     parse_java,
 )
 from graph_enrich import (
+    _load_config_cross_service_resolution,
     collect_annotation_meta_chain,
     load_brownfield_overrides,
     microservice_for_path,
@@ -254,6 +255,7 @@ class GraphTables:
     parse_errors: int = 0
     skipped_files: int = 0
     pass3_skipped_cross_service: int = 0
+    cross_service_resolution: str = "auto"
 
 
 # ---------- file walk (see `path_filtering.iter_java_source_files`) ----------
@@ -1276,6 +1278,7 @@ def pass4_routes(
         prs = str(source_root.resolve())
     except OSError:
         prs = str(source_root)
+    tables.cross_service_resolution = _load_config_cross_service_resolution(prs)
     meta_chain = collect_annotation_meta_chain(prs)
 
     for ast in asts.values():
@@ -1411,6 +1414,7 @@ def pass5_imperative_edges(
         prs = str(source_root.resolve())
     except OSError:
         prs = str(source_root)
+    tables.cross_service_resolution = _load_config_cross_service_resolution(prs)
     meta_chain = collect_annotation_meta_chain(prs)
     routes_by_id = {r.id: r for r in tables.routes_rows}
     existing_route_ids = set(routes_by_id)
@@ -1648,6 +1652,30 @@ def _match_call_edge(
     return "cross_service", candidates
 
 
+_BROWNFIELD_LAYERS = frozenset({
+    "layer_c_source",
+    "layer_b_ann",
+    "layer_b_fqn",
+    "layer_a_meta",
+})
+
+
+def _is_brownfield_sourced(
+    call_strategy: str,
+    candidates: list[RouteRow],
+) -> bool:
+    """Both sides must come from brownfield layers for an edge to count as
+    authoritative under brownfield_only mode."""
+    if not candidates:
+        return False
+    if call_strategy not in _BROWNFIELD_LAYERS:
+        return False
+    return all(
+        getattr(c, "source_layer", "builtin") in _BROWNFIELD_LAYERS
+        for c in candidates
+    )
+
+
 def pass6_match_edges(
     tables: GraphTables,
     *,
@@ -1670,6 +1698,11 @@ def pass6_match_edges(
     tables.call_edge_stats.async_calls_match_breakdown.clear()
     tables.call_edge_stats.cross_service_calls_total = 0
 
+    brownfield_only = tables.cross_service_resolution == "brownfield_only"
+    suppressed_auto_cross_http: list[str] = []
+    suppressed_auto_cross_async: list[str] = []
+    suppressed_auto_cross_count = 0
+
     def _micro_factor(member: MemberEntry | None) -> float:
         return 1.0 if (member and member.microservice) else 0.85
 
@@ -1679,10 +1712,18 @@ def _micro_factor(member: MemberEntry | None) -> float:
         member = member_by_id.get(row.symbol_id)
         base = row.confidence / max(1e-9, (0.3 * _micro_factor(member)))
         src_route = route_by_id.get(row.route_id)
+        # Declared Feign client methods use `http_consumer` routes; synthetic phantoms from
+        # imperative clients are `http_endpoint` even when `feign_name` is populated from
+        # `@CodebaseClient.targetService` / YAML hints — those must path-match like RestTemplate.
+        _feign_like = (
+            src_route is not None
+            and src_route.kind == "http_consumer"
+            and bool(src_route.feign_name)
+        )
         call = OutgoingCallDecl(
             method_fqn=f"{member.parent_fqn}#{member.decl.signature}" if member else "",
             method_sig=member.decl.signature if member else "",
-            client_kind="feign_method" if (src_route and src_route.feign_name) else "rest_template",
+            client_kind="feign_method" if _feign_like else "rest_template",
             channel="http",
             feign_target_name=src_route.feign_name if src_route else "",
             feign_target_url=src_route.feign_url if src_route else "",
@@ -1700,6 +1741,16 @@ def _micro_factor(member: MemberEntry | None) -> float:
             end_line=member.decl.end_line if member else 0,
         )
         outcome, candidates = _match_call_edge(call, all_routes, member.microservice if member else "")
+        if (
+            brownfield_only
+            and outcome == "cross_service"
+            and not _is_brownfield_sourced(row.strategy, candidates)
+        ):
+            outcome = "unresolved"
+            candidates = []
+            suppressed_auto_cross_count += 1
+            if len(suppressed_auto_cross_http) < 5:
+                suppressed_auto_cross_http.append(call.method_fqn)
         if outcome in VALID_CALL_MATCHES:
             row.match = outcome
             if outcome in ("cross_service", "intra_service") and len(candidates) == 1:
@@ -1736,6 +1787,16 @@ def _micro_factor(member: MemberEntry | None) -> float:
             end_line=member.decl.end_line if member else 0,
         )
         outcome, candidates = _match_call_edge(call, all_routes, member.microservice if member else "")
+        if (
+            brownfield_only
+            and outcome == "cross_service"
+            and not _is_brownfield_sourced(row.strategy, candidates)
+        ):
+            outcome = "unresolved"
+            candidates = []
+            suppressed_auto_cross_count += 1
+            if len(suppressed_auto_cross_async) < 5:
+                suppressed_auto_cross_async.append(call.method_fqn)
         if outcome in VALID_CALL_MATCHES:
             row.match = outcome
             if outcome in ("cross_service", "intra_service") and len(candidates) == 1:
@@ -1760,6 +1821,18 @@ def _micro_factor(member: MemberEntry | None) -> float:
         )
 
     if verbose:
+        if brownfield_only:
+            n_bf = tables.call_edge_stats.cross_service_calls_total
+            first_http = ", ".join(suppressed_auto_cross_http)
+            first_async = ", ".join(suppressed_auto_cross_async)
+            print(
+                f"[pass6] cross_service_resolution=brownfield_only:\n"
+                f" {n_bf} cross_service edges from brownfield layers,\n"
+                f" {suppressed_auto_cross_count} auto-cross-service candidates suppressed -> unresolved\n"
+                f" (first 5 http: {first_http})\n"
+                f" (first 5 async: {first_async})",
+                file=sys.stderr,
+            )
         print(
             f"[pass6] http_match={dict(sorted(tables.call_edge_stats.http_calls_match_breakdown.items()))}, "
             f"async_match={dict(sorted(tables.call_edge_stats.async_calls_match_breakdown.items()))}, "
@@ -1805,7 +1878,8 @@ def _micro_factor(member: MemberEntry | None) -> float:
         "http_calls_match_breakdown STRING, "
         "async_calls_match_breakdown STRING, "
         "cross_service_calls_total INT64, "
-        "pass3_skipped_cross_service INT64"
+        "pass3_skipped_cross_service INT64, "
+        "cross_service_resolution STRING"
         ")"
     )
 
@@ -1925,6 +1999,11 @@ def _write_nodes(
     meta_chain: dict[str, frozenset[str]] | None,
 ) -> None:
     overrides = load_brownfield_overrides(project_root)
+    try:
+        prs = str(project_root.resolve())
+    except OSError:
+        prs = str(project_root)
+    tables.cross_service_resolution = _load_config_cross_service_resolution(prs)
     mch = meta_chain
     # packages
     for pkg, pid in tables.packages.items():
@@ -2183,7 +2262,8 @@ def _write_meta(conn: kuzu.Connection, tables: GraphTables, source_root: Path) -
         "http_calls_match_breakdown: $http_calls_match_breakdown, "
         "async_calls_match_breakdown: $async_calls_match_breakdown, "
         "cross_service_calls_total: $cross_service_calls_total, "
-        "pass3_skipped_cross_service: $pass3_skipped_cross_service})",
+        "pass3_skipped_cross_service: $pass3_skipped_cross_service, "
+        "cross_service_resolution: $cross_service_resolution})",
         {
             "k": "graph",
             "ov": ONTOLOGY_VERSION,
@@ -2209,6 +2289,7 @@ def _write_meta(conn: kuzu.Connection, tables: GraphTables, source_root: Path) -
             "async_calls_match_breakdown": json.dumps(async_match),
             "cross_service_calls_total": int(call_stats.cross_service_calls_total),
             "pass3_skipped_cross_service": int(tables.pass3_skipped_cross_service),
+            "cross_service_resolution": str(tables.cross_service_resolution),
         },
     )
 

graph_enrich.py

Lines changed: 35 additions & 0 deletions
@@ -143,6 +143,41 @@ def _load_config_microservice_roots(project_root_str: str) -> tuple[str, ...]:
     return ()
 
 
+@lru_cache(maxsize=64)
+def _load_config_cross_service_resolution(project_root_str: str) -> str:
+    """Read `cross_service_resolution` from `.lancedb-mcp.yml` at project_root.
+
+    Returns "auto" or "brownfield_only". Defaults to "auto" when the key is absent
+    or the file is missing / malformed. Unknown values warn on stderr and fall back
+    to "auto".
+    """
+    root = Path(project_root_str)
+    for name in CONFIG_FILENAMES:
+        candidate = root / name
+        if not candidate.is_file():
+            continue
+        try:
+            import yaml  # PyYAML; already a transitive dep of cocoindex
+        except ImportError:
+            return "auto"
+        try:
+            data = yaml.safe_load(candidate.read_text(encoding="utf-8"))
+        except Exception:
+            return "auto"
+        if not isinstance(data, dict):
+            return "auto"
+        val = data.get("cross_service_resolution", "auto")
+        if val not in {"auto", "brownfield_only"}:
+            print(
+                f"[lancedb-mcp] cross_service_resolution: unknown value "
+                f"{val!r}, falling back to 'auto'",
+                file=sys.stderr,
+            )
+            return "auto"
+        return val
+    return "auto"
+
+
 def load_microservice_overrides(project_root: str | Path | None) -> tuple[str, ...]:
     """Combined override list (env var ++ config file).
 
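
The validation step of the loader above can be exercised without touching the filesystem. The sketch below is a hypothetical helper (`normalize_mode` is not in the commit) that takes already-parsed YAML content and applies the same fallback rules. One deliberate difference, stated as an assumption about desired behaviour: it tests membership against a tuple, so a non-scalar value such as a YAML list falls back to `auto`, whereas a set lookup on an unhashable value raises `TypeError`.

```python
import sys

VALID_MODES = ("auto", "brownfield_only")


def normalize_mode(data: object) -> str:
    """Hypothetical helper: validate parsed YAML and return a resolution mode."""
    if not isinstance(data, dict):
        # Empty file, top-level list, scalar, etc.
        return "auto"
    val = data.get("cross_service_resolution", "auto")
    # Tuple membership compares with ==, so lists and dicts are handled safely.
    if val not in VALID_MODES:
        print(
            f"cross_service_resolution: unknown value {val!r}, falling back to 'auto'",
            file=sys.stderr,
        )
        return "auto"
    return val


print(normalize_mode({"cross_service_resolution": "brownfield_only"}))
print(normalize_mode({"route_overrides": []}))
print(normalize_mode({"cross_service_resolution": ["brownfield_only"]}))
```

Because the committed loader is wrapped in `functools.lru_cache`, its result is memoized per project-root string for the life of the process. Calling `cache_clear()` on the decorated function is the standard way to force a re-read after the config file changes.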

kuzu_queries.py

Lines changed: 9 additions & 1 deletion
@@ -346,7 +346,8 @@ def meta(self) -> dict[str, Any]:
         "m.http_calls_match_breakdown AS http_calls_match_breakdown, "
         "m.async_calls_match_breakdown AS async_calls_match_breakdown, "
         "m.cross_service_calls_total AS cross_service_calls_total, "
-        "m.pass3_skipped_cross_service AS pass3_skipped_cross_service"
+        "m.pass3_skipped_cross_service AS pass3_skipped_cross_service, "
+        "m.cross_service_resolution AS cross_service_resolution"
     )
     _META_PRE_E3 = (
         "MATCH (m:GraphMeta) RETURN m.key AS key, m.ontology_version AS ontology_version, "
@@ -422,6 +423,7 @@ def meta(self) -> dict[str, Any]:
         async_calls_match_breakdown: dict[str, Any] = {}
         cross_service_calls_total = 0
         pass3_skipped_cross_service = 0
+        cross_service_resolution: str | None = None
         if meta_mode != "legacy":
             rfw_raw = row.get("routes_by_framework") or "{}"
             try:
@@ -478,6 +480,11 @@ def meta(self) -> dict[str, Any]:
                 async_calls_match_breakdown = {}
             cross_service_calls_total = int(row.get("cross_service_calls_total") or 0)
             pass3_skipped_cross_service = int(row.get("pass3_skipped_cross_service") or 0)
+            if meta_mode == "pr_e3":
+                raw_csr = row.get("cross_service_resolution")
+                cross_service_resolution = (
+                    str(raw_csr) if raw_csr not in (None, "") else None
+                )
         return {
             "ontology_version": int(row.get("ontology_version") or 0),
             "built_at": int(row.get("built_at") or 0),
@@ -502,6 +509,7 @@ def meta(self) -> dict[str, Any]:
             "async_calls_match_breakdown": async_calls_match_breakdown,
             "cross_service_calls_total": cross_service_calls_total,
             "pass3_skipped_cross_service": pass3_skipped_cross_service,
+            "cross_service_resolution": cross_service_resolution,
             "db_path": self.db_path,
         }
 
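
Since `meta()` now reports `cross_service_resolution` as `None` when the stored row has no usable value, a caller has to distinguish "mode not recorded" from an actual mode. The following is a minimal sketch of one way to do that. `describe_resolution` and its message text are hypothetical; only the dict keys and the version number 8 come from this commit.

```python
from typing import Any

ONTOLOGY_VERSION_WITH_MODE = 8  # value set in ast_java.py by this commit


def describe_resolution(meta: dict[str, Any]) -> str:
    """Hypothetical consumer of the dict returned by `meta()`."""
    mode = meta.get("cross_service_resolution")
    version = int(meta.get("ontology_version") or 0)
    if mode is None or version < ONTOLOGY_VERSION_WITH_MODE:
        # Graph was built without the new GraphMeta column.
        return "unknown (rebuild the graph to record the mode)"
    return str(mode)


print(describe_resolution({"ontology_version": 8, "cross_service_resolution": "brownfield_only"}))
print(describe_resolution({"ontology_version": 7, "cross_service_resolution": None}))
```

Treating a missing value as "unknown" rather than silently assuming `auto` keeps the rebuild requirement from the README callout visible to whoever reads the output.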

File renamed without changes.

server.py

Lines changed: 2 additions & 0 deletions
@@ -343,6 +343,7 @@ class GraphMetaOutput(BaseModel):
     http_calls_match_breakdown: dict[str, int] = Field(default_factory=dict)
     async_calls_match_breakdown: dict[str, int] = Field(default_factory=dict)
     cross_service_calls_total: int = 0
+    cross_service_resolution: str | None = None
     message: str | None = None
 
 
@@ -523,6 +524,7 @@ def _graph_meta_output() -> GraphMetaOutput:
         http_calls_match_breakdown={str(k): int(v) for k, v in (meta.get("http_calls_match_breakdown") or {}).items()},
         async_calls_match_breakdown={str(k): int(v) for k, v in (meta.get("async_calls_match_breakdown") or {}).items()},
         cross_service_calls_total=int(meta.get("cross_service_calls_total") or 0),
+        cross_service_resolution=meta.get("cross_service_resolution"),
     )
 
 
