Skip to content

feat(mcp): EdgeFilter on neighbors_v2#182

Merged
HumanBean17 merged 3 commits into
masterfrom
feat/calls-noise-edge-filter
May 19, 2026
Merged

feat(mcp): EdgeFilter on neighbors_v2#182
HumanBean17 merged 3 commits into
masterfrom
feat/calls-noise-edge-filter

Conversation

@HumanBean17
Copy link
Copy Markdown
Owner

Summary

  • Add typed EdgeFilter on neighbors_v2 (CALLS-only, fail-loud when an attribute is not on every requested edge label): min_confidence, include_strategies / exclude_strategies (xor), callee_declaring_role / lists / exclusions.
  • Dedicated CALLS Kuzu path: predicate pushdown in WHERE, ORDER BY e.call_site_line, e.call_site_byte, then pagination (no pre-filter row cap).
  • Docs + hints: AGENT-GUIDE role-filter trap and HV37 (exclude_callee_declaring_roles: ['OTHER'] drops known-external rows); Decision 20 OTHER-fallback and Decision 30 NodeFilter.role collision hints; MCP_HINTS_FIELD_DESCRIPTION updated.
  • Supersedes MCP-API-V2 “no per-edge filter on neighbors” (Decision 16). Ontology stays 15; no PR-3 surfaces (include_unresolved, UnresolvedCallSite, CLI stub).

Test plan

  • .venv/bin/python -m pytest tests/test_mcp_v2.py -k "ordered_by_call_site or edge_filter" -v
  • .venv/bin/python -m pytest tests/test_mcp_hints.py -k "role_collision or other_fallback" -v
  • tests/test_mcp_v2.py + tests/test_mcp_hints.py + tests/test_mcp_v2_compose.py (277 passed)
  • PR-2 sentinels (ORDER BY positive; no UnresolvedCallSite / include_unresolved / pass3 deletions)
  • test_neighbors_calls_perf_empty_filter_client_message_processor with JAVA_CODEBASE_RAG_RUN_HEAVY=1 (skipped in default CI)

Implements plans/PLAN-CALLS-NOISE.md PR-2 / propose/CALLS-NOISE-AND-RESOLUTION-PROPOSE.md §3.4.1–§3.4.2.

Made with Cursor

Add typed edge_filter for CALLS-only projection (confidence, strategies,
callee_declaring_role) with fail-loud schema validation, ORDER BY call site
in Kuzu before pagination, and Decision 20/30 hints. Supersedes MCP-V2
no per-edge filter on neighbors (Decision 16).

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17 HumanBean17 force-pushed the feat/calls-noise-edge-filter branch from 4337de8 to bdba206 Compare May 19, 2026 18:26
Copy link
Copy Markdown
Owner Author

@HumanBean17 HumanBean17 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Review (PR-2 — EdgeFilter on neighbors_v2)

Verdict: Strong PR-2 core — dedicated CALLS path, Cypher pushdown, source ordering, fail-loud mixed stored edge types, and role-trap hints match the plan. Not merge-ready until the composed-edge + edge_filter gap is fixed or fail-louded; perf test should match Decision 31.


What works well

  • Scope discipline: no PR-3 surfaces (include_unresolved, UnresolvedCallSite, pass3 deletions); ontology stays 15; Decision 16 supersession noted in PR body.
  • CALLS-only fast path: pushdown + ORDER BY e.call_site_line, e.call_site_byte + SQL pagination when no NodeFilter — matches propose §3.4.1 and the filter-before-limit invariant.
  • Fail-loud applicability for mixed stored types (e.g. CALLS + OVERRIDES); strategy include/exclude xor.
  • Hints (Decisions 20 & 30) and AGENT-GUIDE updates (exclude_external not on neighbors, HV37 OTHER trap) are clear.
  • Named tests + pushdown monkeypatch are good regression guards.

Blockers

1. edge_filter silently ignored when CALLS is paired with composed dot-keys

use_calls_path requires flat_labels == ["CALLS"] and not composed_keys. For edge_types=['CALLS', 'DECLARES.EXPOSES'] + edge_filter={...}:

  • Applicability passes (only CALLS is in flat_types).
  • Execution uses the generic untyped [e] matcher — no pushdown and no ORDER BY.

That contradicts the propose contract (“every CALLS path uses ORDER BY”) and reintroduces a silent no-op the plan explicitly reversed. Agents can combine type rollup + CALLS projection and get unfiltered, unordered CALLS while success=true.

Suggested fix (pick one): fail loud when edge_filter is populated and edge_types is not exactly ['CALLS'] (no composed keys, no second stored label); or run the dedicated CALLS path and merge composed results separately.

2. Perf test is not Decision 31

Plan: pinned method empty-filter within 1.5× pre-PR-2 median. Test uses median_sec < 3.0 * 1.5 (fixed 4.5s), not a relative baseline — won’t catch pushdown regressions on the hot path. Heavy-gate is fine; assertion should compare to an actual baseline or document manual evidence only.


Important (non-blocking)

  • Conflicting role axes: only strategy include/exclude are xor’d; callee_declaring_role + callee_declaring_roles (or include + exclude lists) can AND contradictory WHERE clauses — consider a @model_validator.
  • Multi-origin ids: each origin loads the full CALLS stream then one slice — undocumented footgun for high fan-out; worth a doc line or fail-loud cap.
  • NodeFilter + edge_filter: pushdown + Python pagination after full load — expected, but hot methods may spike latency.
  • README: re-index callout mentions edge_filter; tool table row for neighbors (~247) still omits it.
  • MCP_HINTS_FIELD_DESCRIPTION: documents PR-3 include_unresolved / dedup_calls before they exist — LLM clients may try non-existent params until PR-3.

Nits

  • Duplicate _client_message_processor_process_id in two test modules.
  • Fail-loud stderr category: Pydantic EdgeFilterunknown_key, applicability → edge_filter.
  • No ontology validation for CALLS strategy strings (typos → empty results) — acceptable if documented.

Merge checklist

  1. Fail loud (or dual-path) when edge_filter is set with composed keys or any second stored label beyond pure CALLS.
  2. Fix or relabel perf test to match 1.5× baseline language.
  3. Optional: README neighbors row + role-axis xor validator.

Happy to re-review after (1).

Fail loud when edge_filter is paired with composed keys or extra stored
labels; add role-axis xor, baseline-relative perf gate, docs, and tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17
Copy link
Copy Markdown
Owner Author

Review (assume green CI)

Verdict: approve with reservations. Core PR-2 contract looks solid: dedicated CALLS path, ORDER BY before pagination, Cypher pushdown for edge_filter, fail-loud on mixed/composed edge_types, and useful docs/hints. The follow-up commit (composed keys + edge_filter, role-axis XOR) closes real gaps.

What works well

  • neighbor_calls_for_symbol centralizes ordering + pushdown; single-origin + edge_filter only gets SQL SKIP/LIMIT after predicates (matches propose “filter before slice”).
  • Fail-loud framing matches NodeFilter (mixed labels, composed dot-keys, strategy/role XOR).
  • AGENT-GUIDE is honest about filter vs edge_filter, HV37 (exclude_callee_declaring_roles: ['OTHER'] drops known-external), and exclude_external not on neighbors.
  • Hints fix (fuzzy meta + neighbors_calls_meta_hints always) looks correct.
  • Scope discipline: no PR-3 surfaces; ontology stays 15.

Issues worth addressing (non-blocking unless you want polish pre-merge)

  1. Multi-id + high fan-out — Each origin loads the full CALLS stream (or full stream + Python NodeFilter), then one global offset/limit on the merged list. Slowness is documented; worth one explicit line that pagination is over concatenated ids order (first origin can starve later ones), not global source order.

  2. No enum validation on edge_filter — Typos in callee_declaring_role / strategies fail silently (empty results). Consider validating against VALID_ROLES / call-strategy sets from java_ontology.py with fail-loud messages (same spirit as NodeFilter).

  3. Decision 30 collision hint — Fires when filter.role is set and ≥75% of method neighbors are OTHER. The common mistake (filter.role='SERVICE' expecting callee stereotype) often yields empty results, not dominantly OTHER — so the hint may miss when agents need it. Test uses role: 'OTHER' to force the threshold (proves template, not the real confusion case). Consider triggering on stereotype-like node_role values or suppressing when node_role == 'OTHER'.

  4. Decision 20 OTHER-fallback — Uses total count_calls_for_symbol ≥ 5 when exact SERVICE/REPOSITORY filter returns empty; can over-fire on methods with many calls but legitimately zero repo/service callees. Tighter signal (e.g. count non-OTHER callees) would help.

  5. Fixture FQN in production_CLIENT_MESSAGE_PROCESSOR_PROCESS_FQN in mcp_v2.py couples runtime code to bank-chat; prefer keeping the anchor in tests/pinned_ids.py only.

  6. Minorserver.py edge_filter description doesn’t mention JSON-string coercion though _coerce_edge_filter supports it (unlike filter). Plan/propose mention an EDGE_SCHEMA CI test; delivery is indirect (pushdown + fail-loud tests).

  7. Perf — Heavy-gated median test is fine per plan; CI won’t catch ORDER BY regressions on the pinned hot method.

Merge recommendation

Merge if we accept PR-2 limits above. Fast-follow: move fixture FQN out of mcp_v2.py, one AGENT-GUIDE batching sentence, optional enum validation + hint tightening.

Document multi-id concatenation pagination; align server edge_filter
descriptions; move bank-chat perf FQN to tests/pinned_ids only.
Hint tuning tracked in #183 and #184.

Co-authored-by: Cursor <cursoragent@cursor.com>
@HumanBean17 HumanBean17 merged commit 38b0819 into master May 19, 2026
1 check passed
@HumanBean17 HumanBean17 deleted the feat/calls-noise-edge-filter branch May 23, 2026 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant