Research: hybrid retrieval (vector × graph fusion via RRF) #57

## Type: Research (not implementation)

This issue is a **research request**, not an implementation ticket. The expected output is a written proposal in `propose/HYBRID-RETRIEVAL-PROPOSE.md` (or similar), not code. The decision on whether and when to implement is deferred until the research has been reviewed.

## Background

`propose/PRODUCT-VISION.md` § 1.2 and § 2.1 commit to hybrid retrieval as a core architectural goal:

> "A **hybrid retrieval** system that runs both [vector + graph] in parallel and fuses results via Reciprocal Rank Fusion (RRF) consistently outperforms either layer alone."

…and the architecture diagram explicitly shows a "Context merge (RRF: vector, FTS, graph-expanded chunks)" step before the LLM. **Today, this fusion does not exist as a tool surface.** The MCP exposes vector search (`search`) and graph traversal (`neighbors`/`find`/`describe`) as separate tools. The model has to chain them manually and reconcile results in its own context — which is exactly the workflow the PRODUCT-VISION document argues against.

## Why "research" first, not "implement"

RRF fusion across heterogeneous result types (LanceDB hybrid hits with cosine scores + Kuzu graph nodes from BFS expansion) is **not a standard library call**. Several non-trivial design questions must be answered before any plan is implementable:

1. **What is the fused unit?** Vector search returns `(table_name, chunk_id, score, text)` rows. Graph traversal returns `(node_id, kind, fqn, edge_type, distance)`. RRF fuses ranked lists — what's the canonical "thing" being ranked when one source returns chunks and the other returns symbols?
2. **How do graph results get a rank?** RRF requires ordered lists. Vector hits have a natural order (cosine similarity). Graph BFS results are a set, not a ranking — does distance from seed serve as inverse rank? Edge-type weighting? Centrality?
3. **What seeds the graph traversal?** The vector hits themselves (vector → graph as a re-ranker)? An NL query parsed into entity mentions? Both, in parallel?
4. **Which queries actually benefit?** PRODUCT-VISION's table claims hybrid wins for "trace the full call path from REST endpoint to DB." Which query classes are already served well by pure vector or pure graph retrieval, so that the added latency of fusion would be wasted?
5. **What's the API shape?** A new tool (`search_hybrid`)? A `mode="hybrid"` flag on the existing `search`? An invisible internal upgrade where `search` always fuses?
6. **What's the latency budget?** Vector search takes ~50ms and graph BFS ~10–50ms; fusion adds the RRF computation plus a second round of I/O. Is this acceptable for the AMA agent's interactive loop, or is it a batch-time enrichment only?
7. **Does this displace the existing tool surface?** If `search` becomes hybrid-by-default, do `find`/`neighbors`/`describe` lose value, or do they remain as the "I want only graph" escape hatch?

These are design decisions, not implementation details. Answering them requires reading the literature, prototyping at least two RRF formulations on a real fixture, and benchmarking against the existing single-source paths.
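To make questions 1 and 2 concrete, here is a minimal sketch of RRF over two ranked lists. It is an illustration under stated assumptions, not a proposed design: it assumes both sources have already been mapped to a shared key (a symbol FQN), that graph hits are ordered by BFS distance with ties broken alphabetically, and that `k = 60`, the constant commonly used in the RRF literature. The function names and sample FQNs are hypothetical.

```python
from collections import defaultdict
from typing import Hashable, Iterable


def rrf_fuse(ranked_lists: Iterable[list[Hashable]], k: int = 60) -> list[tuple[Hashable, float]]:
    """Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank of d in that list)."""
    scores: dict[Hashable, float] = defaultdict(float)
    for ranking in ranked_lists:
        for rank, key in enumerate(ranking, start=1):  # ranks are 1-based
            scores[key] += 1.0 / (k + rank)
    # Highest score first; ties broken by the key's string form for determinism.
    return sorted(scores.items(), key=lambda kv: (-kv[1], str(kv[0])))


def rank_graph_hits(hits: list[tuple[str, int]]) -> list[str]:
    """Turn a BFS result set into a ranking: closer to the seed first (one possible answer to question 2)."""
    return [fqn for fqn, _distance in sorted(hits, key=lambda h: (h[1], h[0]))]


# Hypothetical inputs, already expressed as symbol FQNs (question 1 is assumed away here).
vector_ranking = [
    "bank.TransferService.transfer",
    "bank.AccountRepo.save",
    "bank.AuditLog.write",
]
graph_hits = [  # (fqn, BFS distance from seed)
    ("bank.AccountRepo.save", 1),
    ("bank.TransferController.post", 1),
    ("bank.TransferService.transfer", 0),
]

fused = rrf_fuse([vector_ranking, rank_graph_hits(graph_hits)])
for fqn, score in fused:
    print(f"{score:.5f}  {fqn}")
```

Symbols found by both sources rise to the top, while single-source hits keep a nonzero score. Swapping `rank_graph_hits` for an edge-type-weighted or centrality-based ordering is one of the alternative formulations the research should compare.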

## Inputs the research should read

- `propose/PRODUCT-VISION.md` — especially § 1.2 (retrieval gap analysis), § 2.1 (architecture diagram), and the cited 2026 benchmark on Java codebases (Shopizer/ThingsBoard/OpenMRS)
- The 4 footnote-cited papers (`[^1]`–`[^4]`) on DKB / GraphRAG benchmarks — the propose doc references them but the design conclusions weren't carried into a follow-up proposal
- Existing fusion implementations in IR/RAG systems (RRF and otherwise): Microsoft GraphRAG (community-summary fusion), LangChain's `EnsembleRetriever`, LlamaIndex's `RouterRetriever`/`QueryFusionRetriever`, Vespa's hybrid scoring
- LanceDB's existing hybrid score (it already fuses vector + FTS via RRF internally — `search_lancedb.py` returns hybrid hits) — understand what's *already* fused before adding a third source
- Current MCP surface: `search`, `find`, `describe`, `neighbors` — how each is wired in `mcp_v2.py` / `search_lancedb.py` / `kuzu_queries.py`

## Expected research output

A `propose/HYBRID-RETRIEVAL-PROPOSE.md` (or `propose/HYBRID-RETRIEVAL-RESEARCH.md`) document containing:

1. **Problem framing** — what queries fail today on each isolated layer, with concrete examples from the bank-chat-system fixture
2. **Survey of fusion strategies** — RRF (with `k` constant), weighted linear, learned reranker, cascade (graph-as-rerank-of-vector vs vector-as-seed-of-graph), with pros/cons
3. **Result-type unification** — proposal for a canonical "fused result row" schema; how chunks map to symbols and vice versa
4. **API shape recommendation** — new tool vs flag vs invisible default, with reasoning
5. **Latency model** — back-of-envelope for each fusion strategy on a 100K-LOC fixture
6. **Prototype results** — at minimum, RRF run on the bank-chat-system fixture for 5 representative queries, comparing fused vs vector-only vs graph-only top-10s
7. **Decision points** that need explicit answers before any plan is implementable (the §11 [TBD] pattern from `propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md` is a good template)
8. **Out-of-scope list** — what this proposal explicitly does NOT decide
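As a rough illustration of what items 3 and 6 might involve, the sketch below shows one possible fused-row shape, a way to collapse ranked chunks into ranked symbols, and a simple top-k overlap measure for comparing fused, vector-only, and graph-only results. Every name here (`FusedRow`, `collapse_chunks`, `overlap_at_k`, the `owner` mapping) is hypothetical; choosing the symbol FQN as the canonical key is an assumption the research is expected to confirm or reject.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FusedRow:
    """One candidate 'fused result row' (assumption: the canonical key is a symbol FQN)."""
    key: str                           # canonical identity shared by both sources
    score: float                       # fused score, e.g. from RRF
    sources: tuple[str, ...]           # which retrievers contributed, e.g. ("vector", "graph")
    chunk_ids: tuple[str, ...] = ()    # supporting chunks, if any came from vector search


def collapse_chunks(chunk_ranking: list[str], owner: dict[str, str]) -> list[str]:
    """Map ranked chunk ids to ranked symbols, keeping each symbol at its best (first) position."""
    seen: dict[str, None] = {}  # dict preserves insertion order
    for chunk_id in chunk_ranking:
        symbol = owner.get(chunk_id)  # chunks with no owning symbol are dropped
        if symbol is not None and symbol not in seen:
            seen[symbol] = None
    return list(seen)


def overlap_at_k(a: list[str], b: list[str], k: int = 10) -> float:
    """Jaccard overlap between the top-k keys of two rankings (1.0 if both are empty)."""
    top_a, top_b = set(a[:k]), set(b[:k])
    union = top_a | top_b
    return len(top_a & top_b) / len(union) if union else 1.0


# Hypothetical data: two chunks belong to the same method.
owner = {"c1": "A.f", "c2": "A.f", "c3": "B.g"}
symbols = collapse_chunks(["c1", "c2", "c3"], owner)
row = FusedRow(key=symbols[0], score=0.03, sources=("vector",), chunk_ids=("c1", "c2"))
print(symbols, overlap_at_k(["a", "b", "c"], ["b", "c", "d"], k=3))
```

Overlap alone does not say which ranking is better; it only shows how much fusion changes the top-10, which helps decide which of the 5 representative queries deserve manual relevance judgments.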

## Non-goals for this issue

- **No code changes.** No new tool, no `search` modifications, no Kuzu queries.
- **No `plans/PLAN-*.md` file.** Plans come after the propose doc is reviewed and decisions are locked.
- **No commitment to ship.** Research may conclude that current single-source tools are good enough, and that fusion is best left as a model-side concern (the agent chains tools and fuses in its own reasoning). That's a valid outcome.

## Suggested rollout

A single propose doc; no branch is needed for research-only work. Once the propose doc lands and is reviewed, derive an implementable `plans/PLAN-HYBRID-RETRIEVAL.md` if the decision is to proceed. If the decision is to defer, close this issue and reference the propose doc as the rationale.

## Related

- `propose/PRODUCT-VISION.md` § 1.2, § 2.1
- `propose/TIER2-INCREMENTAL-REBUILD-PROPOSE.md` — same "research-first" pattern, good template for structure
- Issue #56 (`NodeFilter` symbol_kind split) — adjacent v2.x polish; this issue is the v3 architectural counterpart

