MCP Server Specification — codeagent-index-engine

1. Overview

The codeagent-mcp server exposes the index engine and basic file system navigation as a single MCP (Model Context Protocol) endpoint. Any MCP-compatible client — LLM agent, desktop app, IDE extension, CLI — connects to the same server and uses the same tools.

Transport: stdio (primary) and SSE (for network-accessible scenarios).

Architecture:

┌─────────────────────────────────────────────────────┐
│                  MCP Clients                        │
│  (LLM agents, Tauri app, VS Code, CLI, etc.)        │
└──────────────────────┬──────────────────────────────┘
                       │  MCP (stdio or SSE)
                       ▼
              ┌─────────────────┐
              │  codeagent-mcp  │   ← thin Rust crate (new)
              │   (MCP server)  │
              └────────┬────────┘
                       │
          ┌────────────┼────────────┐
          ▼            ▼            ▼
   ┌────────────┐ ┌────────────┐ ┌───────────┐
   │ codeagent- │ │  std::fs   │ │  Engine   │
   │   core     │ │ (sandboxed │ │ lifecycle │
   │ (queries)  │ │  to repo)  │ │ (status)  │
   └────────────┘ └────────────┘ └───────────┘

The MCP server is a thin wrapper. All indexing logic remains in codeagent-core. File system operations use std::fs scoped to the repository root. No business logic lives in the MCP layer.

2. Server Lifecycle

Event	Behaviour
Startup	Load config from `.codeagent/config.json`. Open SQLite writer + reader pool. Register sqlite-vec extension. Run migrations. Start file watcher last, so no change batch is ingested before the schema is current.
Shutdown	Cancel watcher. Drain write queue. Checkpoint WAL. Close connections.
Health	Exposed via `get_status` tool (see §4.5).

The server process owns the single-writer SQLite connection. Multiple MCP clients can connect simultaneously; all reads go through the reader pool, all writes are serialised through the writer channel.
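
The single-writer arrangement described above can be sketched with a standard-library channel. This is a minimal illustration under stated assumptions, not the engine's code: `WriteCmd`, `spawn_writer`, and the in-memory `Vec` standing in for the SQLite writer connection are hypothetical names introduced here.

```rust
use std::sync::mpsc::{self, Sender};
use std::thread::{self, JoinHandle};

/// Hypothetical write command; the real engine would carry ingest work instead.
#[derive(Debug, Clone, PartialEq)]
pub enum WriteCmd {
    Upsert(String),
    Delete(String),
}

/// Spawns the single writer thread. It owns the stand-in store exclusively and
/// applies commands in the order they arrive on the channel.
pub fn spawn_writer() -> (Sender<WriteCmd>, JoinHandle<Vec<String>>) {
    let (tx, rx) = mpsc::channel::<WriteCmd>();
    let handle = thread::spawn(move || {
        // Stand-in for the SQLite writer connection (assumption).
        let mut store: Vec<String> = Vec::new();
        // The loop ends once every Sender clone has been dropped,
        // after all queued commands have been applied.
        for cmd in rx {
            match cmd {
                WriteCmd::Upsert(key) => {
                    if !store.contains(&key) {
                        store.push(key);
                    }
                }
                WriteCmd::Delete(key) => store.retain(|k| k != &key),
            }
        }
        store
    });
    (tx, handle)
}
```

Each client handler would hold its own clone of the `Sender`. Dropping every clone lets the writer finish the queued commands and exit, which loosely mirrors the "Drain write queue" step at shutdown.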

3. Tool Categories

Tools are grouped into four categories:

File System — raw file/directory navigation (no engine involvement)
Search & Discovery — finding symbols by text, name, or similarity
Inspection & Navigation — examining symbols and traversing the graph
Engine Management — indexing triggers and status
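
As a rough sketch, the grouping can be expressed as a lookup from tool name to category. The `ToolCategory` enum and `category_of` function are hypothetical names used only for illustration; the tool names are the ones defined in §4.

```rust
/// The four tool categories from this section (enum name is an assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ToolCategory {
    FileSystem,
    SearchDiscovery,
    InspectionNavigation,
    EngineManagement,
}

/// Maps a tool name to its category, or None for an unknown tool.
pub fn category_of(tool: &str) -> Option<ToolCategory> {
    use ToolCategory::*;
    match tool {
        "list_directory" | "read_file" | "get_directory_tree" => Some(FileSystem),
        "search_symbols" | "lookup_symbol" | "find_similar" => Some(SearchDiscovery),
        "get_symbol" | "get_source_spans" | "get_file_outline" | "get_callers"
        | "get_callees" | "get_implementations" | "get_references"
        | "get_dependencies" | "get_dependents" => Some(InspectionNavigation),
        "index_files" | "get_status" => Some(EngineManagement),
        _ => None,
    }
}
```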

4. Tool Definitions

4.1 File System

`list_directory`

List files and directories at a path within the repository.

Parameter	Type	Required	Description
`path`	string	No	Repo-relative path. Defaults to repo root (`""`).

Returns: Array of entries, each with:

name (string) — file or directory name
type ("file" | "directory")
size (number) — file size in bytes (files only)

Constraints:

Path must resolve within the repo root (no .. escape).
Respects .gitignore rules.
Does not follow symlinks outside the repo root.

`read_file`

Read the contents of a file.

Parameter	Type	Required	Description
`path`	string	Yes	Repo-relative path to the file.
`line_start`	number	No	1-based start line (inclusive).
`line_end`	number	No	1-based end line (inclusive).

Returns:

content (string) — file contents (or line range)
total_lines (number) — total line count of the file
truncated (boolean) — true if content was capped

Constraints:

Path must resolve within the repo root.
Maximum output: 10,000 lines or 500 KB, whichever limit is reached first. Returns truncated: true if capped.
Binary files return an error with the detected MIME type.
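
The line-range and truncation rules above might be implemented along these lines. This is a sketch, not the server's code: `ReadResult` and `read_range` are hypothetical names, the caps are passed in as parameters because the spec does not say whether 500 KB means 500,000 or 512,000 bytes, and an inverted or out-of-range request simply yields empty content here, where the real tool may prefer `invalid_parameter`.

```rust
/// Mirrors the three fields in the `read_file` return value.
#[derive(Debug, PartialEq)]
pub struct ReadResult {
    pub content: String,
    pub total_lines: usize,
    pub truncated: bool,
}

/// Slices `text` to the 1-based inclusive range and applies both output caps.
pub fn read_range(
    text: &str,
    line_start: Option<usize>,
    line_end: Option<usize>,
    max_lines: usize,
    max_bytes: usize,
) -> ReadResult {
    let lines: Vec<&str> = text.lines().collect();
    let total_lines = lines.len();
    // Missing bounds default to the whole file; the end is clamped to the file length.
    let start = line_start.unwrap_or(1).max(1);
    let end = line_end.unwrap_or(total_lines).min(total_lines);

    let mut content = String::new();
    let mut truncated = false;
    let mut emitted = 0;
    if start <= end {
        for line in &lines[start - 1..end] {
            // +1 accounts for the newline appended after each line.
            if emitted == max_lines || content.len() + line.len() + 1 > max_bytes {
                truncated = true;
                break;
            }
            content.push_str(line);
            content.push('\n');
            emitted += 1;
        }
    }
    ReadResult { content, total_lines, truncated }
}
```

Note that `total_lines` always reports the full file, so a client that receives `truncated: true` can request the remainder with a follow-up `line_start`.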

`get_directory_tree`

Recursive directory structure.

Parameter	Type	Required	Description
`path`	string	No	Repo-relative root. Defaults to `""`.
`depth`	number	No	Max recursion depth. Default 3, max 10.

Returns: Nested tree structure:

{
  "name": "src",
  "type": "directory",
  "children": [
    { "name": "main.rs", "type": "file" },
    { "name": "lib", "type": "directory", "children": [...] }
  ]
}

Constraints:

Respects .gitignore.
Directories with >1,000 entries return the first 1,000 with a truncated: true flag.
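
One way to apply the depth and entry limits is to prune an already-built tree, as sketched below. The `TreeNode` type, its constructors, and both functions are hypothetical. The sketch also makes two assumptions the spec leaves open: `depth` counts levels of children below the requested root, and a `depth` above the maximum is rejected rather than clamped.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct TreeNode {
    pub name: String,
    pub is_dir: bool,
    pub children: Vec<TreeNode>,
    pub truncated: bool,
}

impl TreeNode {
    pub fn file(name: &str) -> Self {
        TreeNode { name: name.to_string(), is_dir: false, children: Vec::new(), truncated: false }
    }
    pub fn dir(name: &str, children: Vec<TreeNode>) -> Self {
        TreeNode { name: name.to_string(), is_dir: true, children, truncated: false }
    }
}

pub const DEFAULT_DEPTH: usize = 3;
pub const MAX_DEPTH: usize = 10;
pub const MAX_ENTRIES: usize = 1_000;

/// Applies the default and rejects values above the maximum (assumed behaviour).
pub fn effective_depth(requested: Option<usize>) -> Result<usize, String> {
    match requested {
        None => Ok(DEFAULT_DEPTH),
        Some(d) if d <= MAX_DEPTH => Ok(d),
        Some(d) => Err(format!("invalid_parameter: depth {} exceeds max {}", d, MAX_DEPTH)),
    }
}

/// Returns a copy of `node` with at most `depth` levels of children and
/// at most `max_entries` entries per directory.
pub fn limit_tree(node: &TreeNode, depth: usize, max_entries: usize) -> TreeNode {
    let mut out = TreeNode {
        name: node.name.clone(),
        is_dir: node.is_dir,
        children: Vec::new(),
        truncated: false,
    };
    if !node.is_dir || depth == 0 {
        return out;
    }
    // Flag the directory when entries beyond the cap are dropped.
    out.truncated = node.children.len() > max_entries;
    out.children = node
        .children
        .iter()
        .take(max_entries)
        .map(|child| limit_tree(child, depth - 1, max_entries))
        .collect();
    out
}
```

A real implementation would more likely stop descending while walking the file system instead of building the full tree first; pruning afterwards just keeps the sketch free of I/O.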

4.2 Search & Discovery

`search_symbols`

Full-text search across all indexed symbols using FTS5 BM25 ranking.

Parameter	Type	Required	Description
`query`	string	Yes	FTS5 query string (e.g. `"Authenticate*"`, `"UserService"`)
`project_id`	string	No	Scope to a specific project
`node_type`	string	No	Filter by type: `class`, `method`, `interface`, `property`, `component`, `file`, `module`, `type`, `constructor`
`language`	string	No	Filter: `csharp` or `typescript`
`limit`	number	No	Max results (default 20, max 50)

Returns: Array of results ranked by BM25 relevance, best match first (FTS5 BM25 scores are negative, so a lower rank value means a stronger match):

{
  "node_id": "...",
  "name": "Authenticate",
  "qualified_name": "MyApp.Auth.AuthService.Authenticate",
  "node_type": "method",
  "language": "csharp",
  "file_path": "src/Auth/AuthService.cs",
  "line_start": 42,
  "line_end": 58,
  "access_modifier": "public",
  "parameter_signature": "(string username, string password)",
  "return_type": "Task<AuthResult>",
  "rank": -8.32
}

Backed by: query::filter_nodes with fts_query.
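
Before the query reaches the engine, the optional parameters need validating against the limits in the table above. The following is a sketch under assumptions: `InvalidParameter`, `validate_limit`, and `validate_choice` are hypothetical names, and rejecting an out-of-range `limit` (rather than clamping it) is a guess based on the `invalid_parameter` error code in §5.

```rust
/// Carries the message for an `invalid_parameter` error (hypothetical type).
#[derive(Debug, PartialEq)]
pub struct InvalidParameter(pub String);

pub const DEFAULT_LIMIT: u32 = 20;
pub const MAX_LIMIT: u32 = 50;
pub const NODE_TYPES: [&str; 9] = [
    "class", "method", "interface", "property", "component",
    "file", "module", "type", "constructor",
];
pub const LANGUAGES: [&str; 2] = ["csharp", "typescript"];

/// Applies the default limit and rejects values outside 1..=MAX_LIMIT.
pub fn validate_limit(limit: Option<u32>) -> Result<u32, InvalidParameter> {
    match limit {
        None => Ok(DEFAULT_LIMIT),
        Some(n) if (1..=MAX_LIMIT).contains(&n) => Ok(n),
        Some(n) => Err(InvalidParameter(format!(
            "limit must be between 1 and {}, got {}",
            MAX_LIMIT, n
        ))),
    }
}

/// Accepts an absent filter, or a value drawn from the allowed set.
pub fn validate_choice(
    param: &str,
    value: Option<&str>,
    allowed: &[&str],
) -> Result<(), InvalidParameter> {
    match value {
        None => Ok(()),
        Some(v) if allowed.contains(&v) => Ok(()),
        Some(v) => Err(InvalidParameter(format!(
            "{} must be one of {:?}, got {:?}",
            param, allowed, v
        ))),
    }
}
```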

`lookup_symbol`

Find symbol(s) by exact qualified name. May return multiple results (overloads, partial classes).

Parameter	Type	Required	Description
`qualified_name`	string	Yes	Exact qualified name (e.g. `"MyApp.Auth.AuthService.Authenticate"`)
`language`	string	No	Filter: `csharp` or `typescript`
`project_id`	string	No	Scope to a specific project

Returns: Array of matching nodes (same shape as search_symbols results).

Backed by: query::get_node_by_qualified_name.

`find_similar`

Find symbols semantically similar to a given symbol using embedding similarity.

Parameter	Type	Required	Description
`node_id`	string	Yes	The reference symbol's node ID
`limit`	number	No	Max results (default 10, max 50)

Returns: Array of nodes with similarity scores.

Status: Deferred until Phase 4 ANN search is implemented. Currently, vec_nodes uses a regular table (brute-force scan), which is acceptable for small codebases but not production-ready.

Backed by: Embedding lookup + cosine similarity over vec_nodes.
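
The brute-force scan mentioned in the status note amounts to scoring every stored vector against the reference and keeping the top results. A minimal sketch, assuming embeddings are available as in-memory `f32` slices; `cosine_similarity` and `brute_force_similar` are hypothetical helpers, not functions from codeagent-core.

```rust
/// Cosine similarity of two vectors; None if lengths differ,
/// the vectors are empty, or either has zero magnitude.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() || a.is_empty() {
        return None;
    }
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}

/// Scores every candidate against `reference` and returns the `limit`
/// highest-scoring (node id, similarity) pairs, best first.
pub fn brute_force_similar<'a>(
    reference: &[f32],
    candidates: &'a [(String, Vec<f32>)],
    limit: usize,
) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&'a str, f32)> = candidates
        .iter()
        .filter_map(|(id, v)| cosine_similarity(reference, v).map(|s| (id.as_str(), s)))
        .collect();
    // Descending by similarity; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(limit);
    scored
}
```

The cost is linear in the number of stored embeddings per query, which is why the note above treats this approach as acceptable only for small codebases.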

4.3 Inspection & Navigation

`get_symbol`

Get full metadata for a single symbol by ID.

Parameter	Type	Required	Description
`node_id`	string	Yes	The symbol's node ID

Returns: Complete node metadata:

{
  "node_id": "...",
  "name": "Authenticate",
  "qualified_name": "MyApp.Auth.AuthService.Authenticate",
  "node_type": "method",
  "language": "csharp",
  "file_path": "src/Auth/AuthService.cs",
  "line_start": 42,
  "line_end": 58,
  "access_modifier": "public",
  "is_public_api": true,
  "is_static": false,
  "is_abstract": false,
  "is_async": true,
  "is_override": false,
  "is_deprecated": false,
  "has_doc_comment": true,
  "parse_status": "full",
  "parameter_signature": "(string username, string password)",
  "parameter_count": 2,
  "return_type": "Task<AuthResult>",
  "reference_count": 14
}

Backed by: query::get_node.

`get_source_spans`

Get all source locations for a symbol (supports partial classes / multi-file symbols).

Parameter	Type	Required	Description
`node_id`	string	Yes	The symbol's node ID

Returns: Array of source spans ordered by primary-first, then file path + line:

[
  {
    "file_path": "src/Auth/AuthService.cs",
    "line_start": 42,
    "line_end": 58,
    "is_primary": true
  }
]

Backed by: query::get_source.
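
The ordering rule (primary span first, then file path, then line) can be written as a chained comparator. This sketch uses a hypothetical `SourceSpan` struct that mirrors the JSON fields above; the engine presumably applies the ordering in SQL, which is not shown in this document.

```rust
#[derive(Debug, Clone, PartialEq)]
pub struct SourceSpan {
    pub file_path: String,
    pub line_start: u32,
    pub line_end: u32,
    pub is_primary: bool,
}

/// Sorts spans primary-first, then by file path, then by start line.
pub fn sort_spans(spans: &mut [SourceSpan]) {
    spans.sort_by(|a, b| {
        // `false < true`, so compare b to a to put primary spans first.
        b.is_primary
            .cmp(&a.is_primary)
            .then_with(|| a.file_path.cmp(&b.file_path))
            .then_with(|| a.line_start.cmp(&b.line_start))
    });
}
```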

`get_file_outline`

List all symbols defined in a file, ordered by line number.

Parameter	Type	Required	Description
`path`	string	Yes	Repo-relative file path

Returns: Array of symbols in line order:

[
  {
    "node_id": "...",
    "name": "AuthService",
    "qualified_name": "MyApp.Auth.AuthService",
    "node_type": "class",
    "line_start": 10,
    "line_end": 120,
    "access_modifier": "public"
  },
  {
    "node_id": "...",
    "name": "Authenticate",
    "qualified_name": "MyApp.Auth.AuthService.Authenticate",
    "node_type": "method",
    "line_start": 42,
    "line_end": 58,
    "access_modifier": "public",
    "parameter_signature": "(string username, string password)",
    "return_type": "Task<AuthResult>"
  }
]

Backed by: query::get_outline (requires FileId derivation from path via derive_file_id).

`get_callers`

Find all symbols that call a given symbol.

Parameter	Type	Required	Description
`node_id`	string	Yes	The target symbol's node ID

Returns: Array of calling symbols with edge metadata:

[
  {
    "node_id": "...",
    "name": "HandleLogin",
    "qualified_name": "MyApp.Controllers.LoginController.HandleLogin",
    "node_type": "method",
    "file_path": "src/Controllers/LoginController.cs",
    "line_start": 25,
    "confidence": "exact"
  }
]

Backed by: query::get_neighbors(node_id, Some(EdgeType::Calls), EdgeDirection::Incoming).

`get_callees`

Find all symbols that a given symbol calls.

Parameter	Type	Required	Description
`node_id`	string	Yes	The source symbol's node ID

Returns: Same shape as get_callers.

Backed by: query::get_neighbors(node_id, Some(EdgeType::Calls), EdgeDirection::Outgoing).

`get_implementations`

Find all symbols that implement a given interface or extend a base class.

Parameter	Type	Required	Description
`node_id`	string	Yes	The interface or base class node ID

Returns: Array of implementing/extending symbols with edge metadata. Includes both Implements and Extends edge types.

Backed by: query::get_neighbors(node_id, None, EdgeDirection::Incoming) filtered to Implements and Extends edges.

`get_references`

Find all symbols that reference a given symbol (broader than callers — includes type references, imports, etc.).

Parameter	Type	Required	Description
`node_id`	string	Yes	The referenced symbol's node ID

Returns: Array of referencing symbols with edge type and confidence.

Backed by: query::get_neighbors(node_id, Some(EdgeType::References), EdgeDirection::Incoming).

`get_dependencies`

Get all outgoing relationships from a symbol (what it depends on).

Parameter	Type	Required	Description
`node_id`	string	Yes	The source symbol's node ID
`edge_type`	string	No	Filter to a specific edge type: `calls`, `inherits`, `implements`, `imports`, `overrides`, `references`, `contains`, `accepts`, `extends`

Returns: Array of dependency symbols with edge type and direction.

Backed by: query::get_neighbors(node_id, edge_type_filter, EdgeDirection::Outgoing).

`get_dependents`

Get all incoming relationships to a symbol (what depends on it).

Parameter	Type	Required	Description
`node_id`	string	Yes	The target symbol's node ID
`edge_type`	string	No	Filter to a specific edge type

Returns: Same shape as get_dependencies.

Backed by: query::get_neighbors(node_id, edge_type_filter, EdgeDirection::Incoming).
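
All of the graph tools in this section reduce to one neighbour query with a direction and an optional edge-type filter. The sketch below models that over an in-memory edge list. The `Edge` struct, the `neighbors` and `implementations` functions, and the string node ids are hypothetical stand-ins; only the `EdgeType` and `EdgeDirection` variant names are taken from the "Backed by" lines above.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeType {
    Calls,
    Implements,
    Extends,
    References,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EdgeDirection {
    Incoming,
    Outgoing,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Edge {
    pub source: &'static str,
    pub target: &'static str,
    pub edge_type: EdgeType,
}

/// In-memory stand-in for the engine's neighbour query: returns the node on
/// the far side of every edge touching `node_id` in the given direction.
pub fn neighbors(
    edges: &[Edge],
    node_id: &str,
    edge_type: Option<EdgeType>,
    direction: EdgeDirection,
) -> Vec<(&'static str, EdgeType)> {
    edges
        .iter()
        // None means "any edge type".
        .filter(|e| edge_type.map_or(true, |t| e.edge_type == t))
        .filter_map(|e| match direction {
            EdgeDirection::Incoming if e.target == node_id => Some((e.source, e.edge_type)),
            EdgeDirection::Outgoing if e.source == node_id => Some((e.target, e.edge_type)),
            _ => None,
        })
        .collect()
}

/// `get_implementations`: unfiltered incoming edges, narrowed to two types.
pub fn implementations(edges: &[Edge], node_id: &str) -> Vec<&'static str> {
    neighbors(edges, node_id, None, EdgeDirection::Incoming)
        .into_iter()
        .filter(|(_, t)| matches!(t, EdgeType::Implements | EdgeType::Extends))
        .map(|(id, _)| id)
        .collect()
}
```

In this model `get_callers` and `get_callees` differ only in the direction argument, and `get_dependencies` and `get_dependents` pass the caller's optional `edge_type` straight through.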

4.4 Engine Management

`index_files`

Trigger indexing for a set of file paths. Creates a ChangeBatch and runs it through the ingest pipeline.

Parameter	Type	Required	Description
`paths`	string[]	Yes	Repo-relative file paths to index

Returns:

indexed (number) — files successfully processed
errors (array) — per-file errors, if any

Constraints:

Paths must resolve within the repo root.
Maximum 100 paths per call.

Backed by: ingest::pipeline::IngestPipeline::process_batch.
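
The request-level checks and the shape of the result can be sketched as follows. Everything here is illustrative: `run_index_request`, `IndexResult`, and `FileError` are hypothetical names, the `process` callback stands in for the ingest pipeline (which works on a whole ChangeBatch, not file by file), and mapping an oversized request to `too_large` is an assumption.

```rust
#[derive(Debug, PartialEq)]
pub struct FileError {
    pub path: String,
    pub message: String,
}

#[derive(Debug, PartialEq)]
pub struct IndexResult {
    pub indexed: usize,
    pub errors: Vec<FileError>,
}

pub const MAX_PATHS: usize = 100;

/// Rejects oversized requests, then processes each path and collects
/// per-file failures without aborting the rest of the request.
pub fn run_index_request<F>(paths: &[String], mut process: F) -> Result<IndexResult, String>
where
    F: FnMut(&str) -> Result<(), String>,
{
    if paths.len() > MAX_PATHS {
        return Err(format!(
            "too_large: {} paths exceeds the limit of {}",
            paths.len(),
            MAX_PATHS
        ));
    }
    let mut result = IndexResult { indexed: 0, errors: Vec::new() };
    for path in paths {
        match process(path) {
            Ok(()) => result.indexed += 1,
            Err(message) => result.errors.push(FileError { path: path.clone(), message }),
        }
    }
    Ok(result)
}
```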

`get_status`

Engine health and indexing status.

Parameter	Type	Required	Description
—	—	—	No parameters

Returns:

{
  "healthy": true,
  "schema_version": 4,
  "indexed_files": 1247,
  "indexed_symbols": 18392,
  "languages": ["csharp", "typescript"],
  "watcher_active": true,
  "last_batch_at": "2026-02-24T10:15:30Z",
  "embedding_model": "all-MiniLM-L6-v2",
  "embeddings_count": 17500
}

Backed by: Metadata queries against _metadata, nodes count, vec_nodes count.

5. Error Handling

All tools return errors in a consistent format:

{
  "error": {
    "code": "not_found",
    "message": "No symbol found with node_id '...'"
  }
}

Standard error codes:

Code	Meaning
`not_found`	Requested resource does not exist
`invalid_parameter`	Parameter value is invalid or out of range
`path_escape`	Path resolves outside the repository root
`binary_file`	Attempted to read a binary file
`too_large`	Request exceeds size limits
`engine_unavailable`	Engine not initialised or shutting down
`index_error`	Indexing failed (details in message)
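
A sketch of how the codes above could map onto the error envelope. The `ErrorCode` enum and `error_json` function are hypothetical. The JSON is assembled by hand only to keep the example dependency-free; since serde_json is already a listed dependency (§7), the real server would more plausibly serialise a struct.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ErrorCode {
    NotFound,
    InvalidParameter,
    PathEscape,
    BinaryFile,
    TooLarge,
    EngineUnavailable,
    IndexError,
}

impl ErrorCode {
    /// The wire string for each code, as listed in the table above.
    pub fn as_str(self) -> &'static str {
        match self {
            ErrorCode::NotFound => "not_found",
            ErrorCode::InvalidParameter => "invalid_parameter",
            ErrorCode::PathEscape => "path_escape",
            ErrorCode::BinaryFile => "binary_file",
            ErrorCode::TooLarge => "too_large",
            ErrorCode::EngineUnavailable => "engine_unavailable",
            ErrorCode::IndexError => "index_error",
        }
    }
}

/// Escapes the characters that may not appear raw inside a JSON string.
fn escape_json(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Renders the `{"error": {"code": ..., "message": ...}}` envelope.
pub fn error_json(code: ErrorCode, message: &str) -> String {
    format!(
        "{{\"error\":{{\"code\":\"{}\",\"message\":\"{}\"}}}}",
        code.as_str(),
        escape_json(message)
    )
}
```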

6. Security Constraints

Repo-root sandboxing — all file system tools resolve paths relative to the repo root. Paths containing .. that escape the root are rejected with path_escape.
No file writes through MCP — the file system tools are read-only. The only write path is index_files, which updates the graph through the ingest pipeline; there are no tools for direct graph mutation.
No credential exposure — get_status does not expose file system paths, config file contents, or any authentication material.
Symlink safety — symlinks that resolve outside the repo root are not followed (consistent with the file watcher's existing symlink/junction guard).
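
The repo-root check can be illustrated with a purely lexical resolver. This is a sketch under assumptions: `resolve_in_repo` and `PathError` are hypothetical names, and the function never touches the file system, so it covers only the `..` rule and not symlinks.

```rust
use std::ffi::OsStr;
use std::path::{Component, Path, PathBuf};

#[derive(Debug, PartialEq)]
pub enum PathError {
    /// Corresponds to the `path_escape` error code.
    PathEscape,
}

/// Lexically resolves a repo-relative path against `root`, rejecting any
/// path that would climb above the root or is not relative at all.
pub fn resolve_in_repo(root: &Path, rel: &str) -> Result<PathBuf, PathError> {
    let mut parts: Vec<&OsStr> = Vec::new();
    for component in Path::new(rel).components() {
        match component {
            Component::Normal(name) => parts.push(name),
            Component::CurDir => {}
            Component::ParentDir => {
                // Popping past the root means the path escapes the repository.
                if parts.pop().is_none() {
                    return Err(PathError::PathEscape);
                }
            }
            // Absolute paths and Windows prefixes are not repo-relative.
            Component::RootDir | Component::Prefix(_) => {
                return Err(PathError::PathEscape);
            }
        }
    }
    let mut resolved = root.to_path_buf();
    resolved.extend(parts);
    Ok(resolved)
}
```

A path such as `src/../Cargo.toml` is accepted because it stays inside the root, while `../etc/passwd` is rejected. For the symlink rule, a real implementation would additionally need to resolve the path on disk (for example with `std::fs::canonicalize`) and confirm the result still starts with the canonical repo root.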

7. Implementation Crate

New workspace member: crates/codeagent-mcp

codeagent-engine/
  crates/
    codeagent-core/    ← existing (unchanged)
    codeagent-cli/     ← existing (unchanged)
    codeagent-mcp/     ← new
      src/
        main.rs        ← MCP server entry point, transport setup
        tools/
          mod.rs       ← tool registration
          filesystem.rs ← list_directory, read_file, get_directory_tree
          search.rs    ← search_symbols, lookup_symbol, find_similar
          navigation.rs ← get_symbol, get_source_spans, get_file_outline,
                          get_callers, get_callees, get_implementations,
                          get_references, get_dependencies, get_dependents
          management.rs ← index_files, get_status
        state.rs       ← shared engine state (writer, reader pool, config, watcher)

Dependencies: codeagent-core, an MCP SDK crate (e.g. rmcp or equivalent), tokio, serde_json.

8. Relationship to Existing Phases

This MCP server replaces Phase 5 (RLM Orchestration). The orchestration layer is no longer part of the index engine — any LLM agent that connects via MCP brings its own orchestration.

Phase numbering becomes:

Phase 1 — Foundation (complete)
Phase 2 — Semantic Enrichment & Rename Detection (complete)
Phase 3 — Summary & Embeddings (complete)
Phase 4 — Retrieval & Eval
Phase 5 — MCP Server
Phase 6 — Hardening & Observability

The old Phase 5 (RLM Orchestration) and related concepts (root LM, sub-LM, visited-node tracking, system prompt authoring, cost hierarchy, rate limiting) are removed from the engine scope. The .NET backend integration for LLM completion is also removed — the engine is now a pure local tool server.
