docs(rfd): Custom LLM endpoints #648

Open
xtmq wants to merge 11 commits into agentclientprotocol:main from xtmq:evgeniy.stepanov/rfd-custom-url

Conversation


@xtmq xtmq commented Mar 4, 2026


title: "Custom LLM Endpoint Configuration"

Elevator pitch

What are you proposing to change?

Add the ability for clients to pass custom LLM endpoint URLs and authentication credentials to agents via a dedicated setLlmEndpoints method, with support for multiple LLM protocols. This allows clients to route LLM requests through their own infrastructure (proxies, gateways, or self-hosted models) without agents needing to know about this configuration in advance.

Status quo

How do things work today and what problems does this cause? Why would we change things?

Currently, agents are configured with their own LLM endpoints and credentials, typically through environment variables or configuration files. This creates problems for:

  • Client proxies: Clients want to route agent traffic through their own proxies, e.g. to set additional headers or capture logs
  • Enterprise deployments: Organizations want to route LLM traffic through their own proxies for compliance, logging, or cost management
  • Self-hosted models: Users running local LLM servers (vLLM, Ollama, etc.) cannot easily redirect agent traffic
  • API gateways: Organizations using LLM gateways for rate limiting, caching, or multi-provider routing

Shiny future

How will things play out once this feature exists?

Clients will be able to:

  1. Discover whether an agent supports custom LLM endpoints via capabilities during initialization
  2. Perform agent configuration, including authorization based on this knowledge
  3. Pass custom LLM endpoint URLs and headers for different LLM protocols via a dedicated method
  4. Have agent LLM requests automatically routed through the appropriate endpoint based on the LLM protocol

Implementation details and plan

Tell me more about your implementation. What is your detailed implementation plan?

Intended flow

The design uses a two-step approach: capability discovery during initialization, followed by endpoint configuration via a dedicated method. This enables the following flow:

sequenceDiagram
    participant Client
    participant Agent

    Client->>Agent: initialize
    Note right of Agent: Agent reports capabilities,<br/>including llmEndpoints support
    Agent-->>Client: initialize response<br/>(agentCapabilities.llmEndpoints.protocols)

    Note over Client: Client sees supported protocols.<br/>Performs configuration / authorization<br/>based on this knowledge.

    Client->>Agent: setLlmEndpoints
    Agent-->>Client: setLlmEndpoints response

    Note over Client,Agent: Ready for session setup
    Client->>Agent: session/new
  1. Initialization: The client calls initialize. The agent responds with its capabilities, including an llmEndpoints object listing supported LLM protocols.
  2. Client-side decision: The client inspects the supported protocols list. If the agent lists protocols in llmEndpoints.protocols, the client can perform authorization, resolve credentials, or configure endpoints for those specific protocols. If llmEndpoints is absent or the protocols list is empty, the client falls back to a different authorization and configuration strategy.
  3. Endpoint configuration: The client calls setLlmEndpoints with endpoint configurations for the supported protocols.
  4. Session creation: The client proceeds to create a session.
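The client-side decision in step 2 can be sketched as follows. This is a minimal illustration, not part of the proposal: the shapes are abbreviated and `selectSupportedEndpoints` is a hypothetical helper name.

```typescript
// Hypothetical client-side helper: keep only the endpoint configurations
// whose protocol the agent actually advertised during initialize.
interface LlmEndpointConfig {
  url: string;
  headers?: Record<string, string>;
}

interface LlmEndpointsCapability {
  protocols: Record<string, unknown>;
}

function selectSupportedEndpoints(
  capability: LlmEndpointsCapability | undefined,
  desired: Record<string, LlmEndpointConfig>,
): Record<string, LlmEndpointConfig> {
  // Absent capability or an empty protocols map means the agent does not
  // support setLlmEndpoints: configure nothing and fall back (step 2).
  if (!capability || Object.keys(capability.protocols).length === 0) {
    return {};
  }
  return Object.fromEntries(
    Object.entries(desired).filter(([protocol]) => protocol in capability.protocols),
  );
}

const capability = { protocols: { anthropic: {}, openai: {} } };
const desired = {
  anthropic: { url: "https://llm-gateway.corp.example.com/anthropic/v1" },
  google: { url: "https://llm-gateway.corp.example.com/google/v1" },
};
// Only "anthropic" survives; "google" was not advertised by the agent.
console.log(Object.keys(selectSupportedEndpoints(capability, desired)));
```

If the returned map is empty, the client would skip the setLlmEndpoints call entirely and proceed with step 4.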

Capability advertisement

The agent advertises support for custom LLM endpoints and lists its supported LLM protocols via a new llmEndpoints capability in agentCapabilities:

/**
 * LLM protocol identifier representing an API compatibility level, not a specific vendor.
 * For example, "openai" means any endpoint implementing the OpenAI-compatible API
 * (including proxies, gateways, and self-hosted servers like vLLM or Ollama).
 *
 * Well-known values: "anthropic", "openai", "google", "amazon".
 * Custom protocol identifiers are allowed for regional or emerging LLM APIs not covered by the well-known set.
 */
type LlmProtocol = string;

interface LlmEndpointsCapability {
  /**
   * Map of supported LLM protocol identifiers.
   * The client should only configure endpoints for protocols listed here.
   */
  protocols: Record<LlmProtocol, unknown>;

  /** Extension metadata */
  _meta?: Record<string, unknown>;
}

interface AgentCapabilities {
  // ... existing fields ...

  /**
   * Custom LLM endpoint support.
   * If present with a non-empty protocols map, the agent supports the setLlmEndpoints method.
   * If absent or protocols is empty, the agent does not support custom endpoints.
   */
  llmEndpoints?: LlmEndpointsCapability;
}

Initialize Response example:

{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "protocolVersion": 1,
    "agentInfo": {
      "name": "MyAgent",
      "version": "2.0.0"
    },
    "agentCapabilities": {
      "llmEndpoints": {
        "protocols": {
          "anthropic": {},
          "openai": {}
        }
      },
      "sessionCapabilities": {}
    }
  }
}

setLlmEndpoints method

A dedicated method that can be called after initialization but before session creation.

interface LlmEndpointConfig {
  /** Base URL for LLM API requests (e.g., "https://llm-proxy.corp.example.com/v1") */
  url: string;

  /**
   * Additional HTTP headers to include in LLM API requests.
   * Each entry is a header name mapped to its value.
   * Common use cases include Authorization, custom routing, or tracing headers.
   */
  headers?: Record<string, string> | null;

  /** Extension metadata */
  _meta?: Record<string, unknown>;
}

interface SetLlmEndpointsRequest {
  /**
   * Custom LLM endpoint configurations per LLM protocol.
   * When provided, the agent should route LLM requests to the appropriate endpoint
   * based on the protocol being used.
   * This configuration is per-process and should not be persisted to disk.
   */
  endpoints: Record<LlmProtocol, LlmEndpointConfig>;

  /** Extension metadata */
  _meta?: Record<string, unknown>;
}

interface SetLlmEndpointsResponse {
  /** Extension metadata */
  _meta?: Record<string, unknown>;
}
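On the receiving side, an agent will likely want to validate the request before accepting it. The RFD does not prescribe this; the sketch below is an assumption of how an agent might do it, using the standard `URL` parser and a hypothetical `validateEndpoints` helper.

```typescript
// Hypothetical agent-side validation for a setLlmEndpoints request:
// reject malformed base URLs up front instead of failing later on the
// first LLM call. Returns a list of human-readable problems.
interface LlmEndpointConfig {
  url: string;
  headers?: Record<string, string> | null;
}

function validateEndpoints(
  endpoints: Record<string, LlmEndpointConfig>,
): string[] {
  const problems: string[] = [];
  for (const [protocol, config] of Object.entries(endpoints)) {
    try {
      const parsed = new URL(config.url);
      if (parsed.protocol !== "https:" && parsed.protocol !== "http:") {
        problems.push(`${protocol}: unsupported URL scheme "${parsed.protocol}"`);
      }
    } catch {
      // new URL() throws a TypeError on unparseable input.
      problems.push(`${protocol}: "${config.url}" is not a valid URL`);
    }
  }
  return problems;
}

console.log(validateEndpoints({
  anthropic: { url: "https://llm-proxy.corp.example.com/v1" },
  openai: { url: "not a url" },
}));
```

An agent that finds problems would respond with a standard JSON-RPC error rather than the empty-result success response.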

JSON Schema Additions

{
  "$defs": {
    "LlmEndpointConfig": {
      "description": "Configuration for a custom LLM endpoint.",
      "properties": {
        "url": {
          "type": "string",
          "description": "Base URL for LLM API requests."
        },
        "headers": {
          "type": ["object", "null"],
          "description": "Additional HTTP headers to include in LLM API requests.",
          "additionalProperties": {
            "type": "string"
          }
        },
        "_meta": {
          "additionalProperties": true,
          "type": ["object", "null"]
        }
      },
      "required": ["url"],
      "type": "object"
    },
    "LlmEndpoints": {
      "description": "Map of LLM protocol identifiers to endpoint configurations. This configuration is per-process and should not be persisted to disk.",
      "type": "object",
      "additionalProperties": {
        "$ref": "#/$defs/LlmEndpointConfig"
      }
    }
  }
}

Example Exchange

setLlmEndpoints Request:

{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "setLlmEndpoints",
  "params": {
    "endpoints": {
      "anthropic": {
        "url": "https://llm-gateway.corp.example.com/anthropic/v1",
        "headers": {
          "Authorization": "Bearer anthropic-token-abc123",
          "X-Request-Source": "my-ide"
        }
      },
      "openai": {
        "url": "https://llm-gateway.corp.example.com/openai/v1",
        "headers": {
          "Authorization": "Bearer openai-token-xyz789"
        }
      }
    }
  }
}

setLlmEndpoints Response:

{
  "jsonrpc": "2.0",
  "id": 2,
  "result": {}
}

Behavior

  1. Capability discovery: The agent MUST list its supported protocols in agentCapabilities.llmEndpoints.protocols if it supports the setLlmEndpoints method. Clients SHOULD only send endpoint configurations for protocols listed there.

  2. Timing: The setLlmEndpoints method MUST be called after initialize and before session/new. The new configuration is not guaranteed to affect currently running sessions. Agents MUST apply these settings to any sessions created or loaded after the call.

  3. Per-process scope: The endpoint configuration applies to the entire agent process lifetime. It should not be stored to disk or persist beyond the process.

  4. Protocol-based routing: The agent should route LLM requests to the appropriate endpoint based on the LLM protocol. If the agent uses a protocol not in the provided map, it uses its default endpoint for that protocol.

  5. Agent discretion: If an agent cannot support custom endpoints (e.g., uses a proprietary API), it should omit llmEndpoints from capabilities or return an empty protocols map.
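Rules 4 and 5 amount to a lookup with a built-in fallback. A minimal agent-side sketch, where the default URLs and the `resolveEndpoint` helper are illustrative rather than part of the proposal:

```typescript
// Endpoint configuration as received via setLlmEndpoints.
interface LlmEndpointConfig {
  url: string;
  headers?: Record<string, string> | null;
}

// Illustrative defaults an agent might ship with. Per the discussion,
// an agent should always have a default endpoint for each protocol it uses.
const defaultUrls: Record<string, string> = {
  anthropic: "https://api.anthropic.com",
  openai: "https://api.openai.com/v1",
};

function resolveEndpoint(
  configured: Record<string, LlmEndpointConfig>,
  protocol: string,
): { url: string; headers: Record<string, string> } | undefined {
  const custom = configured[protocol];
  if (custom) {
    // Rule 4: a configured endpoint takes precedence for its protocol.
    return { url: custom.url, headers: custom.headers ?? {} };
  }
  // Protocols absent from the configured map fall back to the default.
  const fallback = defaultUrls[protocol];
  return fallback ? { url: fallback, headers: {} } : undefined;
}

const configured = {
  anthropic: {
    url: "https://llm-gateway.corp.example.com/anthropic/v1",
    headers: { Authorization: "Bearer anthropic-token-abc123" },
  },
};

// Custom endpoint wins for "anthropic"; "openai" falls back to the default.
console.log(resolveEndpoint(configured, "anthropic")?.url);
console.log(resolveEndpoint(configured, "openai")?.url);
```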

Open questions

How should protocol identifiers be standardized? Resolved

Protocol identifiers are plain strings with a set of well-known values ("anthropic", "openai", "google", "amazon"). Custom identifiers are allowed for regional or emerging LLM APIs not covered by the well-known set. Agents advertise the protocol identifiers they understand; clients match against them.

How should model availability be handled? Resolved

Model availability is not a concern of this RFD. Most agents can discover available models from the configured endpoint at runtime. Agents that cannot should report an error if a user selects a model not supported by the custom endpoint.

Frequently asked questions

What questions have arisen over the course of authoring this document?

Why is LlmProtocol a plain string instead of a fixed enum?

The protocol wire format uses plain strings to keep the specification stable and avoid blocking adoption of new LLM APIs. A fixed enum would require a spec update every time a new protocol emerges.

Instead, well-known protocol identifiers are provided by SDKs as convenience constants:

TypeScript — an open string type with predefined values:

type LlmProtocol = "anthropic" | "openai" | "google" | "amazon" | (string & {});

const LlmProtocols = {
  Anthropic: "anthropic",
  OpenAI: "openai",
  Google: "google",
  Amazon: "amazon",
} as const;

Kotlin — a string wrapper with predefined companion objects:

value class LlmProtocol(val value: String) {
    companion object {
        val Anthropic = LlmProtocol("anthropic")
        val OpenAI = LlmProtocol("openai")
        val Google = LlmProtocol("google")
        val Amazon = LlmProtocol("amazon")
    }
}

This way SDK users get discoverability and autocomplete for well-known protocols while remaining free to use any custom string for protocols not yet in the predefined set.
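For instance, a client using the TypeScript constants could mix well-known and custom identifiers when building the endpoints map; the URLs and the "my-regional-llm" identifier below are purely illustrative:

```typescript
// Subset of the SDK's well-known protocol constants.
const LlmProtocols = {
  Anthropic: "anthropic",
  OpenAI: "openai",
} as const;

// Well-known constants and custom identifiers coexist in one map,
// since both are plain strings on the wire.
const endpoints = {
  [LlmProtocols.Anthropic]: {
    url: "https://llm-gateway.corp.example.com/anthropic/v1",
  },
  // A regional protocol not in the well-known set:
  "my-regional-llm": { url: "https://llm.example.org/v1" },
};

// Keys: "anthropic" and "my-regional-llm".
console.log(Object.keys(endpoints));
```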

Why not pass endpoints in the initialize request?

Passing endpoints directly in initialize would require the client to have already resolved credentials and configured endpoints before knowing whether the agent supports this feature. In practice, the client needs to inspect the agent's capabilities first to decide its authorization strategy — for example, whether to route through a corporate proxy or use direct credentials. A dedicated method after initialization solves this chicken-and-egg problem and keeps capability negotiation separate from endpoint configuration.

Why not pass endpoint when selecting a model?

One option would be to pass the endpoint URL and credentials when the user selects a model (e.g., in session/new or a model selection method).

Many agents throw authentication errors before the model selection happens. This makes the flow unreliable.

Why not use environment variables or command-line arguments?

One option would be to pass endpoint configuration via environment variables (like OPENAI_API_BASE) or command-line arguments when starting the agent process.

This approach has significant drawbacks:

  • With multiple providers, the configuration becomes complex JSON that is awkward to pass via command-line arguments
  • Environment variables may be logged or visible to other processes, creating security concerns
  • Requires knowledge of agent-specific variable names or argument formats
  • No standardized way to confirm the agent accepted the configuration

What if the agent doesn't support custom endpoints?

If the agent doesn't support custom endpoints, llmEndpoints will be absent from agentCapabilities (or its protocols map will be empty). The client can detect this and choose an alternative authorization and configuration strategy, or proceed using the agent's default endpoints.

Revision history

  • 2026-03-07: Rename "provider" to "protocol" to reflect API compatibility level; make LlmProtocol an open string type with well-known values; resolve open questions on identifier standardization and model availability
  • 2026-03-04: Revised to use dedicated setLlmEndpoints method with capability advertisement
  • 2026-02-02: Initial draft - preliminary proposal to start discussion

@xtmq xtmq requested a review from a team as a code owner March 4, 2026 20:55
Member

@benbrandt benbrandt left a comment

Overall I really like this! Some questions but no major objections

* based on the provider being used.
* This configuration is per-process and should not be persisted to disk.
*/
endpoints: Record<LlmProvider, LlmEndpointConfig>;
Member

Cool, so the agent can provide multiple possible endpoints, and the client can set all of them
So for agents supporting models from multiple providers, they can set them all at once?

Contributor

yes, that was the idea — some agents use different providers for different subagents


```typescript
/** Well-known LLM provider identifiers */
type LlmProvider = "anthropic" | "openai" | "google" | "amazon" | "will be added later";
```
Member

Is the goal here to have "api compatibility"?
i.e. I see anthropic and therefore know that I can provide any anthropic compatible endpoint?

My question being: this could become a huge list, and I wonder if this should be driven more by endpoint compatibility?
though I guess depending on the provider you might need more or less auth requirements?

Author

Hello Ben, thank you for the blazingly fast review!

Is the goal here to have "api compatibility"?
i.e. I see anthropic and therefore know that I can provide any anthropic compatible endpoint

Yes

this could become a huge list, and I wonder if this should be driven more by endpoint compatibility?

Yes! This is the trickiest part of this RFD for me as well... We could also use a string type here instead of an enum, and provide common providers in documentation and comments instead. But it is important to have an agreement between an agent and a client on what they call the "anthropic" provider. Otherwise the whole idea of standardization breaks down.

Member

Right, I guess I was still thinking we have an enum that is a subproperty of the nested object...

I am still not sure if this is a good idea, to be honest... this is a tricky one. But I think this does need to be more than some generic input; we need a semantically meaningful definition of a provider in the protocol for sure.

Member

I guess there is also a different need of:

  • This is an openai compatible provider and
  • This is a provider where I expect specific models to be available, but behind a proxy or some other company specific need

Author

The second case can be achieved by passing custom metadata (a model list?) with the provider name. We can even make a protocol extension there to define an API once we understand the needs.

Author

For me, the main question right now is how to define provider types so that both the client and the server can be sure they’re referring to the same thing, while still allowing this list of providers to be easily extended in the future without breaking compatibility in both agents and clients.

Author

I suggested a possible solution


2. **Timing**: The `setLlmEndpoints` method MUST be called after `initialize` and before `session/new`. Calling it during an active session is undefined behavior. All subsequent sessions will use the newly configured endpoints.

3. **Per-process scope**: The endpoint configuration applies to the entire agent process lifetime. It should not be stored to disk or persist beyond the process.
Member

i.e. it is up to the client to pass this again the next time a connection is made?
I like that, just clarifying

Author

Yes


3. **Per-process scope**: The endpoint configuration applies to the entire agent process lifetime. It should not be stored to disk or persist beyond the process.

4. **Provider-based routing**: The agent should route LLM requests to the appropriate endpoint based on the provider. If the agent uses a provider not in the provided map, it uses its default endpoint for that provider.
Member

Should an agent ALWAYS have a default endpoint? (I think so, for simplicity + backwards compat)

Author

Yes, because a client may or may not provide a custom endpoint. This is optional and up to the client, so the agent should not rely on that information.


### How should model availability be handled?

When a custom endpoint is provided, it may only support a subset of models. For example, a self-hosted vLLM server might only have `llama-3-70b` available, while the agent normally advertises `claude-3-opus`, `gpt-4`, etc.
Member

I would think many agents would be able to generate a model list based on the new provider in most cases
For those that don't, they will likely show an error if you configured a provider without model support.


### How should provider identifiers be standardized?

We need to define a standard set of provider identifiers (e.g., `"anthropic"`, `"openai"`, `"google"`, `"amazon"`). Should this be:
Member

Definitely my main open question as well.

One option would be that the key in the providers is an arbitrary string and it has a subfield of which type of provider it is compatible with? Potentially more information could then be provided on additional requirements?

If we go this route though, it feels less like a capability and more like something that comes back on the top-level initialize response.

But this would allow for agents that have potentially multiple openai compatible providers from an api layer, but maybe not all should use the same endpoint (i.e. they support multiple providers, several of which are openai compatible, and you might want to be able to configure all of them)

This opens a lot of UX questions in terms of how these are chosen by the user... but might make things more flexible

Author

That's a great idea. The only downside I see here is the potential for overengineering. We still don't really know how agents and clients will want to use different providers for the same protocol, or whether they’ll want to at all.

Maybe it’s worth considering an approach where the setLlmEndpoints request could accept multiple URLs with different conditions instead.

But I would vote for simplicity right now.

Author

I have updated the PR to address the provider issues! I renamed LlmProvider to LlmProtocol and made it a string.

@cdxiaodong

nice sir!!!

@IceyLiu
Contributor

IceyLiu commented Mar 6, 2026

good PR, we do need it

```typescript
type LlmProtocol = string & {};

const LlmProtocols = {
```
Author

@benbrandt is it possible to provide such well-known constants in the SDK?
