docs(rfd): Custom LLM endpoints #648
Conversation
benbrandt
left a comment
Overall I really like this! Some questions but no major objections
docs/rfds/custom-llm-endpoint.mdx
Outdated
>  * based on the provider being used.
>  * This configuration is per-process and should not be persisted to disk.
>  */
> endpoints: Record<LlmProvider, LlmEndpointConfig>;
Cool, so the agent can provide multiple possible endpoints, and the client can set all of them
So for agents supporting models from multiple providers, they can set them all at once?
Yes, that was the idea — some agents use different providers for different subagents.
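For illustration, a client configuring several providers in one call might build a map like this. This is only a sketch; the type shapes follow the RFD draft, and the URLs are made up:

```typescript
// Sketch of a client setting endpoints for two providers at once.
// LlmEndpointConfig mirrors the shape proposed in the RFD.
type LlmEndpointConfig = { url: string; headers?: Record<string, string> };

const endpoints: Record<string, LlmEndpointConfig> = {
  anthropic: { url: "https://proxy.example.com/anthropic/v1" },
  openai: { url: "https://proxy.example.com/openai/v1" },
};

// A single setLlmEndpoints call would carry the whole map.
```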
docs/rfds/custom-llm-endpoint.mdx
Outdated
> ```typescript
> /** Well-known LLM provider identifiers */
> type LlmProvider = "anthropic" | "openai" | "google" | "amazon" | "will be added later";
> ```
Is the goal here to have "api compatibility"?
i.e. I see anthropic and therefore know that I can provide any anthropic compatible endpoint?
My question being: this could become a huge list, and I wonder if this should be driven more by endpoint compatibility?
Though I guess, depending on the provider, you might have more or fewer auth requirements?
Hello Ben, thank you for the blazingly fast review!
Is the goal here to have "api compatibility"?
i.e. I see anthropic and therefore know that I can provide any anthropic compatible endpoint
Yes
this could become a huge list, and I wonder if this should be driven more by endpoint compatibility?
Yes! This is the trickiest part of this RFD for me as well... We could also use a plain string type here instead of an enum, and document the common providers in the documentation and comments instead. But it is important that the agent and the client agree on what they call the "anthropic" provider. Otherwise the whole idea of standardization will be broken.
Right, I guess I was still thinking we'd have an enum that is a sub-property of the nested object...
I'm still not sure if this is a good idea, to be honest... this is a tricky one. But I think this does need to be more than some generic input; we need a semantically meaningful definition of a provider in the protocol for sure.
I guess there are also two different needs:
- This is an openai compatible provider and
- This is a provider where I expect specific models to be available, but behind a proxy or some other company specific need
The second case can be achieved by passing custom metadata (a model list?) along with the provider name. We can even define a protocol extension there to specify the API once we understand the needs.
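For example, the `_meta` field that the RFD's endpoint config already allows could carry such provider-specific metadata. The `models` key below is hypothetical, just to illustrate the idea:

```json
{
  "endpoints": {
    "anthropic": {
      "url": "https://internal-proxy.example.com/v1",
      "_meta": {
        "models": ["claude-3-opus", "claude-3-sonnet"]
      }
    }
  }
}
```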
For me, the main question right now is how to define provider types so that both the client and the server can be sure they’re referring to the same thing, while still allowing this list of providers to be easily extended in the future without breaking compatibility in both agents and clients.
> 2. **Timing**: The `setLlmEndpoints` method MUST be called after `initialize` and before `session/new`. Calling it during an active session is undefined behavior. All subsequent sessions will use the newly configured endpoints.
> 3. **Per-process scope**: The endpoint configuration applies to the entire agent process lifetime. It should not be stored to disk or persist beyond the process.
i.e. it is up to the client to pass this again the next time a connection is made?
I like that, just clarifying
docs/rfds/custom-llm-endpoint.mdx
Outdated
> 3. **Per-process scope**: The endpoint configuration applies to the entire agent process lifetime. It should not be stored to disk or persist beyond the process.
> 4. **Provider-based routing**: The agent should route LLM requests to the appropriate endpoint based on the provider. If the agent uses a provider not in the provided map, it uses its default endpoint for that provider.
Should an agent ALWAYS have a default endpoint? (I think so, for simplicity + backwards compat)
Yes, because a client may or may not provide a custom endpoint. This is optional and up to the client, so the agent should not rely on that information.
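The fallback rule being discussed can be sketched as follows. All names here are illustrative, not part of the protocol; the default URLs are just placeholders:

```typescript
// Sketch of provider-based routing with a default fallback.
type LlmEndpointConfig = { url: string; headers?: Record<string, string> };

// The agent's built-in defaults, one per provider it knows.
const defaults: Record<string, LlmEndpointConfig> = {
  anthropic: { url: "https://api.anthropic.com/v1" },
  openai: { url: "https://api.openai.com/v1" },
};

// Custom endpoints supplied by the client (possibly empty).
const custom: Record<string, LlmEndpointConfig> = {
  anthropic: { url: "https://llm-gateway.corp.example.com/anthropic/v1" },
};

// The agent prefers the client-supplied endpoint, else its default.
function resolveEndpoint(provider: string): LlmEndpointConfig {
  return custom[provider] ?? defaults[provider];
}
```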
docs/rfds/custom-llm-endpoint.mdx
Outdated
> ### How should model availability be handled?
> When a custom endpoint is provided, it may only support a subset of models. For example, a self-hosted vLLM server might only have `llama-3-70b` available, while the agent normally advertises `claude-3-opus`, `gpt-4`, etc.
I would think many agents would be able to generate a model list based on the new provider in most cases
For those that don't, they will likely show an error if you configured a provider without model support.
docs/rfds/custom-llm-endpoint.mdx
Outdated
> ### How should provider identifiers be standardized?
> We need to define a standard set of provider identifiers (e.g., `"anthropic"`, `"openai"`, `"google"`, `"amazon"`). Should this be:
Definitely my main open question as well.
One option would be that the key in the providers is an arbitrary string and it has a subfield of which type of provider it is compatible with? Potentially more information could then be provided on additional requirements?
If we go this route though, it feels less like a capability and more like something that comes back on the top-level initialize response.
But this would allow for agents that have potentially multiple openai compatible providers from an api layer, but maybe not all should use the same endpoint (i.e. they support multiple providers, several of which are openai compatible, and you might want to be able to configure all of them)
This opens a lot of UX questions in terms of how these are chosen by the user... but might make things more flexible
That's a great idea. The only downside I see here is the potential for overengineering. We still don't really know how agents and clients will want to use different providers for the same protocol, or whether they’ll want to at all.
Maybe it’s worth considering an approach where the setLlmEndpoints request could accept multiple URLs with different conditions instead.
But I would vote for simplicity right now.
I have updated the PR to address the provider issues! I renamed LlmProvider to LlmProtocol and made it a string.
nice sir!!!

good PR, we do need it
…custom LLM endpoint RFD
> ```typescript
> type LlmProtocol = string & {};
>
> const LlmProtocols = {
> ```
@benbrandt is it possible to provide such well-known constants in the SDK?
title: "Custom LLM Endpoint Configuration"
## Elevator pitch
Add the ability for clients to pass custom LLM endpoint URLs and authentication credentials to agents via a dedicated `setLlmEndpoints` method, with support for multiple LLM protocols. This allows clients to route LLM requests through their own infrastructure (proxies, gateways, or self-hosted models) without agents needing to know about this configuration in advance.

## Status quo
Currently, agents are configured with their own LLM endpoints and credentials, typically through environment variables or configuration files. This creates problems for:
## Shiny future
Clients will be able to:
## Implementation details and plan
### Intended flow
The design uses a two-step approach: capability discovery during initialization, followed by endpoint configuration via a dedicated method. This enables the following flow:
```mermaid
sequenceDiagram
    participant Client
    participant Agent
    Client->>Agent: initialize
    Note right of Agent: Agent reports capabilities,<br/>including llmEndpoints support
    Agent-->>Client: initialize response<br/>(agentCapabilities.llmEndpoints.protocols)
    Note over Client: Client sees supported protocols.<br/>Performs configuration / authorization<br/>based on this knowledge.
    Client->>Agent: setLlmEndpoints
    Agent-->>Client: setLlmEndpoints response
    Note over Client,Agent: Ready for session setup
    Client->>Agent: session/new
```

1. The client calls `initialize`. The agent responds with its capabilities, including an `llmEndpoints` object listing supported LLM protocols.
2. Based on `llmEndpoints.protocols`, the client can perform authorization, resolve credentials, or configure endpoints for those specific protocols. If `llmEndpoints` is absent or the protocols list is empty, the client falls back to a different authorization and configuration strategy.
3. The client calls `setLlmEndpoints` with endpoint configurations for the supported protocols.

### Capability advertisement
The agent advertises support for custom LLM endpoints and lists its supported LLM protocols via a new `llmEndpoints` capability in `agentCapabilities`:

Initialize Response example:
```json
{
  "jsonrpc": "2.0",
  "id": 0,
  "result": {
    "protocolVersion": 1,
    "agentInfo": { "name": "MyAgent", "version": "2.0.0" },
    "agentCapabilities": {
      "llmEndpoints": {
        "protocols": { "anthropic": {}, "openai": {} }
      },
      "sessionCapabilities": {}
    }
  }
}
```

### `setLlmEndpoints` method

A dedicated method that can be called after initialization but before session creation.
### JSON Schema Additions
```json
{
  "$defs": {
    "LlmEndpointConfig": {
      "description": "Configuration for a custom LLM endpoint.",
      "properties": {
        "url": {
          "type": "string",
          "description": "Base URL for LLM API requests."
        },
        "headers": {
          "type": ["object", "null"],
          "description": "Additional HTTP headers to include in LLM API requests.",
          "additionalProperties": { "type": "string" }
        },
        "_meta": {
          "additionalProperties": true,
          "type": ["object", "null"]
        }
      },
      "required": ["url"],
      "type": "object"
    },
    "LlmEndpoints": {
      "description": "Map of LLM protocol identifiers to endpoint configurations. This configuration is per-process and should not be persisted to disk.",
      "type": "object",
      "additionalProperties": { "$ref": "#/$defs/LlmEndpointConfig" }
    }
  }
}
```

### Example Exchange
`setLlmEndpoints` Request:
```json
{
  "jsonrpc": "2.0",
  "id": 2,
  "method": "setLlmEndpoints",
  "params": {
    "endpoints": {
      "anthropic": {
        "url": "https://llm-gateway.corp.example.com/anthropic/v1",
        "headers": {
          "Authorization": "Bearer anthropic-token-abc123",
          "X-Request-Source": "my-ide"
        }
      },
      "openai": {
        "url": "https://llm-gateway.corp.example.com/openai/v1",
        "headers": {
          "Authorization": "Bearer openai-token-xyz789"
        }
      }
    }
  }
}
```

`setLlmEndpoints` Response:
```json
{ "jsonrpc": "2.0", "id": 2, "result": {} }
```

### Behavior
1. **Capability discovery**: The agent MUST list its supported protocols in `agentCapabilities.llmEndpoints.protocols` if it supports the `setLlmEndpoints` method. Clients SHOULD only send endpoint configurations for protocols listed there.
2. **Timing**: The `setLlmEndpoints` method MUST be called after `initialize` and before `session/new`. Calling this MAY NOT affect currently running sessions. Agents MUST apply these settings to any sessions created or loaded after this has been called.
3. **Per-process scope**: The endpoint configuration applies to the entire agent process lifetime. It should not be stored to disk or persist beyond the process.
4. **Protocol-based routing**: The agent should route LLM requests to the appropriate endpoint based on the LLM protocol. If the agent uses a protocol not in the provided map, it uses its default endpoint for that protocol.
5. **Agent discretion**: If an agent cannot support custom endpoints (e.g., it uses a proprietary API), it should omit `llmEndpoints` from capabilities or return an empty protocols map.

## Open questions
### How should protocol identifiers be standardized?

**Resolved.** Protocol identifiers are plain strings with a set of well-known values (`"anthropic"`, `"openai"`, `"google"`, `"amazon"`). Custom identifiers are allowed for regional or emerging LLM APIs not covered by the well-known set. Agents advertise the protocol identifiers they understand; clients match against them.

### How should model availability be handled?

**Resolved.** Model availability is not a concern of this RFD. Most agents can discover available models from the configured endpoint at runtime. Agents that cannot should report an error if a user selects a model not supported by the custom endpoint.
## Frequently asked questions
### Why is `LlmProtocol` a plain string instead of a fixed enum?

The protocol wire format uses plain strings to keep the specification stable and avoid blocking adoption of new LLM APIs. A fixed enum would require a spec update every time a new protocol emerges.
Instead, well-known protocol identifiers are provided by SDKs as convenience constants:
TypeScript — an open string type with predefined values:
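A sketch of what this could look like, based on the `LlmProtocol` and `LlmProtocols` names visible in the diff; the exact constant names are assumptions:

```typescript
// Open string type: accepts any string, but editors still
// autocomplete the well-known values below.
type LlmProtocol = string & {};

const LlmProtocols = {
  Anthropic: "anthropic",
  OpenAI: "openai",
  Google: "google",
  Amazon: "amazon",
} as const;

// Both well-known constants and custom identifiers type-check.
const wellKnown: LlmProtocol = LlmProtocols.Anthropic;
const regional: LlmProtocol = "my-regional-llm-api";
```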
Kotlin — a string wrapper with predefined companion objects:
This way SDK users get discoverability and autocomplete for well-known protocols while remaining free to use any custom string for protocols not yet in the predefined set.
### Why not pass endpoints in the `initialize` request?

Passing endpoints directly in `initialize` would require the client to have already resolved credentials and configured endpoints before knowing whether the agent supports this feature. In practice, the client needs to inspect the agent's capabilities first to decide its authorization strategy — for example, whether to route through a corporate proxy or use direct credentials. A dedicated method after initialization solves this chicken-and-egg problem and keeps capability negotiation separate from endpoint configuration.

### Why not pass the endpoint when selecting a model?
One option would be to pass the endpoint URL and credentials when the user selects a model (e.g., in `session/new` or a model selection method).

Many agents throw authentication errors before the model selection happens. This makes the flow unreliable.
### Why not use environment variables or command-line arguments?
One option would be to pass endpoint configuration via environment variables (like `OPENAI_API_BASE`) or command-line arguments when starting the agent process. This approach has significant drawbacks:
### What if the agent doesn't support custom endpoints?
If the agent doesn't support custom endpoints, `llmEndpoints` will be absent from `agentCapabilities` (or its `protocols` map will be empty). The client can detect this and choose an alternative authorization and configuration strategy, or proceed using the agent's default endpoints.

## Revision history
- Made `LlmProtocol` an open string type with well-known values; resolved open questions on identifier standardization and model availability
- Added `setLlmEndpoints` method with capability advertisement