Detect and log terminal/IDE host in telemetry #5315

Open
simonfaltum wants to merge 3 commits into main from simonfaltum/telemetry-host-detection
@simonfaltum
Member

@simonfaltum simonfaltum commented May 22, 2026

Why

Today we have no way to tell whether the CLI is being run from a raw terminal or from inside an IDE (VSCode, Cursor, IntelliJ, etc.). We want that signal so we can answer questions like:

  • How many of our users are on VSCode but not using the Copilot extension? (a real population we want to size)
  • What's the upper bound on how many users could be on VSCode Copilot? Since we at least know who's on VSCode, that gives us the ceiling.
  • Where should we prioritize IDE-specific UX work?

Changes

Adds env-var-based host detection in libs/cmdio/host.go. Surfaces the result two ways:

  • User agent: host/vscode, host/jetbrains, host/iterm, host/unknown, etc.
  • New Host field on the telemetry ExecutionContext.
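The two surfaces above could be wired up roughly as follows. This is a minimal sketch, not the PR's code: `executionContext`, `appendHost`, the JSON tag, and the `cli/0.0.0-dev` user-agent prefix are hypothetical names chosen for illustration. Only the `host/<value>` token format and the existence of a `Host` field come from the description.

```go
package main

import (
	"fmt"
	"strings"
)

// executionContext is a hypothetical stand-in for the telemetry
// ExecutionContext; only the new Host field is modeled here.
type executionContext struct {
	Host string `json:"host,omitempty"`
}

// appendHost adds a host/<value> token to an existing user-agent string.
// An empty host is reported as "unknown" (an assumption of this sketch).
func appendHost(userAgent, host string) string {
	if host == "" {
		host = "unknown"
	}
	return strings.TrimSpace(userAgent + " host/" + host)
}

func main() {
	host := "vscode"
	// The same enum value feeds both surfaces.
	fmt.Println(appendHost("cli/0.0.0-dev", host))
	fmt.Printf("%+v\n", executionContext{Host: host})
}
```

Feeding one enum value into both places keeps the user agent and the telemetry record consistent with each other.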

Scope is deliberately narrow. Only detections backed by direct observation or upstream docs are shipped:

  • vscode: covers vanilla VSCode and any fork (Cursor, Windsurf, code-server). Forks don't expose a stable, trustworthy discriminator in env, so we don't try to split them.
  • jetbrains, apple-terminal, iterm, warp, wezterm, ghostty
  • unknown for everything else (raw shells, terminals we don't recognize yet)
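A minimal sketch of what env-var-based detection along these lines might look like. The environment values matched here (`vscode`, `Apple_Terminal`, `iTerm.app`, `WarpTerminal`, `WezTerm`, `ghostty` for `TERM_PROGRAM`, and `JetBrains-JediTerm` for `TERMINAL_EMULATOR`) are commonly observed values and are assumptions of this sketch, not copied from `libs/cmdio/host.go`. The `detectHost` signature and the constant names other than `HostJetBrains` and `HostUnknown` are likewise hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// Host is the enum-style label reported in telemetry.
type Host string

const (
	HostVSCode        Host = "vscode"
	HostJetBrains     Host = "jetbrains"
	HostAppleTerminal Host = "apple-terminal"
	HostITerm         Host = "iterm"
	HostWarp          Host = "warp"
	HostWezTerm       Host = "wezterm"
	HostGhostty       Host = "ghostty"
	HostUnknown       Host = "unknown"
)

// termPrograms maps assumed TERM_PROGRAM values to hosts.
// VSCode forks typically report "vscode" as well, so they
// collapse into HostVSCode without extra logic.
var termPrograms = map[string]Host{
	"vscode":         HostVSCode,
	"Apple_Terminal": HostAppleTerminal,
	"iTerm.app":      HostITerm,
	"WarpTerminal":   HostWarp,
	"WezTerm":        HostWezTerm,
	"ghostty":        HostGhostty,
}

// detectHost takes the env lookup as a parameter so tests can
// supply a fake environment instead of mutating the real one.
func detectHost(getenv func(string) string) Host {
	if h, ok := termPrograms[getenv("TERM_PROGRAM")]; ok {
		return h
	}
	if getenv("TERMINAL_EMULATOR") == "JetBrains-JediTerm" {
		return HostJetBrains
	}
	return HostUnknown
}

func main() {
	fmt.Println(detectHost(os.Getenv))
}
```

Only the enum label leaves the function; the raw environment value is never returned, which matches the enum-only privacy stance of the PR.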

What's deliberately NOT in this PR

Detecting whether a user has the Copilot extension active, or whether Claude Code / Cursor Agent is driving the CLI, is a separate dimension. We want it to be independent of host so the analyses above stay clean: "VSCode users WITHOUT Copilot" should be a join across two dimensions, not a single enum value. Conflating them (e.g. emitting vscode-copilot as a single host value) would make that query awkward. Agent/extension detection will land as its own field once we've verified the actual env signals.
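To illustrate why two independent dimensions keep the query simple, here is a sketch that counts "VSCode users WITHOUT Copilot" from rows carrying both fields. The `event` type, the `Agent` field, and the `copilot` value are hypothetical: this PR only adds `Host`, and the agent/extension field does not exist yet.

```go
package main

import "fmt"

// event is a hypothetical telemetry row. Host comes from this PR;
// Agent is an assumed future field, empty when nothing is detected.
type event struct {
	User  string
	Host  string
	Agent string
}

// countHostWithoutAgent counts distinct users seen on the given host
// who never reported the given agent on that host.
func countHostWithoutAgent(events []event, host, agent string) int {
	onHost := map[string]bool{}
	withAgent := map[string]bool{}
	for _, e := range events {
		if e.Host != host {
			continue
		}
		onHost[e.User] = true
		if e.Agent == agent {
			withAgent[e.User] = true
		}
	}
	n := 0
	for u := range onHost {
		if !withAgent[u] {
			n++
		}
	}
	return n
}

func main() {
	events := []event{
		{User: "a", Host: "vscode", Agent: "copilot"},
		{User: "b", Host: "vscode"},
		{User: "c", Host: "iterm"},
		{User: "b", Host: "vscode"},
	}
	fmt.Println(countHostWithoutAgent(events, "vscode", "copilot"))
}
```

With a combined value such as `vscode-copilot`, the same question would require parsing the enum or enumerating every host/agent pairing.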

Privacy: enum-only values, no raw env values, no paths, no versions.

Test plan

  • Unit tests for `DetectHost` covering every shipped host
  • Unit tests for the user-agent hook
  • `./task checks` clean
  • `./task fmt-q lint-q` clean

Adds env-based host detection (TERM_PROGRAM, TERMINAL_EMULATOR, CURSOR_TRACE_ID,
__CFBundleIdentifier) returning an enum value (vscode, cursor, jetbrains,
terminal, etc.). The result is added to the user agent as host/<value> and
to the CLI telemetry ExecutionContext as a new Host field.

Includes a best-effort vscode-copilot sentinel that lights up when Copilot
agent env vars are seen alongside VSCode. The exact env vars Copilot sets
in agent-mode terminals are not stable yet; this is a coarse signal to be
refined once we see real telemetry.

Co-authored-by: Isaac

@github-actions
Contributor

Approval status: pending

/cmd/root/ - needs approval

Files: cmd/root/root.go, cmd/root/user_agent_host.go, cmd/root/user_agent_host_test.go
Suggested: @mihaimitrea-db
Also eligible: @renaudhartert-db, @hectorcast-db, @parthban-db, @tanmay-db, @Divyansh-db, @tejaskochar-db, @chrisst, @rauchy

/libs/cmdio/ - needs approval

Files: libs/cmdio/host.go, libs/cmdio/host_test.go
Suggested: @mihaimitrea-db
Also eligible: @renaudhartert-db, @hectorcast-db, @parthban-db, @tanmay-db, @Divyansh-db, @tejaskochar-db, @chrisst, @rauchy

/libs/telemetry/ - needs approval

Files: libs/telemetry/protos/databricks_cli_log.go
Suggested: @mihaimitrea-db
Also eligible: @renaudhartert-db, @hectorcast-db, @parthban-db, @tanmay-db, @Divyansh-db, @tejaskochar-db, @chrisst, @rauchy

Any maintainer (@andrewnester, @anton-107, @denik, @pietern, @shreyas-goenka, @renaudhartert-db) can approve all areas.
See OWNERS for ownership rules.

Collapse Cursor/Windsurf/code-server into a single 'vscode' label rather
than relying on speculative discriminators (ToDesktop IDs aren't stable
identifiers; the path-based check is substring-matching a string outside
our control). We can split forks back out later if/when we observe a real
stable signal.

Also drops unverified entries (Hyper, Tabby, Zed) that were guesses.

Co-authored-by: Isaac

Host and 'is Copilot/agent active' are two independent dimensions. Mixing
them into one enum value (vscode-copilot) makes it awkward to compute
populations like "VSCode users WITHOUT the Copilot extension". Drop the
sentinel from this PR; agent/extension detection should land as its own
dimension once we've verified the actual env signals.

Co-authored-by: Isaac
@eng-dev-ecosystem-bot
Collaborator

Commit: 2744ae9

Run: 26294539117

Comment thread libs/cmdio/host.go
return HostJetBrains
}

return HostUnknown
Contributor

Should we keep the (sanitized) value? Having categorization on the server side makes the logic forward compatible.
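One possible reading of "sanitized" in the question above, sketched under stated assumptions: the raw value is lowercased and forwarded only if it is short and made up entirely of a small allowlist of characters; anything else is dropped. The `sanitize` helper, the 32-character limit, and the allowlist are all hypothetical. Note that this would relax the PR's enum-only privacy stance, since an arbitrary (if constrained) env value would leave the machine.

```go
package main

import (
	"fmt"
	"strings"
)

// maxLen caps the forwarded value; 32 is an arbitrary choice for this sketch.
const maxLen = 32

// sanitize returns a lowercased copy of raw if it is non-empty, at most
// maxLen bytes, and contains only [a-z0-9._-]. Otherwise it returns "",
// so values that look like paths or free text are rejected outright
// rather than partially stripped.
func sanitize(raw string) string {
	s := strings.ToLower(raw)
	if s == "" || len(s) > maxLen {
		return ""
	}
	for _, r := range s {
		switch {
		case r >= 'a' && r <= 'z':
		case r >= '0' && r <= '9':
		case r == '.' || r == '-' || r == '_':
		default:
			return ""
		}
	}
	return s
}

func main() {
	for _, v := range []string{"iTerm.app", "WezTerm", "/usr/local/bin/term"} {
		fmt.Printf("%q -> %q\n", v, sanitize(v))
	}
}
```

Rejecting rather than stripping disallowed characters avoids forwarding the remains of a path such as `/usr/local/bin/term` with its slashes removed.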


3 participants