Skip to content

[XTA-15079] Add SPOG ?o= routing support and fix final-flush auth on close#391

Open
samikshya-db wants to merge 5 commits into
mainfrom
spog-support
Open

[XTA-15079] Add SPOG ?o= routing support and fix final-flush auth on close#391
samikshya-db wants to merge 5 commits into
mainfrom
spog-support

Conversation

@samikshya-db
Copy link
Copy Markdown
Collaborator

@samikshya-db samikshya-db commented May 23, 2026

Summary

Ports the SPOG (Single Panel of Glass) routing fix from
databricks/databricks-jdbc#1316
to the Node.js driver, and fixes a related close-ordering bug uncovered
while validating the port against a real SPOG host.

Change 1: SPOG header injection

SPOG replaces workspace-specific hostnames with account-level vanity URLs.
When httpPath carries ?o=<workspaceId>, endpoints that don't include
the workspace in their URL path need the workspace conveyed via the
x-databricks-org-id header instead.

  • Parse ?o=<digits> out of httpPath in DBSQLClient.connect() and
    stash the org-id as x-databricks-org-id on a new
    ClientConfig.customHeaders field. A user-supplied customHeaders
    entry (case-insensitively keyed) takes precedence.
  • DatabricksTelemetryExporter spreads config.customHeaders into the
    outgoing POST headers; auth headers still win on collision.
  • FeatureFlagCache does the same for the feature-flag GET.
  • ConnectionOptions.customHeaders exposed so callers can also inject
    their own out-of-band headers.

OAuth / OIDC token requests use openid-client's private HTTP transport
and never see customHeaders, matching JDBC's exemption.

Not needed vs JDBC: this driver passes options.path through
unmodified (no split-on-= parser), uses Thrift only (no SEA JSON body
warehouse-ID extraction), and exposes no Volume API surface.

Change 2: Close-ordering fix

While verifying Change 1 against a real SPOG warehouse, the final
client.close() flush emitted:

[warn] Telemetry: Authorization header missing — metrics will be dropped
[debug] Telemetry export auth failure: Telemetry export: missing Authorization header

Root cause: TelemetryClientProvider.releaseClient called
unregisterContext before client.close(). On the last refcount, that
emptied the FIFO of (context, authProvider) pairs before the final
flush, so the exporter's getAuthProvider() walk returned undefined and
the batch was dropped.

Fix: skip the unregister on the last release so the FIFO snapshot
survives through close()'s final flush. Multi-refcount path is
unchanged. No leak — the TelemetryClient is unreachable as soon as
releaseClient returns, and the retained (context, authProvider)
pair dies with it.

Live verification

Ran against dogfood-spog.staging.azuredatabricks.net warehouse
ad6f9b54e6593746?o=7064161269814046:

customHeaders={"x-databricks-org-id":"7064161269814046"}     ← SPOG header injected
Feature flag enableTelemetryForNodeJs: true                  ← header propagated via SPOG
Result: [{"v":1}]                                            ← SELECT 1 worked
Successfully exported 3 telemetry metrics                    ← final flush no longer drops

Test plan

Unit tests added (all passing locally):

  • buildCustomHeaders parses ?o= into x-databricks-org-id
  • No header when ?o= is absent and no user-supplied customHeaders
  • User-supplied customHeaders is preserved alongside parsed org-id
  • User-supplied x-databricks-org-id (any case) wins over ?o=
  • Non-numeric o=<value> does not inject a header
  • End-to-end DBSQLClient.connect() populates
    config.customHeaders['x-databricks-org-id'] from the path
  • DatabricksTelemetryExporter attaches config.customHeaders to
    the POST headers
  • Auth headers still override customHeaders on key collision
  • No header is attached when config.customHeaders is empty
  • FeatureFlagCache.fetchFeatureFlag attaches customHeaders to
    the GET headers
  • releaseClient keeps the context registered through close() on
    last refcount
  • releaseClient drops the context immediately when other
    refcounts remain

This pull request and its description were written by Isaac.

…flag

SPOG (Single Panel of Glass) replaces workspace-specific hostnames with
account-level vanity URLs. When httpPath carries `?o=<workspaceId>`,
endpoints that don't include the workspace in their URL path (telemetry,
feature flags) need the workspace conveyed via the
`x-databricks-org-id` header instead.

Changes:

- Parse `?o=<digits>` out of httpPath in DBSQLClient.connect() and stash
  the org-id as `x-databricks-org-id` on a new `ClientConfig.customHeaders`
  field. A user-supplied `customHeaders` entry (case-insensitive) takes
  precedence.
- DatabricksTelemetryExporter spreads `config.customHeaders` into the
  outgoing POST headers. Auth headers still win on collision.
- FeatureFlagCache does the same for the feature-flag GET.

Not applicable to this driver (vs JDBC port in
databricks/databricks-jdbc#1316):

- httpPath property parser fix — Node.js passes `options.path` through
  unmodified.
- Warehouse ID regex fix for SEA — driver uses Thrift only.
- DBFS Volume header injection — driver exposes no Volume API.

OAuth/OIDC token requests deliberately do NOT receive customHeaders.

Co-authored-by: Isaac
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@samikshya-db samikshya-db changed the title Add SPOG support for ?o= routing in httpPath [XTA-15079] Add SPOG support for ?o= routing in httpPath May 23, 2026
…lush

`TelemetryClientProvider.releaseClient` was calling `unregisterContext`
before `client.close()`. On the last refcount, that emptied the FIFO of
`(context, authProvider)` pairs before the final flush ran, so the
exporter's `getAuthProvider()` walk returned undefined and the batch was
dropped with "Telemetry: Authorization header missing — metrics will be
dropped".

Defer `unregisterContext` until after `close()` on the last refcount;
multi-refcount path is unchanged (immediate unregister so surviving
consumers don't reach into a closing context).

Verified end-to-end against a real SPOG host: the final flush now
exports its 3 metrics (connection.open, statement.start,
statement.complete) instead of warning and dropping them.

Co-authored-by: Isaac
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

const msg = error instanceof Error ? error.message : String(error);
logger.log(LogLevel.debug, `Error releasing TelemetryClient: ${msg}`);
}
// Remove from map BEFORE awaiting close so a concurrent
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have we tested this for auth and telemetry?

Copy link
Copy Markdown
Collaborator Author

@samikshya-db samikshya-db May 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, added test details in description

Tighten the previous commit to the minimal diff: guard the existing
unregisterContext call instead of restructuring the function. Restores
the unchanged comment on TelemetryClient.unregisterContext.

Co-authored-by: Isaac
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

@samikshya-db samikshya-db changed the title [XTA-15079] Add SPOG support for ?o= routing in httpPath Add SPOG ?o= routing support and fix final-flush auth on close May 24, 2026
@samikshya-db samikshya-db changed the title Add SPOG ?o= routing support and fix final-flush auth on close [XTA-15079] Add SPOG ?o= routing support and fix final-flush auth on close May 24, 2026
Comment thread lib/DBSQLClient.ts Outdated
* `options.customHeaders` (case-insensitively keyed) wins over the parsed
* value.
*/
private buildCustomHeaders(options: ConnectionOptions): Record<string, string> | undefined {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[M1] buildCustomHeaders(options) reads this.httpPath, not options.path — misleading signature

The method takes options: ConnectionOptions but only reads options.customHeaders. The workspace ID is sourced from this.httpPath via this.extractWorkspaceId(). The unit test at tests/unit/DBSQLClient.test.ts:803-805 makes this concrete — it sets (client as any).httpPath separately from the path in the options literal.

This creates a hidden ordering dependency at the call site (line 588): this.httpPath = options.path must run before this.config.customHeaders = this.buildCustomHeaders(options). A future refactor that swaps those two lines silently breaks SPOG with no test failure — extractWorkspaceId() returns undefined, no header is injected.

Suggested — make it pure / static so the dependency is explicit:

private static buildCustomHeaders(
  httpPath: string | undefined,
  userHeaders: Record<string, string> | undefined,
): Record<string, string> | undefined {
  const merged: Record<string, string> = { ...(userHeaders ?? {}) };
  const hasOrgIdAlready = Object.keys(merged).some((k) => k.toLowerCase() === 'x-databricks-org-id');
  if (!hasOrgIdAlready) {
    const orgId = DBSQLClient.extractWorkspaceId(httpPath); // also static
    if (orgId) merged['x-databricks-org-id'] = orgId;
  }
  return Object.keys(merged).length > 0 ? merged : undefined;
}

Call site at line 588: this.config.customHeaders = DBSQLClient.buildCustomHeaders(options.path, options.customHeaders);

Bonus: the (client as any).httpPath = ... test workaround disappears.

This pull request and its description were written by Isaac.

Copy link
Copy Markdown
Contributor

@msrathore-db msrathore-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please verify all purpose cluster url is working and also M2M is working

Comment thread lib/DBSQLClient.ts
if (orgId) {
merged['x-databricks-org-id'] = orgId;
}
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[H2] No log signal when SPOG header is injected, dropped, or overridden — JDBC parity gap

buildCustomHeaders and extractWorkspaceId emit zero log statements at any level. When SPOG routing misbehaves in prod, an oncall engineer has nothing to grep for. JDBC at DatabricksConnectionContext.java:1349-1364 logs all three branches: success injection, user-supplied-wins, and non-numeric/malformed path. The PR description's "live verification" confirms one path worked once — but a future regression in extraction or a non-numeric ?o= on a customer host would be invisible.

Suggested:

private buildCustomHeaders(options: ConnectionOptions): Record<string, string> | undefined {
  const merged: Record<string, string> = { ...(options.customHeaders ?? {}) };
  const hasOrgIdAlready = Object.keys(merged).some((k) => k.toLowerCase() === 'x-databricks-org-id');
  if (hasOrgIdAlready) {
    this.logger.log(LogLevel.debug, 'SPOG: x-databricks-org-id already set by caller, not extracting from httpPath');
  } else {
    const orgId = this.extractWorkspaceId();
    if (orgId) {
      merged['x-databricks-org-id'] = orgId;
      this.logger.log(LogLevel.debug, `SPOG: injecting x-databricks-org-id=${orgId} (extracted from ?o= in httpPath)`);
    }
  }
  return Object.keys(merged).length > 0 ? merged : undefined;
}

Also: at extractWorkspaceId (line 329), a non-numeric ?o=tenant_xyz is silently dropped — add a LogLevel.warn so customers with malformed paths get a signal.

This pull request and its description were written by Isaac.

…path form

- buildCustomHeaders(httpPath, userHeaders) and extractWorkspaceId(httpPath):
  SPOG-routing inputs now in the signature instead of read off this.httpPath.
  extractWorkspaceId is static. Drops the (client as any).httpPath test
  workaround. Fixes the hidden ordering dependency flagged in PR review [M1].

- Log SPOG header injection (debug), caller-supplied override (debug), and
  non-numeric workspace ID (warn). Closes JDBC parity gap so oncall has a
  grep handle when SPOG routing misbehaves [H2].

- extractWorkspaceId now also matches the all-purpose cluster path form
  /o/<wsId>/<cluster-id> in addition to the warehouse query form ?o=<wsId>.
  Verified end-to-end against a real all-purpose cluster on a SPOG host.

Co-authored-by: Isaac
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Auto-fix from `prettier --write` on the two files touched by the
previous commit; no logic changes.

Co-authored-by: Isaac
Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
@samikshya-db samikshya-db deployed to azure-prod May 25, 2026 15:30 — with GitHub Actions Active
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants