[XTA-15079] Add SPOG ?o= routing support and fix final-flush auth on close#391
[XTA-15079] Add SPOG ?o= routing support and fix final-flush auth on close#391samikshya-db wants to merge 5 commits into
Conversation
…flag SPOG (Single Panel of Glass) replaces workspace-specific hostnames with account-level vanity URLs. When httpPath carries `?o=<workspaceId>`, endpoints that don't include the workspace in their URL path (telemetry, feature flags) need the workspace conveyed via the `x-databricks-org-id` header instead. Changes: - Parse `?o=<digits>` out of httpPath in DBSQLClient.connect() and stash the org-id as `x-databricks-org-id` on a new `ClientConfig.customHeaders` field. A user-supplied `customHeaders` entry (case-insensitive) takes precedence. - DatabricksTelemetryExporter spreads `config.customHeaders` into the outgoing POST headers. Auth headers still win on collision. - FeatureFlagCache does the same for the feature-flag GET. Not applicable to this driver (vs JDBC port in databricks/databricks-jdbc#1316): - httpPath property parser fix — Node.js passes `options.path` through unmodified. - Warehouse ID regex fix for SEA — driver uses Thrift only. - DBFS Volume header injection — driver exposes no Volume API. OAuth/OIDC token requests deliberately do NOT receive customHeaders. Co-authored-by: Isaac
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
…lush `TelemetryClientProvider.releaseClient` was calling `unregisterContext` before `client.close()`. On the last refcount, that emptied the FIFO of `(context, authProvider)` pairs before the final flush ran, so the exporter's `getAuthProvider()` walk returned undefined and the batch was dropped with "Telemetry: Authorization header missing — metrics will be dropped". Defer `unregisterContext` until after `close()` on the last refcount; multi-refcount path is unchanged (immediate unregister so surviving consumers don't reach into a closing context). Verified end-to-end against a real SPOG host: the final flush now exports its 3 metrics (connection.open, statement.start, statement.complete) instead of warning and dropping them. Co-authored-by: Isaac
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
| const msg = error instanceof Error ? error.message : String(error); | ||
| logger.log(LogLevel.debug, `Error releasing TelemetryClient: ${msg}`); | ||
| } | ||
| // Remove from map BEFORE awaiting close so a concurrent |
There was a problem hiding this comment.
Have we tested this for auth and telemetry?
There was a problem hiding this comment.
yes, added test details in description
Tighten the previous commit to the minimal diff: guard the existing unregisterContext call instead of restructuring the function. Restores the unchanged comment on TelemetryClient.unregisterContext. Co-authored-by: Isaac
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
| * `options.customHeaders` (case-insensitively keyed) wins over the parsed | ||
| * value. | ||
| */ | ||
| private buildCustomHeaders(options: ConnectionOptions): Record<string, string> | undefined { |
There was a problem hiding this comment.
[M1] buildCustomHeaders(options) reads this.httpPath, not options.path — misleading signature
The method takes options: ConnectionOptions but only reads options.customHeaders. The workspace ID is sourced from this.httpPath via this.extractWorkspaceId(). The unit test at tests/unit/DBSQLClient.test.ts:803-805 makes this concrete — it sets (client as any).httpPath separately from the path in the options literal.
This creates a hidden ordering dependency at the call site (line 588): this.httpPath = options.path must run before this.config.customHeaders = this.buildCustomHeaders(options). A future refactor that swaps those two lines silently breaks SPOG with no test failure — extractWorkspaceId() returns undefined, no header is injected.
Suggested — make it pure / static so the dependency is explicit:
private static buildCustomHeaders(
httpPath: string | undefined,
userHeaders: Record<string, string> | undefined,
): Record<string, string> | undefined {
const merged: Record<string, string> = { ...(userHeaders ?? {}) };
const hasOrgIdAlready = Object.keys(merged).some((k) => k.toLowerCase() === 'x-databricks-org-id');
if (!hasOrgIdAlready) {
const orgId = DBSQLClient.extractWorkspaceId(httpPath); // also static
if (orgId) merged['x-databricks-org-id'] = orgId;
}
return Object.keys(merged).length > 0 ? merged : undefined;
}Call site at line 588: this.config.customHeaders = DBSQLClient.buildCustomHeaders(options.path, options.customHeaders);
Bonus: the (client as any).httpPath = ... test workaround disappears.
This pull request and its description were written by Isaac.
msrathore-db
left a comment
There was a problem hiding this comment.
Please verify all purpose cluster url is working and also M2M is working
| if (orgId) { | ||
| merged['x-databricks-org-id'] = orgId; | ||
| } | ||
| } |
There was a problem hiding this comment.
[H2] No log signal when SPOG header is injected, dropped, or overridden — JDBC parity gap
buildCustomHeaders and extractWorkspaceId emit zero log statements at any level. When SPOG routing misbehaves in prod, an oncall engineer has nothing to grep for. JDBC at DatabricksConnectionContext.java:1349-1364 logs all three branches: success injection, user-supplied-wins, and non-numeric/malformed path. The PR description's "live verification" confirms one path worked once — but a future regression in extraction or a non-numeric ?o= on a customer host would be invisible.
Suggested:
private buildCustomHeaders(options: ConnectionOptions): Record<string, string> | undefined {
const merged: Record<string, string> = { ...(options.customHeaders ?? {}) };
const hasOrgIdAlready = Object.keys(merged).some((k) => k.toLowerCase() === 'x-databricks-org-id');
if (hasOrgIdAlready) {
this.logger.log(LogLevel.debug, 'SPOG: x-databricks-org-id already set by caller, not extracting from httpPath');
} else {
const orgId = this.extractWorkspaceId();
if (orgId) {
merged['x-databricks-org-id'] = orgId;
this.logger.log(LogLevel.debug, `SPOG: injecting x-databricks-org-id=${orgId} (extracted from ?o= in httpPath)`);
}
}
return Object.keys(merged).length > 0 ? merged : undefined;
}Also: at extractWorkspaceId (line 329), a non-numeric ?o=tenant_xyz is silently dropped — add a LogLevel.warn so customers with malformed paths get a signal.
This pull request and its description were written by Isaac.
…path form - buildCustomHeaders(httpPath, userHeaders) and extractWorkspaceId(httpPath): SPOG-routing inputs now in the signature instead of read off this.httpPath. extractWorkspaceId is static. Drops the (client as any).httpPath test workaround. Fixes the hidden ordering dependency flagged in PR review [M1]. - Log SPOG header injection (debug), caller-supplied override (debug), and non-numeric workspace ID (warn). Closes JDBC parity gap so oncall has a grep handle when SPOG routing misbehaves [H2]. - extractWorkspaceId now also matches the all-purpose cluster path form /o/<wsId>/<cluster-id> in addition to the warehouse query form ?o=<wsId>. Verified end-to-end against a real all-purpose cluster on a SPOG host. Co-authored-by: Isaac Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Auto-fix from `prettier --write` on the two files touched by the previous commit; no logic changes. Co-authored-by: Isaac Signed-off-by: samikshya-chand_data <samikshya.chand@databricks.com>
|
Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase ( |
Summary
Ports the SPOG (Single Panel of Glass) routing fix from
databricks/databricks-jdbc#1316
to the Node.js driver, and fixes a related close-ordering bug uncovered
while validating the port against a real SPOG host.
Change 1: SPOG header injection
SPOG replaces workspace-specific hostnames with account-level vanity URLs.
When
httpPathcarries?o=<workspaceId>, endpoints that don't includethe workspace in their URL path need the workspace conveyed via the
x-databricks-org-idheader instead.?o=<digits>out ofhttpPathinDBSQLClient.connect()andstash the org-id as
x-databricks-org-idon a newClientConfig.customHeadersfield. A user-suppliedcustomHeadersentry (case-insensitively keyed) takes precedence.
DatabricksTelemetryExporterspreadsconfig.customHeadersinto theoutgoing POST headers; auth headers still win on collision.
FeatureFlagCachedoes the same for the feature-flag GET.ConnectionOptions.customHeadersexposed so callers can also injecttheir own out-of-band headers.
OAuth / OIDC token requests use
openid-client's private HTTP transportand never see
customHeaders, matching JDBC's exemption.Not needed vs JDBC: this driver passes
options.paththroughunmodified (no split-on-
=parser), uses Thrift only (no SEA JSON bodywarehouse-ID extraction), and exposes no Volume API surface.
Change 2: Close-ordering fix
While verifying Change 1 against a real SPOG warehouse, the final
client.close()flush emitted:Root cause:
TelemetryClientProvider.releaseClientcalledunregisterContextbeforeclient.close(). On the last refcount, thatemptied the FIFO of
(context, authProvider)pairs before the finalflush, so the exporter's
getAuthProvider()walk returned undefined andthe batch was dropped.
Fix: skip the
unregisteron the last release so the FIFO snapshotsurvives through
close()'s final flush. Multi-refcount path isunchanged. No leak — the TelemetryClient is unreachable as soon as
releaseClientreturns, and the retained(context, authProvider)pair dies with it.
Live verification
Ran against
dogfood-spog.staging.azuredatabricks.netwarehousead6f9b54e6593746?o=7064161269814046:Test plan
Unit tests added (all passing locally):
buildCustomHeadersparses?o=intox-databricks-org-id?o=is absent and no user-suppliedcustomHeaderscustomHeadersis preserved alongside parsed org-idx-databricks-org-id(any case) wins over?o=o=<value>does not inject a headerDBSQLClient.connect()populatesconfig.customHeaders['x-databricks-org-id']from the pathDatabricksTelemetryExporterattachesconfig.customHeaderstothe POST headers
customHeaderson key collisionconfig.customHeadersis emptyFeatureFlagCache.fetchFeatureFlagattachescustomHeaderstothe GET headers
releaseClientkeeps the context registered throughclose()onlast refcount
releaseClientdrops the context immediately when otherrefcounts remain
This pull request and its description were written by Isaac.