Skip to content

[SEA-NodeJS] (1/8) Backend abstraction — IBackend / ISessionBackend / IOperationBackend#378

Draft
msrathore-db wants to merge 3 commits into
mainfrom
msrathore/sea-abstraction
Draft

[SEA-NodeJS] (1/8) Backend abstraction — IBackend / ISessionBackend / IOperationBackend#378
msrathore-db wants to merge 3 commits into
mainfrom
msrathore/sea-abstraction

Conversation

@msrathore-db
Copy link
Copy Markdown
Contributor

@msrathore-db msrathore-db commented May 16, 2026

Stack

Linear stack of 8 PRs landing the M0 + M1 Phase 1 SEA NodeJS work. Merge in order from base ↑ to tip. The tip branch (msrathore/sea-auth-u2m, PR #383) is the single snapshot containing everything in flight — point your test or benchmark harness at it for an end-to-end check.

Pos PR Branch Scope
1/8 #378 sea-abstraction IBackend / ISessionBackend / IOperationBackend interfaces
2/8 #380 sea-napi-binding TS loader + build script for the kernel-provided .node artifact
3/8 #377 sea-errors-logging Kernel ErrorCode → JS error-class mapping (M0 minimum)
4/8 #379 sea-auth PAT auth via useSEA: true
5/8 #382 sea-execution executeStatement + openSession (sessionConfig, initialCatalog/Schema)
6/8 #381 sea-results CloudFetch + Inline Arrow result fetching
7/8 #384 sea-operation cancel / close / finished lifecycle + INTERVAL parity + napi-relocation acceptance (absorbed sea-integration content)
8/8 #383 sea-auth-u2m ← TIP M1 Phase 1 OAuth M2M + U2M (5 review rounds, ZERO HIGH at close)

Companion kernel stack (databricks/databricks-sql-kernel): 8 PRs — root #26 (async-public-api) → #27#25#29#28#30#24#23 (tip).

Policy: new PRs always stack on the current tip. No sibling/parallel topology. No force-pushes on existing PRs unless absolutely necessary; if a PR's content is wrong, add a fix-up commit on top of the stack tip rather than rewriting history.


This PR is position 1/8 (ROOT of the stack).

Summary

Foundational backend abstraction — introduces IBackend / ISessionBackend / IOperationBackend interfaces and refactors the existing thrift driver to plug in behind them.

Size note (1839 LOC, over the 800 cap) — foundational refactor

+1064/-775 ratio means ~775 lines are pre-existing code being refactored to plug into the new interfaces (renames, restructuring, no behavior change). Actual review burden is lower than the raw LOC suggests — most of the diff is mechanical file-level refactoring rather than new logic.

Cannot be cleanly split: interfaces + at least one implementation (thrift) must land together to compile.

Test plan

  • ✅ Existing thrift backend wraps cleanly behind IBackend interface
  • ✅ No behavior change on the thrift path
  • ✅ Stub SeaBackend compiles + exposes the same interface

Draft until you give the go for review.

Tracking

  • PECOBLR-2666 — NodeJS SEA integration (parent epic — backend abstraction scaffolding)

Co-authored-by: Isaac

…kend

Refactors DBSQLClient/Session/Operation to dispatch through three
backend interfaces. ThriftBackend (lib/thrift-backend/) contains the
relocated existing thrift logic. SeaBackend (lib/sea/) is a stub for
M0; the sea-napi-binding feature wires the real impl.

Public surface (lib/index.ts) unchanged.
No new dependencies. All existing tests pass.

Files:
- lib/contracts/IBackend.ts (new)
- lib/contracts/ISessionBackend.ts (new)
- lib/contracts/IOperationBackend.ts (new)
- lib/contracts/IDBSQLClient.ts (adds useSEA?: boolean to ConnectionOptions)
- lib/thrift-backend/ThriftBackend.ts (new)
- lib/thrift-backend/ThriftSessionBackend.ts (new)
- lib/thrift-backend/ThriftOperationBackend.ts (new)
- lib/sea/SeaBackend.ts (new, M0 stub)
- lib/DBSQLClient.ts (dispatch through IBackend; useSEA picks SeaBackend)
- lib/DBSQLSession.ts (facade over ISessionBackend; staging stays here)
- lib/DBSQLOperation.ts (facade over IOperationBackend; iterators/fetchAll stay here)
- tests/unit/DBSQLClient.test.ts (retarget internal state lookup through backend; pre-seed client.backend in tests that bypass connect())
- tests/unit/DBSQLOperation.test.ts (retarget internal state lookup through backend)
…nline type

Addresses code-bloat-watchdog findings from commit 0085928:
- Restores public-API JSDoc on DBSQLSession + DBSQLOperation methods
  (was deleted as scope creep; contracts unchanged so docs still apply)
- Adds makeStubbedClient() helper to tests/unit/DBSQLClient.test.ts;
  replaces 14× duplicated ThriftBackend pre-seed
- Imports WaitUntilReadyOptions instead of inline option types in
  IOperationBackend + DBSQLOperation.waitUntilReady
@github-actions
Copy link
Copy Markdown

Thanks for your contribution! To satisfy the DCO policy in our contributing guide every commit message must include a sign-off message. One or more of your commits is missing this message. You can reword previous commit messages with an interactive rebase (git rebase -i main).

Round-N fixes from the 9-reviewer pre-review. Public IOperation/DBSQLOperation
surface preserved byte-identical; backend interfaces (IBackend / ISessionBackend
/ IOperationBackend) made fully neutral so both Thrift and SEA can implement
the same contract.

F1 — neutral DTOs at IOperationBackend with Thrift-shape preservation on the
  public facade (adapter pattern):
- lib/contracts/OperationStatus.ts (new) — neutral OperationStatus + OperationState
  enum mirroring databricks-sql-python's CommandState and kernel pyo3's
  StatementStatus taxonomy.
- lib/contracts/ResultMetadata.ts (new) — neutral ResultMetadata + ResultFormat
  enum mirroring the three TSparkRowSetType cases.
- IOperationBackend.status()/getResultMetadata() return the neutral DTOs.
- ThriftOperationBackend.status() adapts at the boundary via adaptOperationStatus
  / adaptResultMetadata; module-level helpers thriftStateToOperationState and
  thriftRowSetTypeToResultFormat do the enum maps.
- ThriftOperationBackend exposes thriftStatusResponse() and
  thriftResultMetadataResponse() as public Thrift-only accessors used by the
  facade's zero-loss fast path (kept for internal state-machine + result-handler
  dispatch as well).
- lib/utils/thriftWireSynthesis.ts (new) — synthesizeThriftStatus and
  synthesizeThriftResultSetMetadata: convert neutral DTOs back to Thrift wire
  shape for the non-Thrift backend path. Lossy on Thrift-only fields
  (taskStatus, numModifiedRows, cacheLookupResult, etc.).
- DBSQLOperation.status() and getMetadata() preserved Thrift return shape:
  Thrift backend path returns the real wire response (zero loss); non-Thrift
  backend path synthesizes via the new helpers.
- DBSQLOperation.getResultMetadata() — new additive neutral accessor on
  IOperation; DBSQLSession.handleStagingOperation uses it instead of the
  deprecated Thrift-shaped getMetadata().

F2 — IBackend.connect() is now zero-arg. Backend reads everything it needs
  from IClientContext / constructor; matches Python connector's pattern of
  passing session_configuration via constructor not method-arg.

F3 — Restore the 'Server protocol version' debug log dropped by the original
  PR-378 refactor. Re-added to ThriftSessionBackend.constructor with the
  LogLevel.debug + IClientContext.getLogger() pattern; matches the pre-refactor
  log site at main:lib/DBSQLSession.ts:175.

F4 + F11 + F14 — SeaBackend stub safety:
- close() is a no-op so DBSQLClient.close()'s state-clearing block can finish
  even after a useSEA: true connect() failure.
- connect() and openSession() throw HiveDriverError instead of generic Error,
  matching the rest of the codebase.
- connect(options: ConnectionOptions) and openSession(request: OpenSessionRequest)
  declare their parameters (with @typescript-eslint/no-unused-vars disable)
  so IDE autocomplete prompts the M1 SEA implementer.

F6 + F7 + F9 + F10 — JSDoc on backend interfaces:
- IBackend: connect/openSession/close docstrings; close() doc explicitly
  states transport-layer cleanup is owned by DBSQLClient.
- ISessionBackend: copy IDBSQLSession's per-method one-liner JSDoc.
- IOperationBackend: doc hasResultSet (readonly external; mutates internally),
  waitUntilReady (MUST throw OperationStateError on terminal non-success).

F8 — tests/unit/sea/SeaBackend.test.ts (new) locks in the stub contract:
  connect() rejects HiveDriverError, openSession() rejects HiveDriverError,
  close() resolves no-op. ~30 LOC.

F12 — Drop legacy { handle, ... } ctor branch from DBSQLOperation and
  DBSQLSession:
- Facades accept only { backend, context }.
- DBSQLSession no longer imports ThriftSessionBackend at all.
- DBSQLOperation imports ThriftOperationBackend solely for the F1 typed
  downcast (zero-loss Thrift fast path); this is a deliberate, scoped
  coupling tied to the back-compat decision.
- tests/unit/.stubs/createSessionForTest.ts and createOperationForTest.ts
  (new) wrap the legacy shape; all 48 + 54 test sites mechanically migrated.

F15 — ThriftOperationBackend.waitUntilReady uses imported WaitUntilReadyOptions
  type instead of an anonymous inline shape.

F16 — useSEA flag moved out of public ConnectionOptions:
- Removed useSEA?: boolean from the exported lib/contracts/IDBSQLClient.ts
  ConnectionOptions; no longer ships in the public .d.ts.
- lib/contracts/InternalConnectionOptions.ts (new) declares the flag as a
  non-exported internal extension; DBSQLClient.connect() reads via a typed
  cast. Mirrors Python's kwargs.get('use_sea', False) pattern at
  databricks-sql-python/src/databricks/sql/session.py:111.

F17 — Missing return; after case 'timeout' in forwardConnectionEvent so a
  future fifth case doesn't silently fall through. The trailing return; in
  the last case triggers no-useless-return — quieted with a localized
  eslint-disable-next-line + intent comment.

F5 — deferred per owner instruction (test-only as any cast tightening).

Verification:
- yarn lint clean (3 pre-existing warnings in tests/e2e/protocol_versions.test.ts).
- yarn build clean.
- tsc --noEmit -p tsconfig.json clean (apart from pre-existing
  examples/tokenFederation/* import errors that exist on main).
- Runtime smoke test of SeaBackend stub + Thrift-wire synthesis round-trip
  passes 5/5 assertions.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
msrathore-db added a commit that referenced this pull request May 24, 2026
Addresses PR #380 review findings C1, C2 (partial), C3, H1, H2, H3,
H4, H5, H6, H7, H8 (runtime guard), H9, M1, M2, M3, M4 (linguist),
M5, M6, M7, M8, L1, L2, L4.

- C1 lint: drop `.js` ext from native/sea require so eslint
  import/extensions passes; gitignore the auto-generated index.js
  and *.node artifacts; prettier-ignore the napi-rs auto-generated
  index.d.ts / index.js.
- C2 publish: whitelist native/sea/index.{js,d.ts} in .npmignore;
  declare @databricks/sea-native-linux-x64-gnu in optionalDependencies.
  The kernel-side package-name rename (drop the linux-x64-gnu
  prefix) is tracked separately as a cross-PR ask.
- C3 test wiring: move tests/native/version.test.ts ->
  tests/unit/sea/, tests/native/e2e-smoke.test.ts -> tests/e2e/sea/;
  both now picked up by the existing mocharc globs and run via the
  existing `npm test` / `npm run e2e` jobs.
- H1 noise drop: version.test.ts now asserts one meaningful semver
  check; three tautological assertions removed.
- H2 + H3 + H7 lazy load: rewrite SeaNativeLoader.ts on the
  lib/utils/lz4.ts pattern. Lazy require behind getSeaNative();
  capability-detection helper tryGetSeaNative(); structured error
  messages that classify MODULE_NOT_FOUND vs ERR_DLOPEN_FAILED and
  include platform/arch/Node-version + install hint.
- H4 supply chain: pin @napi-rs/cli to 2.18.4 in devDependencies;
  build:native switches to `npx --no-install` so the pinned local
  install is used (no per-build network fetch).
- H5 path: switch build:native default kernel path to the canonical
  `../../databricks-sql-kernel/napi-binding` (the worktree-specific
  `-sea-WT` suffix is gone).
- H6 CI safety: e2e suite hard-fails when CI=true and any required
  env var is missing; dev machines still skip.
- H8 runtime guard: loader throws a structured error on Node <18
  instead of attempting a dlopen that would fail mysteriously.
  Driver's engines.node stays >=14 — Thrift consumers on older
  Node continue to work.
- H9 + M1 + M2 types: tsconfig adds a `@sea-native` path alias to
  native/sea/index.d.ts; loader imports the real Connection /
  Statement / ConnectionOptions / ExecuteOptions / ArrowBatch /
  ArrowSchema types and re-exports them. Tests use the re-exported
  types — the inline-shape duplication across three files collapses.
- M3 README: rewrite native/sea/README.md to match kernel reality.
  The napi crate is a standalone Cargo workspace (not a pyo3
  sibling); explain the tls-rustls choice that the standalone
  workspace exists to enable.
- M4 drift detection: mark native/sea/index.d.ts as
  linguist-generated in .gitattributes so GitHub collapses it in
  diffs and excludes from blame/language stats.
- M5 artifacts: gitignore native/sea/index.js, index.node,
  index.*.node.
- M6 + M7 e2e coverage: decode the IPC payload via apache-arrow's
  tableFromIPC and assert numRows + cell value (not just shape);
  add drain-past-null idempotence and schema-before-fetch
  coverage.
- L1 build scripts: collapse build:native + build:native:debug
  into one script taking BUILD_PROFILE (defaults to --release).
- L2 / L4 / M8: drop the version() alias and the "Round 2+ will…"
  comment debt; the loader now actually delivers the value the
  prior JSDoc was only promising.

Verified on this branch: npm run lint clean (0 errors); npm run
prettier clean for PR-owned files (3 unrelated pre-existing
warnings on PR #378 territory); tsc --project tsconfig.build.json
clean; mocha on tests/unit/sea passes (4/4); mocha on
tests/e2e/sea passes against a live pecotesting warehouse (2/2,
IPC decode confirms SELECT 1 returns 1).
msrathore-db added a commit that referenced this pull request May 24, 2026
Addresses PR #380 review findings C1, C2 (partial), C3, H1, H2, H3,
H4, H5, H6, H7, H8 (runtime guard), H9, M1, M2, M3, M4 (linguist),
M5, M6, M7, M8, L1, L2, L4.

- C1 lint: drop `.js` ext from native/sea require so eslint
  import/extensions passes; gitignore the auto-generated index.js
  and *.node artifacts; prettier-ignore the napi-rs auto-generated
  index.d.ts / index.js.
- C2 publish: whitelist native/sea/index.{js,d.ts} in .npmignore;
  declare @databricks/sea-native-linux-x64-gnu in optionalDependencies.
  The kernel-side package-name rename (drop the linux-x64-gnu
  prefix) is tracked separately as a cross-PR ask.
- C3 test wiring: move tests/native/version.test.ts ->
  tests/unit/sea/, tests/native/e2e-smoke.test.ts -> tests/e2e/sea/;
  both now picked up by the existing mocharc globs and run via the
  existing `npm test` / `npm run e2e` jobs.
- H1 noise drop: version.test.ts now asserts one meaningful semver
  check; three tautological assertions removed.
- H2 + H3 + H7 lazy load: rewrite SeaNativeLoader.ts on the
  lib/utils/lz4.ts pattern. Lazy require behind getSeaNative();
  capability-detection helper tryGetSeaNative(); structured error
  messages that classify MODULE_NOT_FOUND vs ERR_DLOPEN_FAILED and
  include platform/arch/Node-version + install hint.
- H4 supply chain: pin @napi-rs/cli to 2.18.4 in devDependencies;
  build:native switches to `npx --no-install` so the pinned local
  install is used (no per-build network fetch).
- H5 path: switch build:native default kernel path to the canonical
  `../../databricks-sql-kernel/napi-binding` (the worktree-specific
  `-sea-WT` suffix is gone).
- H6 CI safety: e2e suite hard-fails when CI=true and any required
  env var is missing; dev machines still skip.
- H8 runtime guard: loader throws a structured error on Node <18
  instead of attempting a dlopen that would fail mysteriously.
  Driver's engines.node stays >=14 — Thrift consumers on older
  Node continue to work.
- H9 + M1 + M2 types: tsconfig adds a `@sea-native` path alias to
  native/sea/index.d.ts; loader imports the real Connection /
  Statement / ConnectionOptions / ExecuteOptions / ArrowBatch /
  ArrowSchema types and re-exports them. Tests use the re-exported
  types — the inline-shape duplication across three files collapses.
- M3 README: rewrite native/sea/README.md to match kernel reality.
  The napi crate is a standalone Cargo workspace (not a pyo3
  sibling); explain the tls-rustls choice that the standalone
  workspace exists to enable.
- M4 drift detection: mark native/sea/index.d.ts as
  linguist-generated in .gitattributes so GitHub collapses it in
  diffs and excludes from blame/language stats.
- M5 artifacts: gitignore native/sea/index.js, index.node,
  index.*.node.
- M6 + M7 e2e coverage: decode the IPC payload via apache-arrow's
  tableFromIPC and assert numRows + cell value (not just shape);
  add drain-past-null idempotence and schema-before-fetch
  coverage.
- L1 build scripts: collapse build:native + build:native:debug
  into one script taking BUILD_PROFILE (defaults to --release).
- L2 / L4 / M8: drop the version() alias and the "Round 2+ will…"
  comment debt; the loader now actually delivers the value the
  prior JSDoc was only promising.

Verified on this branch: npm run lint clean (0 errors); npm run
prettier clean for PR-owned files (3 unrelated pre-existing
warnings on PR #378 territory); tsc --project tsconfig.build.json
clean; mocha on tests/unit/sea passes (4/4); mocha on
tests/e2e/sea passes against a live pecotesting warehouse (2/2,
IPC decode confirms SELECT 1 returns 1).

Signed-off-by: Madhavendra Rathore <madhavendra.rathore@databricks.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant