Skip to content

feat(sea): queryTags through executeStatement + fetchAll regression lock#393

Open
msrathore-db wants to merge 7 commits into
msrathore/sea-max-connectionsfrom
msrathore/sea-statement-options
Open

feat(sea): queryTags through executeStatement + fetchAll regression lock#393
msrathore-db wants to merge 7 commits into
msrathore/sea-max-connectionsfrom
msrathore/sea-statement-options

Conversation

@msrathore-db
Copy link
Copy Markdown
Contributor

Summary

Two features bundled:

  1. queryTags wiring (F3 scope-reduced) — threads the public ExecuteStatementOptions.queryTags through SeaSessionBackend.executeStatement into the napi binding's new ExecuteOptions.statementConf field. Tags are pre-serialised on the JS side via the existing serializeQueryTags util (preserves null-valued tags) and written into statementConf["query_tags"]. Same wire shape NodeJS Thrift produces — byte-equivalent parity.

  2. fetchAll regression lock (F4) — adds an e2e regression test exercising DBSQLOperation.fetchAll() end-to-end on the SEA backend with useSEA: true. The drain primitive is already wired (facade fetchAll loops over SeaOperationBackend.fetchChunk); this test pins the shape against Thrift's Array<object> convention.

F3 — Scope deferred to follow-on PRs

  • Positional / named parameters (need TypedValue mapping over napi)
  • Row limit, wait timeout (need kernel-side StatementSpec extensions)
  • Async execute (need new ExecutedAsyncStatement napi wrapper class)
  • TIMESTAMP_NTZ/LTZ/VARIANT (Thrift gap — skip per scope rubric)

Cross-repo dependency

Requires kernel branch msrathore/krn-statement-options (PR https://github.com/databricks/databricks-sql-kernel/pull/75) for the napi-binding ExecuteOptions interface.

Plumbing

  • SeaNativeLoader.tsSeaNativeConnection.executeStatement accepts optional SeaNativeExecuteOptions second arg
  • SeaSessionBackend.ts — adapter builds nativeOptions.statementConf.query_tags from the public queryTags; existing deferred-to-M1 errors for namedParameters/ordinalParameters/queryTimeout stay
  • native/sea/index.d.ts — adds ExecuteOptions interface + updates Connection.executeStatement(sql, options?) signature
  • tests/unit/sea/execution-query-tags.test.ts — 8 unit cases covering null-valued tags, special-char escaping, multi-tag ordering
  • tests/e2e/sea/fetch-all-e2e.test.ts — 3 live-warehouse cases: 100-row drain, empty-set drain, multi-column mixed-type drain

Test plan

  • tests/unit/sea/execution-query-tags.test.ts — 8/8 pass
  • Cross-test regression (auth-pat + auth-max-connections + execution-query-tags) — 25/25 pass
  • tsc: zero new TS errors from F3+F4 (line numbers on baseline-pre-existing errors shift due to added code, count unchanged)
  • Live warehouse e2e — pending merge of kernel PR

Stack

PR 10 of the NodeJS SEA stack:
1-8 = existing stack tips
9 = msrathore/sea-max-connections (F1+F2, PR #392)
10 = msrathore/sea-statement-options (F3+F4, this PR)

@msrathore-db msrathore-db force-pushed the msrathore/sea-statement-options branch 2 times, most recently from 4e673fd to 7d07de3 Compare May 25, 2026 22:38
…ding

Threads the public `ExecuteStatementOptions.queryTags` through the
SEA backend into the kernel's per-statement Spark conf overlay.

JS-side adapter path:
- Serialises the `Record<string, string | null | undefined>` tag map
  via the existing `serializeQueryTags` util — same wire shape the
  Thrift backend produces (`k1:v1,k2:v2`, null values as bare keys).
- Writes the serialised string into `statementConf["query_tags"]`
  on the napi `ExecuteOptions` (defined on the kernel side in
  `napi/src/connection.rs`).
- Pre-serialising at the JS layer (instead of via the napi binding's
  own `queryTags` field) preserves null-valued tags, which a
  `HashMap<String, String>` over the FFI boundary cannot carry.

Why match Thrift's exact wire shape: SEA and Thrift hit the same
server with the same `confOverlay.query_tags` key. A consumer
porting from Thrift to SEA gets byte-equivalent tagging behaviour
without any code change on their side.

Plumbing changes:
- `SeaNativeLoader.ts` — `SeaNativeConnection.executeStatement` now
  accepts an optional `SeaNativeExecuteOptions` second arg with
  `statementConf` + `queryTags` fields, mirroring the napi-generated
  `ExecuteOptions` interface.
- `SeaSessionBackend.ts` — adapter call site builds `nativeOptions`
  from the public `queryTags` and forwards it on the napi call.
  Existing `namedParameters`/`ordinalParameters`/`queryTimeout`
  deferred errors stay in place.
- `native/sea/index.d.ts` — adds `ExecuteOptions` and updates
  `Connection.executeStatement` signature to take the optional
  second arg. Mirrors the napi-rs codegen from kernel branch
  `msrathore/krn-statement-options`.

Tests:
- `tests/unit/sea/execution-query-tags.test.ts` — 8 cases covering
  omission/empty, single + multiple tags, null/undefined-valued
  tags as bare keys, special-char escaping in keys and values.

Deferred (separate PRs):
- Positional / named parameters (TypedValue mapping over napi)
- Row limit, wait timeout (need kernel `StatementSpec` extension)
- Async execute (`submit()` + `await_result()` — needs new
  ExecutedAsyncStatement napi wrapper)

Cross-PR dependency: requires kernel branch
`msrathore/krn-statement-options` (commit `79bba34`), which is
stacked on `msrathore/krn-max-connections`. This branch stacks
on `msrathore/sea-max-connections`.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
Adds an end-to-end test against pecotesting (gated by
`DATABRICKS_PECOTESTING_*` env vars; skipped when absent) that
exercises `DBSQLOperation.fetchAll()` on the SEA backend and
asserts the returned shape matches the Thrift backend's
`Array<object>` convention.

Implementation status — fetchAll on the SEA path is already
wired:
- The facade `DBSQLOperation.fetchAll` (`lib/DBSQLOperation.ts`)
  loops over `fetchChunk` with `disableBuffering: true` until
  `hasMoreRows()` returns false. Backend-agnostic.
- `SeaOperationBackend.fetchChunk` already routes through the
  `ResultSlicer` over the napi statement's `fetchNextBatch()`
  IPC stream.

So this test is regression-locking the existing path, not
exercising new code. Three cases:
1. 100-row range — drain to a flat Array of 100 objects, each
   with the `id` key.
2. Empty result set — drain to an empty array (no zero-row crash).
3. Multi-column mixed types — assert all four columns round-trip
   into JS objects with correct types (`id`, `d` Double,
   `is_even` Boolean, `name` String).

Uses the internal `useSEA: true` flag on `client.connect(...)` to
route through `SeaBackend`. The `useSEA` flag is consumed via a
non-exported cast (see `lib/DBSQLClient.ts:244`) so the public
`.d.ts` doesn't ship it — mirrors Python's `use_sea` kwarg pattern.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
… tests

DA round 1 findings on F4 (fetchAll):

H1 skip-gate (REPEAT of F2 H1): the F4 e2e imports `DBSQLClient`
from `'../../../lib'`, which transitively pulls
`SeaNativeLoader.ts`'s module-level
`require('../../native/sea/index.js')`. When the `.node` artifact
isn't built, that require throws MODULE_NOT_FOUND BEFORE mocha can
invoke `before()`, crashing test discovery. Same root cause as F2
H1, repeating the fix:
1. Skip-gate verifies the native artifact exists in `before()`,
   skipping with a clear "run yarn build:native" message if absent
2. Switch the top-level `import { DBSQLClient }` to a `type`-only
   import (erased at compile time) + lazy `requireFromHere` inside
   the `connect()` helper via `createRequire(import.meta.url)` so
   the require works under both CJS and the ESM-reparse path
   mocha 11+ may use.

M1 edge tests: add two cases covering drain-twice and
drain-after-close, mirroring Thrift's `DBSQLOperation.fetchAll`
behaviour:
- drain-twice: after the first fetchAll drains the cursor to
  end-of-stream, a second fetchAll on the same operation returns
  an empty array (do/while body executes zero iterations, []
  flattened).
- drain-after-close: fetchAll on a closed operation throws (typed
  Error class). Doesn't pin the exact class because the SEA backend's
  closed-state error path may surface as either an
  `OperationStateError` or a kernel-envelope-decoded error
  depending on whether the close had reached the facade-level
  lifecycle flag or only the napi layer — both acceptable.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
Same ESM-reparse fix as F2's commit `d076067`. Apply named imports
`existsSync` from `fs` and `resolve as resolvePath` from `path` so
the test runs under mocha's MODULE_TYPELESS_PACKAGE_JSON
reparse-as-ESM path.

Verified live e2e passes (all 5 cases: 100-row drain, empty-set
drain, mixed-type drain, drain-twice, drain-after-close).

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
Adds an end-to-end test against pecotesting (gated by
`DATABRICKS_PECOTESTING_*` env vars; skipped when absent) that
verifies `queryTags` propagates from `ExecuteStatementOptions`
through `SeaSessionBackend.executeStatement` through
`serializeQueryTags` through the napi `ExecuteOptions.statementConf`
into the SEA wire — and that the server accepts the resulting
conf overlay format.

Combined with the kernel unit tests
(`serialize_query_tags_matches_thrift_byte_shape_*`) that pin the
byte shape vs Thrift, this proves end-to-end:
- Unit tests pin the JS adapter call shape AND the kernel-side
  serialiser byte shape vs Thrift parity
- This e2e proves the server actually accepts the wire format

Four cases:
1. Two normal tags — happy path
2. Tag value with `:`, `,`, and `\\` — escape contract end-to-end
3. Null + undefined valued tags — bare-key form survives the wire
4. No queryTags — default-path regression check

All four pass against pecotesting (~500ms each, first one warming
the session: 1.9s).

DA round-1 F3 "live" finding addressed.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
@msrathore-db msrathore-db force-pushed the msrathore/sea-statement-options branch from 7d07de3 to 8c5618c Compare May 25, 2026 22:50
DA F3 round-1 M3 finding: airbnb-typescript's `no-unused-vars` rule
does not honor the `_`-prefix convention, so the 4 args in this
test file that are interface-mandated but unused (the IDBSQLLogger
`log(_level, _message)` noop method and the sinon spy
`(_sql, _options) => stmt`) were reported as errors.

Verified the rule config: `.eslintrc` uses
`airbnb-typescript/base` which inherits airbnb-base's
`no-unused-vars` with no `argsIgnorePattern` override.

Two-line fix: `// eslint-disable-next-line` directly above each
site. Both args remain in the signature because they're shape-
mandated by the consumer (IDBSQLLogger interface, sinon spy
function type matching SeaNativeConnection.executeStatement).

Verified `yarn lint` no longer emits errors for
`tests/unit/sea/execution-query-tags.test.ts`. Pre-existing lint
errors on `operation-lifecycle-e2e.test.ts` and
`SeaSessionBackend.ts` are upstream on `sea-auth-u2m` and not
introduced or addressed by this commit.

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
DA F4 round-1 M1 finding (partial): the existing 5 fetch-all e2e
cases (range(0,100) row-count, range(0,0) empty, multi-column,
drain-twice, drain-after-close) all flow through the row-set
generator path. A literal `SELECT 1` returns a single inline
batch and exercises a distinct code path inside
SeaOperationBackend.fetchChunk → ResultSlicer → ArrowResultConverter
that the other 5 do not.

One new `it()` block:
- Open session, execute `SELECT 1 AS x`, drain via `fetchAll()`.
- Assert array of length 1; row has property `x` with non-null,
  non-undefined value. Type pinning is deliberately loose so the
  test stays forward-compatible with converter promotion changes;
  the load-bearing assertion is the single-row inline-batch drain.

Drop-in form follows DA's round-3 snippet verbatim (modulo the
loosened type pin per the comment above).

Gates (touched file only):
- `node_modules/.bin/eslint tests/e2e/sea/fetch-all-e2e.test.ts`
  EXIT=0, zero errors.
- `node_modules/.bin/tsc --noEmit` baseline 21 → current 21
  (delta 0; the only error in the touched file is a pre-existing
  `import.meta` issue at line 57 carried over from the F4 round-1
  H1 fix — not touched by this commit).

Live e2e against pecotesting via `TS_NODE_TRANSPILE_ONLY=true yarn
e2e tests/e2e/sea/fetch-all-e2e.test.ts`:
  6/6 passing in 4s, with the new test landing at 445ms.

Closes the round-3 F4-M1 audit item; F3-M3 closed in 9f1a967;
matrix scope-down landed by team-lead on PR #77 (dfea75a).

Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
@msrathore-db
Copy link
Copy Markdown
Contributor Author

Re-review requested — round-3 fixups landed; DA round-3 sign-off received.

Round-3 final tip: 4427d4d68906190bcff60913d3ce1cc3861fb9ce

This PR carries F3 (queryTags through executeStatement) + F4 (fetchAll regression lock). Three rounds of DA verification across both features; all findings closed:

F3 (queryTags):

  • F3-M3 ESLint @typescript-eslint/no-unused-vars errors in tests/unit/sea/execution-query-tags.test.ts closed at 9f1a967 — added per-line // eslint-disable-next-line for the IDBSQLLogger noop method + sinon spy params (DA verified EXIT 0 + 8/8 mocha pass)
  • F3-Live tests/e2e/sea/statement-options-e2e.test.ts 4/4 passing, independently reproduced by DA

F4 (fetchAll lock):

  • F4-H1 same root-cause SeaNativeLoader lazy-load (shared with F2)
  • F4-M1 edge cases — drain-twice + drain-after-close + inline-result-set (SELECT 1, just added in 4427d4d)
  • F4-Live tests/e2e/sea/fetch-all-e2e.test.ts 6/6 passing (up from 5/5 with the new inline-result-set case at 445ms)

Gates on final commit: eslint tests/e2e/sea/fetch-all-e2e.test.ts EXIT 0; tsc --noEmit baseline 21 → current 21 (delta 0). Live e2e 6/6 in 4s.

Co-authored-by: Isaac

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant