feat(sea): queryTags through executeStatement + fetchAll regression lock#393
Open
msrathore-db wants to merge 7 commits into
Open
feat(sea): queryTags through executeStatement + fetchAll regression lock#393msrathore-db wants to merge 7 commits into
msrathore-db wants to merge 7 commits into
Conversation
3 tasks
4e673fd to
7d07de3
Compare
…ding Threads the public `ExecuteStatementOptions.queryTags` through the SEA backend into the kernel's per-statement Spark conf overlay. JS-side adapter path: - Serialises the `Record<string, string | null | undefined>` tag map via the existing `serializeQueryTags` util — same wire shape the Thrift backend produces (`k1:v1,k2:v2`, null values as bare keys). - Writes the serialised string into `statementConf["query_tags"]` on the napi `ExecuteOptions` (defined on the kernel side in `napi/src/connection.rs`). - Pre-serialising at the JS layer (instead of via the napi binding's own `queryTags` field) preserves null-valued tags, which a `HashMap<String, String>` over the FFI boundary cannot carry. Why match Thrift's exact wire shape: SEA and Thrift hit the same server with the same `confOverlay.query_tags` key. A consumer porting from Thrift to SEA gets byte-equivalent tagging behaviour without any code change on their side. Plumbing changes: - `SeaNativeLoader.ts` — `SeaNativeConnection.executeStatement` now accepts an optional `SeaNativeExecuteOptions` second arg with `statementConf` + `queryTags` fields, mirroring the napi-generated `ExecuteOptions` interface. - `SeaSessionBackend.ts` — adapter call site builds `nativeOptions` from the public `queryTags` and forwards it on the napi call. Existing `namedParameters`/`ordinalParameters`/`queryTimeout` deferred errors stay in place. - `native/sea/index.d.ts` — adds `ExecuteOptions` and updates `Connection.executeStatement` signature to take the optional second arg. Mirrors the napi-rs codegen from kernel branch `msrathore/krn-statement-options`. Tests: - `tests/unit/sea/execution-query-tags.test.ts` — 8 cases covering omission/empty, single + multiple tags, null/undefined-valued tags as bare keys, special-char escaping in keys and values. Deferred (separate PRs): - Positional / named parameters (TypedValue mapping over napi) - Row limit, wait timeout (need kernel `StatementSpec` extension) - Async execute (`submit()` + `await_result()` — needs new ExecutedAsyncStatement napi wrapper) Cross-PR dependency: requires kernel branch `msrathore/krn-statement-options` (commit `79bba34`), which is stacked on `msrathore/krn-max-connections`. This branch stacks on `msrathore/sea-max-connections`. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <[email protected]>
Adds an end-to-end test against pecotesting (gated by `DATABRICKS_PECOTESTING_*` env vars; skipped when absent) that exercises `DBSQLOperation.fetchAll()` on the SEA backend and asserts the returned shape matches the Thrift backend's `Array<object>` convention. Implementation status — fetchAll on the SEA path is already wired: - The facade `DBSQLOperation.fetchAll` (`lib/DBSQLOperation.ts`) loops over `fetchChunk` with `disableBuffering: true` until `hasMoreRows()` returns false. Backend-agnostic. - `SeaOperationBackend.fetchChunk` already routes through the `ResultSlicer` over the napi statement's `fetchNextBatch()` IPC stream. So this test is regression-locking the existing path, not exercising new code. Three cases: 1. 100-row range — drain to a flat Array of 100 objects, each with the `id` key. 2. Empty result set — drain to an empty array (no zero-row crash). 3. Multi-column mixed types — assert all four columns round-trip into JS objects with correct types (`id`, `d` Double, `is_even` Boolean, `name` String). Uses the internal `useSEA: true` flag on `client.connect(...)` to route through `SeaBackend`. The `useSEA` flag is consumed via a non-exported cast (see `lib/DBSQLClient.ts:244`) so the public `.d.ts` doesn't ship it — mirrors Python's `use_sea` kwarg pattern. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <[email protected]>
… tests
DA round 1 findings on F4 (fetchAll):
H1 skip-gate (REPEAT of F2 H1): the F4 e2e imports `DBSQLClient`
from `'../../../lib'`, which transitively pulls
`SeaNativeLoader.ts`'s module-level
`require('../../native/sea/index.js')`. When the `.node` artifact
isn't built, that require throws MODULE_NOT_FOUND BEFORE mocha can
invoke `before()`, crashing test discovery. Same root cause as F2
H1, repeating the fix:
1. Skip-gate verifies the native artifact exists in `before()`,
skipping with a clear "run yarn build:native" message if absent
2. Switch the top-level `import { DBSQLClient }` to a `type`-only
import (erased at compile time) + lazy `requireFromHere` inside
the `connect()` helper via `createRequire(import.meta.url)` so
the require works under both CJS and the ESM-reparse path
mocha 11+ may use.
M1 edge tests: add two cases covering drain-twice and
drain-after-close, mirroring Thrift's `DBSQLOperation.fetchAll`
behaviour:
- drain-twice: after the first fetchAll drains the cursor to
end-of-stream, a second fetchAll on the same operation returns
an empty array (do/while body executes zero iterations, []
flattened).
- drain-after-close: fetchAll on a closed operation throws (typed
Error class). Doesn't pin the exact class because the SEA backend's
closed-state error path may surface as either an
`OperationStateError` or a kernel-envelope-decoded error
depending on whether the close had reached the facade-level
lifecycle flag or only the napi layer — both acceptable.
Co-authored-by: Isaac
Signed-off-by: Madhavendra Rathore <[email protected]>
Same ESM-reparse fix as F2's commit `d076067`. Apply named imports `existsSync` from `fs` and `resolve as resolvePath` from `path` so the test runs under mocha's MODULE_TYPELESS_PACKAGE_JSON reparse-as-ESM path. Verified live e2e passes (all 5 cases: 100-row drain, empty-set drain, mixed-type drain, drain-twice, drain-after-close). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <[email protected]>
Adds an end-to-end test against pecotesting (gated by `DATABRICKS_PECOTESTING_*` env vars; skipped when absent) that verifies `queryTags` propagates from `ExecuteStatementOptions` through `SeaSessionBackend.executeStatement` through `serializeQueryTags` through the napi `ExecuteOptions.statementConf` into the SEA wire — and that the server accepts the resulting conf overlay format. Combined with the kernel unit tests (`serialize_query_tags_matches_thrift_byte_shape_*`) that pin the byte shape vs Thrift, this proves end-to-end: - Unit tests pin the JS adapter call shape AND the kernel-side serialiser byte shape vs Thrift parity - This e2e proves the server actually accepts the wire format Four cases: 1. Two normal tags — happy path 2. Tag value with `:`, `,`, and `\\` — escape contract end-to-end 3. Null + undefined valued tags — bare-key form survives the wire 4. No queryTags — default-path regression check All four pass against pecotesting (~500ms each, first one warming the session: 1.9s). DA round-1 F3 "live" finding addressed. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <[email protected]>
7d07de3 to
8c5618c
Compare
DA F3 round-1 M3 finding: airbnb-typescript's `no-unused-vars` rule does not honor the `_`-prefix convention, so the 4 args in this test file that are interface-mandated but unused (the IDBSQLLogger `log(_level, _message)` noop method and the sinon spy `(_sql, _options) => stmt`) were reported as errors. Verified the rule config: `.eslintrc` uses `airbnb-typescript/base` which inherits airbnb-base's `no-unused-vars` with no `argsIgnorePattern` override. Two-line fix: `// eslint-disable-next-line` directly above each site. Both args remain in the signature because they're shape- mandated by the consumer (IDBSQLLogger interface, sinon spy function type matching SeaNativeConnection.executeStatement). Verified `yarn lint` no longer emits errors for `tests/unit/sea/execution-query-tags.test.ts`. Pre-existing lint errors on `operation-lifecycle-e2e.test.ts` and `SeaSessionBackend.ts` are upstream on `sea-auth-u2m` and not introduced or addressed by this commit. Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <[email protected]>
DA F4 round-1 M1 finding (partial): the existing 5 fetch-all e2e cases (range(0,100) row-count, range(0,0) empty, multi-column, drain-twice, drain-after-close) all flow through the row-set generator path. A literal `SELECT 1` returns a single inline batch and exercises a distinct code path inside SeaOperationBackend.fetchChunk → ResultSlicer → ArrowResultConverter that the other 5 do not. One new `it()` block: - Open session, execute `SELECT 1 AS x`, drain via `fetchAll()`. - Assert array of length 1; row has property `x` with non-null, non-undefined value. Type pinning is deliberately loose so the test stays forward-compatible with converter promotion changes; the load-bearing assertion is the single-row inline-batch drain. Drop-in form follows DA's round-3 snippet verbatim (modulo the loosened type pin per the comment above). Gates (touched file only): - `node_modules/.bin/eslint tests/e2e/sea/fetch-all-e2e.test.ts` EXIT=0, zero errors. - `node_modules/.bin/tsc --noEmit` baseline 21 → current 21 (delta 0; the only error in the touched file is a pre-existing `import.meta` issue at line 57 carried over from the F4 round-1 H1 fix — not touched by this commit). Live e2e against pecotesting via `TS_NODE_TRANSPILE_ONLY=true yarn e2e tests/e2e/sea/fetch-all-e2e.test.ts`: 6/6 passing in 4s, with the new test landing at 445ms. Closes the round-3 F4-M1 audit item; F3-M3 closed in 9f1a967; matrix scope-down landed by team-lead on PR #77 (dfea75a). Co-authored-by: Isaac Signed-off-by: Madhavendra Rathore <[email protected]>
Contributor
Author
|
Re-review requested — round-3 fixups landed; DA round-3 sign-off received. Round-3 final tip: This PR carries F3 ( F3 (queryTags):
F4 (fetchAll lock):
Gates on final commit: Co-authored-by: Isaac |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Two features bundled:
queryTags wiring (F3 scope-reduced) — threads the public
ExecuteStatementOptions.queryTagsthroughSeaSessionBackend.executeStatementinto the napi binding's newExecuteOptions.statementConffield. Tags are pre-serialised on the JS side via the existingserializeQueryTagsutil (preserves null-valued tags) and written intostatementConf["query_tags"]. Same wire shape NodeJS Thrift produces — byte-equivalent parity.fetchAll regression lock (F4) — adds an e2e regression test exercising
DBSQLOperation.fetchAll()end-to-end on the SEA backend withuseSEA: true. The drain primitive is already wired (facadefetchAllloops overSeaOperationBackend.fetchChunk); this test pins the shape against Thrift'sArray<object>convention.F3 — Scope deferred to follow-on PRs
StatementSpecextensions)ExecutedAsyncStatementnapi wrapper class)Cross-repo dependency
Requires kernel branch
msrathore/krn-statement-options(PR https://github.com/databricks/databricks-sql-kernel/pull/75) for the napi-bindingExecuteOptionsinterface.Plumbing
SeaNativeLoader.ts—SeaNativeConnection.executeStatementaccepts optionalSeaNativeExecuteOptionssecond argSeaSessionBackend.ts— adapter buildsnativeOptions.statementConf.query_tagsfrom the publicqueryTags; existing deferred-to-M1 errors fornamedParameters/ordinalParameters/queryTimeoutstaynative/sea/index.d.ts— addsExecuteOptionsinterface + updatesConnection.executeStatement(sql, options?)signaturetests/unit/sea/execution-query-tags.test.ts— 8 unit cases covering null-valued tags, special-char escaping, multi-tag orderingtests/e2e/sea/fetch-all-e2e.test.ts— 3 live-warehouse cases: 100-row drain, empty-set drain, multi-column mixed-type drainTest plan
tests/unit/sea/execution-query-tags.test.ts— 8/8 passStack
PR 10 of the NodeJS SEA stack:
1-8 = existing stack tips
9 =
msrathore/sea-max-connections(F1+F2, PR #392)10 =
msrathore/sea-statement-options(F3+F4, this PR)