Skip to content

fix(backend/kernel): comparator parity — async retention + intervals_as_string + precision/scale + named params + utils.exc fix#803

Merged
vikrantpuppala merged 1 commit into
mainfrom
fix/kernel-async-statement-retention
May 25, 2026
Merged

fix(backend/kernel): comparator parity — async retention + intervals_as_string + precision/scale + named params + utils.exc fix#803
vikrantpuppala merged 1 commit into
mainfrom
fix/kernel-async-statement-retention

Conversation

@vikrantpuppala
Copy link
Copy Markdown
Contributor

@vikrantpuppala vikrantpuppala commented May 23, 2026

Summary

Five comparator-parity fixes bundled because they share the same audit baseline and the same kernel-backend client surface.

1. Async Statement retention

KernelDatabricksClient.execute_command closed the parent Statement in finally regardless of async_op. The kernel's Statement.close() invalidates child handles (see databricks-sql-kernel/src/statement/validity.rs), so the async handle was being killed before the user could poll it — breaking the entire async surface (execute_asyncis_query_pendingget_async_execution_result).

Fix: when async_op=True, retain the parent Statement in a new _async_statements dict alongside _async_handles, and close it from close_command, close_session, and get_execution_result after the executed handle is done.

Comparator outcome: STATEMENT_ASYNC suite 3/3 match (was 0/3).

2. intervals_as_string wire-through (companion to kernel PR #64)

pyarrow's Python bindings cannot decode Arrow's month_interval type at all (id 21 — raises KeyError from .as_py, to_pylist, cast(string), to_pandas). Every kernel-backend SELECT * over any table with an INTERVAL YEAR TO MONTH column threw ArrowNotImplementedError32 / 88 audit diffs.

Fix: pass intervals_as_string=True to the kernel Session(...) constructor unconditionally. The kernel post-processor then stringifies Interval / Duration columns server-side to Utf8 (see kernel PR #64), so pyarrow never sees the unreadable type.

Comparator outcome: bucket A (ArrowNotImplementedError) drops from 32 → 0 diffs.

3. Decimal precision/scale extraction

description_from_arrow_schema hard-coded None for slots 4 and 5 of the PEP 249 description tuple. For DECIMAL columns the Arrow schema carries precision / scale on Decimal128Type, but we were silently dropping them. The kernel backend returned:

('decimal_column', 'decimal', None, None, None, None, None)

while Thrift returned:

('decimal_column', 'decimal', None, None, 10, 2, None)

Any consumer that reads PEP 249 slots 4/5 (SQLAlchemy, pandas-read-sql, etc.) can't distinguish DECIMAL(10,2) from DECIMAL(38,18) on the kernel backend. 88 comparator diffs (44 precision + 44 scale).

Fix: factor out _precision_scale_for_arrow_type(arrow_type) and call it from the description builder. Today it only handles decimals; future extensions (e.g. fractional-second precision from Time64Type) slot in here without touching the description builder.

Comparator outcome: 88 precision+scale diffs → 0.

4. Named-parameter binding (companion to kernel PR #65)

bind_tspark_params previously raised NotSupportedError for any TSparkParameter with name set. The canonical SEA proto marks StatementParameter.name as openapi_required=true — named binding is the spec-required public form, and ordinal is PUBLIC_UNDOCUMENTED. Kernel PR #65 added a Statement.bind_named_param PyO3 API mirroring bind_param.

Fix: route named bindings via the new kernel API. Positional ordinals self-increment so a named entry in the middle of the list doesn't claim a positional slot.

Comparator outcome: PREPARED_STATEMENT_NAMED 1/1 match (was 0/1). Bucket B1 in comparator-diff-tasklist.md drops to 0 diffs.

5. utils.ParamEscaper exc.ProgrammingError import fix (was PR #802)

ParamEscaper.escape_args and ParamEscaper.escape_item both raised exc.ProgrammingError(...) (lines 551 and 609 of utils.py), but exc was never imported. On any unsupported parameter shape the user saw NameError: name 'exc' is not defined instead of a clean PEP 249 ProgrammingError.

Surfaced via the python-comparator audit harness running the INLINE_PARAMS connection config: both backends raised NameError, which the comparator's class+message match treated as parity — a false-positive that hid both the real driver bug and the underlying caller-side type mismatch.

Fix: import ProgrammingError directly from databricks.sql.exc (matching the pattern used in session.py, client.py, result_set.py, etc.) and replace the two exc.ProgrammingError(...) sites with bare ProgrammingError(...).

Folded in from #802 (closed in favour of this bundle).

Dependencies

Items #2, #3, and #4 require the matching databricks-sql-kernel changes:

  • kernel PR #64intervals_as_string post-processor + empty-result schema fix + composite type parser + TIMESTAMP tz + metadata reshape fixes
  • kernel PR #65StatementParameter.name + bind_named_param API (to be opened)

For local testing the comparator harness uses KERNEL_FREEZE=1 against a kernel checkout of the feature branches.

Headline comparator results (full thrift-vs-kernel run)

Before After
match / diff / skipped 60 / 88 / 0 97 / 34 / 1
Bucket A (ArrowNotImplementedError) 32 0
decimal_column precision/scale 88 0
Bucket B1 (named params) 1 0
Type-code mismatches (slot 1) 21 14
Suites fully clean 12 / 30 17 / 30

Remaining 34 diffs cluster into documented causes (complex types in fetchall_arrow, timestamp tz on Arrow path, VOID surface, METADATA pattern semantics, empty-result nullability) — tracked in ~/docs/python-kernel/comparator-diff-tasklist.md.

Test plan

  • Manual repro of async path: cur.execute_async("SELECT 1"); while cur.is_query_pending(): time.sleep(0.2); cur.get_async_execution_result(); cur.fetchall() succeeds on use_kernel=True
  • Manual repro of interval path: cur.execute("SELECT ym_interval_column FROM ...") returns string-shaped rows matching the Thrift surface
  • Manual repro of decimal path: cur.description for a DECIMAL(10,2) column now reports precision=10, scale=2 on both backends
  • Live e2e for named params: test_parameterized_query_named_params / test_parameterized_query_named_param_with_null (added) pass against pecotesting warehouse
  • Unit tests: bind_tspark_params named/positional/mixed routing (tests/unit/test_kernel_type_mapping.py)
  • Manual repro of utils fix: ParamEscaper().escape_args(object()) raises ProgrammingError, not NameError
  • Full thrift-vs-kernel comparator: 97 / 34 / 1 of 132
  • Existing connector unit suite: 752 passed

This pull request and its description were written by Isaac.

@vikrantpuppala vikrantpuppala force-pushed the fix/kernel-async-statement-retention branch from 291e18e to 312164a Compare May 23, 2026 16:16
@vikrantpuppala vikrantpuppala changed the title fix(backend/kernel): retain parent Statement for async-submitted commands fix(backend/kernel): comparator parity — async statement retention + intervals_as_string May 23, 2026
@vikrantpuppala vikrantpuppala force-pushed the fix/kernel-async-statement-retention branch from 312164a to 81f065f Compare May 23, 2026 17:18
@vikrantpuppala vikrantpuppala changed the title fix(backend/kernel): comparator parity — async statement retention + intervals_as_string fix(backend/kernel): comparator parity — async statement retention + intervals_as_string + precision/scale May 23, 2026
@vikrantpuppala vikrantpuppala force-pushed the fix/kernel-async-statement-retention branch from 81f065f to 932ec74 Compare May 25, 2026 04:42
@vikrantpuppala vikrantpuppala changed the title fix(backend/kernel): comparator parity — async statement retention + intervals_as_string + precision/scale fix(backend/kernel): comparator parity — async retention + intervals_as_string + precision/scale + named params + utils.exc fix May 25, 2026
@vikrantpuppala vikrantpuppala force-pushed the fix/kernel-async-statement-retention branch from 932ec74 to 73760c7 Compare May 25, 2026 04:46
@gopalldb
Copy link
Copy Markdown

Minor

Missing unit test for the new decimal precision/scale extraction (tests/unit/test_kernel_type_mapping.py)
_precision_scale_for_arrow_type is the load-bearing piece of fix #3, but no test exercises description_from_arrow_schema with a decimal column. The pre-existing assert d[2:] == (None, None, None, None, None)
only passes because no test schema has decimal fields. Add a parametrised (pa.decimal128(10, 2), 10, 2) + decimal256 case.

_async_statements[guid] = stmt overwrite leaks on duplicate statement_id (client.py:305)
Mirrors the existing _async_handles overwrite pattern, so pre-existing in shape — but the new dict holds a server-side Statement with a stronger leak consequence (lasts until close_session). Theoretical
only; server IDs are unique in practice.

cancel_command does not release the parent Statement (client.py:333-346)
Only _async_handles is touched; _async_statements[guid] lingers until close_command or close_session. The connector's normal cursor lifecycle fires close_command on exit, so unlikely in practice. Either
document the requirement or eagerly close in cancel_command.

intervals_as_string=True is hard-coded with no opt-out (client.py:188)
Fine today (pyarrow can't decode month_interval at all). Forward-looking: a future user wanting native intervals will need a new connector kwarg. No fix required.

Stale-wheel errors for intervals_as_string / bind_named_param surface as generic OperationalError("Unexpected error from databricks_sql_kernel: ...") (client.py:172-192, type_mapping.py:235)
Same shape as PR #795's stale-wheel concern. Out of scope for this bundle; consider a version check in a follow-up.

@vikrantpuppala vikrantpuppala force-pushed the fix/kernel-async-statement-retention branch from 73760c7 to ecd0d37 Compare May 25, 2026 07:06
@vikrantpuppala
Copy link
Copy Markdown
Contributor Author

Thanks for the review. Summary of what's addressed in the new push (ecd0d37b):

Item 1 — fixed

Missing decimal precision/scale unit test
Added two new tests in tests/unit/test_kernel_type_mapping.py:

  • test_description_extracts_decimal_precision_scale — parametrized over pa.decimal128(10,2) / decimal128(38,0) / decimal128(38,18) / decimal256(76,38). Asserts slots 4/5 are populated correctly.
  • test_description_non_decimal_columns_have_none_precision_scale — companion test that catches a regression where the helper accidentally extracts precision from non-decimal types (e.g. Time64 fractional-second precision).

The existing test_description_from_schema_preserves_field_names_and_order stays — its d[2:] == (None,)*5 assertion is still true for the non-decimal schema it uses; the new tests cover the decimal case the existing one didn't reach.

pytest tests/unit/test_kernel_type_mapping.py: 48 / 48 pass (was 43; +5 new).

Items 2, 3, 4, 5 — accepted as documented

Per your own punt signals, leaving these as-is for follow-up:

  • (2) _async_statements[guid] overwrite — theoretical; server IDs unique in practice; same shape as the existing _async_handles pattern.
  • (3) cancel_command doesn't release the parent Statement — connector normal cursor lifecycle fires close_command on __exit__. Will fix if it ever bites.
  • (4) intervals_as_string=True hard-coded — no fix required today (pyarrow can't decode month_interval); future-looking knob for a follow-up PR.
  • (5) Stale-wheel OperationalError shape — out of scope for this bundle. Will consider a __version__ check + a structured KernelVersionMismatch exception in a follow-up.

Signed off the amended commit.

…as_string + precision/scale + named params + utils.exc fix

Five small comparator-parity fixes surfaced by the same audit run.
All are independently small but ship together because they share
the kernel-backend client surface and the audit baseline.

### 1. Async Statement retention

`KernelDatabricksClient.execute_command` closed the parent `Statement`
in `finally` regardless of `async_op`. The kernel's `Statement.close()`
invalidates child handles (see databricks-sql-kernel
`src/statement/validity.rs`), so the async handle died before the user
could poll it — breaking `execute_async` → `is_query_pending` →
`get_async_execution_result`.

Fix: when `async_op=True`, retain the parent Statement in a new
`_async_statements` dict alongside `_async_handles`, and close it
from `close_command`, `close_session`, and `get_execution_result`
after the executed handle is done.

Comparator outcome: STATEMENT_ASYNC suite 3/3 match (was 0/3).

### 2. `intervals_as_string` wire-through

pyarrow's Python bindings cannot decode Arrow's `month_interval` type
(id 21 — `KeyError` from `.as_py`, `to_pylist`, `cast(string)`,
`to_pandas`). Every kernel-backend `SELECT *` over a table with an
`INTERVAL YEAR TO MONTH` column raised `ArrowNotImplementedError` —
32 / 88 audit diffs.

Fix: pass `intervals_as_string=True` to the kernel `Session(...)`
constructor unconditionally. The kernel post-processor stringifies
`Interval` / `Duration` columns server-side to `Utf8` (kernel PR #64).

Comparator outcome: bucket A (ArrowNotImplementedError) 32 → 0.

### 3. Decimal precision/scale extraction

`description_from_arrow_schema` hard-coded `None` for PEP 249
description-tuple slots 4 and 5. For DECIMAL columns the Arrow
schema carries `precision` / `scale` on `Decimal128Type`, but they
were silently dropped — 88 cell diffs (44 precision + 44 scale).

Fix: factor out `_precision_scale_for_arrow_type(arrow_type)` and
call it from the description builder. Today it only handles
decimals; future extensions slot in here.

Comparator outcome: 88 precision+scale diffs → 0.

### 4. Named-parameter binding

`bind_tspark_params` raised `NotSupportedError` for any
`TSparkParameter` with `name` set. The canonical SEA proto marks
`StatementParameter.name` as `openapi_required=true` (named is the
spec-required public form; `ordinal` is `PUBLIC_UNDOCUMENTED`).
Kernel PR #65 added a `Statement.bind_named_param` PyO3 API.

Fix: route named bindings via the new API. Positional ordinals
self-increment so a named entry in the middle of the list doesn't
claim a positional slot.

Comparator outcome: PREPARED_STATEMENT_NAMED 1/1 match (was 0/1).
Full thrift-vs-kernel run: 97/132 match (was 96/132).

### 5. `utils.ParamEscaper` `exc.ProgrammingError` import fix

`ParamEscaper.escape_args` and `escape_item` both raised
`exc.ProgrammingError(...)`, but `exc` was never imported. On any
unsupported parameter shape the user saw `NameError: name 'exc' is
not defined` instead of a clean PEP 249 `ProgrammingError`.

Surfaced via the same audit harness running INLINE_PARAMS: both
backends raised `NameError`, which the comparator's class+message
match treated as parity — a false-positive that hid both the driver
bug and the underlying caller-side type mismatch.

Fix: import `ProgrammingError` directly from `databricks.sql.exc`
(matching the pattern used in `session.py`, `client.py`,
`result_set.py`, etc.) and replace the two `exc.ProgrammingError(...)`
sites with bare `ProgrammingError(...)`.

## Dependencies

Items #2, #3, and #4 require the matching databricks-sql-kernel
changes: PR #64 (`intervals_as_string` + empty-result schema fix)
and PR #65 (named-param binding). For local testing the comparator
harness uses `KERNEL_FREEZE=1` against a kernel checkout of the
feature branches.

## Headline comparator results (full thrift-vs-kernel run)

|                                    | Before | After |
|------------------------------------|--------|-------|
| match / diff / skipped             | 60 / 88 / 0 | **97 / 34 / 1** |
| Bucket A (ArrowNotImplementedError)| 32     | 0     |
| `decimal_column` precision/scale   | 88     | 0     |
| Bucket B1 (named params)           | 1      | 0     |
| Suites fully clean                 | 12 / 30 | **17 / 30** |

Remaining 34 diffs cluster into documented causes (complex types in
`fetchall_arrow`, timestamp tz on Arrow path, VOID surface,
METADATA pattern semantics) — tracked in
`~/docs/python-kernel/comparator-diff-tasklist.md`.

## Test plan

- [x] Manual repro of async path: `cur.execute_async("SELECT 1")` →
      `is_query_pending` → `get_async_execution_result` → `fetchall`
      succeeds on `use_kernel=True`
- [x] Manual repro of interval path: `cur.execute("SELECT ym_interval_column FROM ...")`
      returns string-shaped rows matching the Thrift surface
- [x] Manual repro of decimal path: `cur.description` for a
      `DECIMAL(10,2)` column now reports `precision=10, scale=2`
- [x] Live e2e: `test_parameterized_query_named_params` /
      `test_parameterized_query_named_param_with_null` (added)
- [x] Unit tests: `bind_tspark_params` named/positional/mixed
      routing (`tests/unit/test_kernel_type_mapping.py`)
- [x] Manual repro of utils fix:
      `ParamEscaper().escape_args(object())` raises
      `ProgrammingError`, not `NameError`
- [x] Full thrift-vs-kernel comparator: 97 / 34 / 1 of 132
- [x] Existing connector unit suite: 752 passed

Co-authored-by: Isaac
Signed-off-by: Vikrant Puppala <vikrant.puppala@databricks.com>
@vikrantpuppala vikrantpuppala force-pushed the fix/kernel-async-statement-retention branch from ecd0d37 to 7adb049 Compare May 25, 2026 07:31
@vikrantpuppala vikrantpuppala merged commit 85b99d4 into main May 25, 2026
38 of 39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants