[SPARK-55502][PYTHON] Unify UDF and UDTF Arrow conversion error handling
### What changes were proposed in this pull request?
Backport SPARK-55502 to branch-4.0: unify error messages for UDF and UDTF Arrow conversion errors to match master.
**Key changes**:
- UDF path: updated error messages from "Exception thrown when converting pandas.Series..." to user-friendly "Failed to convert..." / "Cannot convert..." format
- UDTF path: replaced `UDTF_ARROW_TYPE_CAST_ERROR` error class with "Exception thrown when converting pandas.Series..." format (matching master's legacy path)
- Removed unused `UDTF_ARROW_TYPE_CAST_ERROR` error condition
- Updated test expectations to match new error messages
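
The pattern behind the first bullet can be sketched roughly as follows: catch the low-level conversion failure and re-raise it under a user-facing prefix. This is a minimal illustration, not the PySpark implementation. `ConversionError`, `convert_column`, and everything in the message after the `Failed to convert` prefix are hypothetical, and the built-in `int` stands in for the Arrow cast.

```python
class ConversionError(Exception):
    """Hypothetical stand-in for the exception PySpark raises on a failed conversion."""


def convert_column(values, caster, type_name):
    """Cast every value with `caster`, wrapping low-level failures in one message format.

    `caster` is any callable (e.g. `int`); it plays the role of the Arrow cast here.
    """
    try:
        return [caster(v) for v in values]
    except (ValueError, TypeError) as exc:
        # Only the "Failed to convert" prefix comes from the PR description;
        # the remainder of this message is illustrative.
        raise ConversionError(
            f"Failed to convert values to the expected type {type_name!r}: {exc}"
        ) from exc
```

Chaining with `from exc` keeps the original exception available as `__cause__`, so the friendlier message does not hide the underlying failure.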
### Why are the changes needed?
The cross-version CI test (master-server + branch-4.0-client) fails because master updated the error messages in SPARK-55502, but branch-4.0 tests still expect the old format.
### Does this PR introduce _any_ user-facing change?
Yes. Error messages for UDF and UDTF Arrow conversion errors change, matching master.
### How was this patch tested?
Updated existing unit tests.
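
As a rough idea of what an updated expectation might look like, the sketch below matches on the new message prefix instead of the old one. It is not taken from the PySpark test suite: `run_udf_stub` and the test class are hypothetical, and the stub's message text beyond the `Failed to convert` prefix is made up.

```python
import unittest


def run_udf_stub():
    # Hypothetical stand-in for running a UDF whose result cannot be converted.
    raise RuntimeError("Failed to convert the value to the declared return type")


class ArrowConversionMessageTest(unittest.TestCase):
    def test_new_prefix_is_expected(self):
        # The expectation now anchors on the new, user-friendly prefix.
        with self.assertRaisesRegex(RuntimeError, r"^Failed to convert"):
            run_udf_stub()

    def test_old_prefix_is_gone(self):
        with self.assertRaises(RuntimeError) as ctx:
            run_udf_stub()
        self.assertNotIn(
            "Exception thrown when converting pandas.Series", str(ctx.exception)
        )
```

Matching on a prefix rather than the full message keeps such tests less brittle if the wording after the prefix differs between branches.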
### Was this patch authored or co-authored using generative AI tooling?
Yes
python/pyspark/errors/error-conditions.json (0 additions, 5 deletions)
@@ -990,11 +990,6 @@
       "Return type of the user-defined function should be <expected>, but is <actual>."
     ]
   },
-  "UDTF_ARROW_TYPE_CAST_ERROR": {
-    "message": [
-      "Cannot convert the output value of the column '<col_name>' with type '<col_type>' to the specified return type of the column: '<arrow_type>'. Please check if the data types match and try again."
       "Failed to evaluate the user-defined table function '<name>' because its constructor is invalid: the function implements the 'analyze' method, but its constructor has more than two arguments (including the 'self' reference). Please update the table function so that its constructor accepts exactly one 'self' argument, or one 'self' argument plus another argument for the result of the 'analyze' method, and try the query again."