feat: [iceberg] allow native Iceberg scans with non-identity transform residuals #2948
base: main
Changes from 2 commits
db2fa02
f41059f
63f4056
d8fa8c9
560887c
3f9b89e
bebab89
21d6521
c0d3839
60115fa
8bb72e1
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -478,29 +478,35 @@ case class CometScanRule(session: SparkSession) extends Rule[SparkPlan] with Com | |
| false | ||
| } | ||
|
|
||
| - // Check for unsupported transform functions in residual expressions | ||
| - // iceberg-rust can only handle identity transforms in residuals; all other transforms | ||
| - // (truncate, bucket, year, month, day, hour) must fall back to Spark | ||
| + // Check for transform functions in residual expressions | ||
| + // Non-identity transforms (truncate, bucket, year, month, day, hour) in residuals | ||
| + // are now supported - they skip row-group filtering and are handled | ||
| + // post-scan by CometFilter. | ||
| + // This is less optimal than row-group filtering but still allows native execution. | ||
| val transformFunctionsSupported = | ||
| try { | ||
| IcebergReflection.findNonIdentityTransformInResiduals(metadata.tasks) match { | ||
| case Some(transformType) => | ||
| - // Found unsupported transform | ||
| - fallbackReasons += | ||
| - s"Iceberg transform function '$transformType' in residual expression " + | ||
| - "is not yet supported by iceberg-rust. " + | ||
| - "Only identity transforms are supported." | ||
| - false | ||
| + // Found non-identity transform - log info and continue with native scan | ||
| + // Row-group filtering will skip these predicates, but post-scan | ||
| + // filtering will apply | ||
| + logInfo( | ||
| + s"Iceberg residual contains transform '$transformType' - " + | ||
| + "row-group filtering will skip this predicate, " + | ||
| + "post-scan filtering by CometFilter will apply instead.") | ||
| + true // Allow native execution | ||
| case None => | ||
| - // No unsupported transforms found - safe to use native execution | ||
| + // No non-identity transforms - optimal row-group filtering will apply | ||
| true | ||
| } | ||
| } catch { | ||
| case e: Exception => | ||
| - // Reflection failure - cannot verify safety, must fall back | ||
| - fallbackReasons += "Iceberg reflection failure: Could not check for " + | ||
| - s"transform functions in residuals: ${e.getMessage}" | ||
| - false | ||
| + // Reflection failure - log warning but allow native execution | ||
| + // The predicate conversion will handle unsupported cases gracefully | ||
| + logWarning( | ||
| + s"Could not check for transform functions in residuals: ${e.getMessage}. " + | ||
| + "Continuing with native scan.") | ||
| + true | ||
|
Contributor
Author
If we cannot find the transform using `IcebergReflection.findNonIdentityTransformInResiduals`, we proceed with the native scan and still get the correct result: Native Scan -> CometFilter -> User Query
Contributor
Author
Reflection fails → "Try native anyway" |
||
| } | ||
|
|
||
| // Check for unsupported struct types in delete files | ||
|
|
||
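The behavioural change in this hunk can be summarised with a small, self-contained sketch. It is not the code in `CometScanRule`: the names `ResidualCheckSketch`, `Decision`, `before`, and `after` are hypothetical, and the `findTransform` parameter merely stands in for `IcebergReflection.findNonIdentityTransformInResiduals(metadata.tasks)`. It assumes, as the diff comments state, that `CometFilter` re-applies the residual predicate after the scan.

```scala
// Sketch only: all names here are hypothetical. `findTransform` stands in for
// IcebergReflection.findNonIdentityTransformInResiduals(metadata.tasks).
object ResidualCheckSketch {

  // Whether the native scan is used, plus the fallback reason or log message.
  final case class Decision(nativeScan: Boolean, note: Option[String])

  // Before the change: a non-identity transform or a reflection failure
  // forces a fallback to Spark.
  def before(findTransform: () => Option[String]): Decision =
    try {
      findTransform() match {
        case Some(t) =>
          Decision(nativeScan = false, Some(s"transform '$t' is not supported"))
        case None =>
          Decision(nativeScan = true, None)
      }
    } catch {
      case e: Exception =>
        Decision(nativeScan = false, Some(s"reflection failure: ${e.getMessage}"))
    }

  // After the change: every branch keeps the native scan. The predicate is
  // skipped for row-group filtering and assumed to be applied post-scan.
  def after(findTransform: () => Option[String]): Decision =
    try {
      findTransform() match {
        case Some(t) =>
          Decision(nativeScan = true, Some(s"transform '$t' is filtered post-scan"))
        case None =>
          Decision(nativeScan = true, None)
      }
    } catch {
      case e: Exception =>
        Decision(nativeScan = true, Some(s"could not check residuals: ${e.getMessage}"))
    }
}
```

For example, `ResidualCheckSketch.before(() => Some("bucket"))` yields a decision with `nativeScan = false`, whereas `ResidualCheckSketch.after(() => Some("bucket"))` keeps `nativeScan = true` and only records a message.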