[flink][spark] Fix partition pruning for non-string partition keys #3322
fresh-borzoni wants to merge 3 commits into
Conversation
Partition predicate pushdown stringified literals and partition values before evaluation, so range comparisons fell back to lexicographic string order. An INT partition column with values 2 and 10 under `WHERE pt > 2` compared `"10" < "2"` lexicographically and incorrectly dropped partition 10.

This PR adds `PartitionUtils.toPartitionRow` and `PartitionUtils.partitionRowType` in fluss-common, uses them from `SparkPartitionPredicate` and `FlinkSourceEnumerator`, drops the stringify step in `FlinkTableSource`, and deletes `StringifyPredicateVisitor`.

The stringifier was also hiding two latent gaps in `LeafPredicate.get`: `BYTES` had no case (`UnsupportedOperationException`) and `TIMESTAMP_WITH_LOCAL_TIME_ZONE` used `getTimestampNtz` instead of `getTimestampLtz` (`ClassCastException`). Both are exercised by `testStreamingReadAllPartitionTypePushDown` and are fixed in the same file.

A regression test for the partition pruning bug, using an INT partition column and a range predicate, is added in `SparkLogTableReadTest` and `FlinkTableSourceITCase`.

Closes apache#3292.
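The faulty comparison is easy to reproduce in isolation. This standalone sketch (not Fluss code) contrasts lexicographic ordering of stringified partition values with typed comparison:

```java
public class PartitionCompareDemo {
    public static void main(String[] args) {
        // Stringified evaluation: "10" sorts before "2" lexicographically
        // ('1' < '2'), so the pruner concludes 10 <= 2 and drops pt=10.
        boolean stringifiedKept = "10".compareTo("2") > 0;   // false
        // Typed evaluation: the INT values compare numerically.
        boolean typedKept = Integer.compare(10, 2) > 0;      // true
        System.out.println(stringifiedKept + " " + typedKept); // prints "false true"
    }
}
```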
force-pushed from 5b6aee7 to e83a839
@Yohahaha @YannByron @luoyuxia PTAL 🙏
Yohahaha
left a comment
parts of Spark LGTM! thank you for the fix!
luoyuxia
left a comment
@fresh-borzoni Thanks for the PR. Only one comment. PTAL
Pull request overview
Fixes partition pruning when a partition key is non-STRING (e.g., INT). Previously, both Flink and Spark connectors stringified literals and partition values before evaluating predicates, so range comparisons (e.g., pt > 2) used lexicographic order and could drop partitions like pt=10. The fix builds typed partition rows via new shared helpers and removes the stringification path. A latent LeafPredicate.get bug for TIMESTAMP_WITH_LOCAL_TIME_ZONE (and missing BYTES case) is also corrected, since these are now actually exercised.
Changes:
- Add `PartitionUtils.partitionRowType` and `PartitionUtils.toPartitionRow` to build typed partition rows; reuse them from Flink and Spark.
- Drop literal stringification in Flink (`StringifyPredicateVisitor` deleted) and Spark (`stringifyLiterals` removed); pass typed predicates straight through.
- Fix `LeafPredicate.get` to use `getTimestampLtz` for LTZ and to handle `BYTES`; add Flink and Spark integration tests for INT-partition range pushdown.
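The real helpers operate on Fluss's internal row and type abstractions; as a rough, self-contained illustration of the idea (all names below are hypothetical stand-ins, not the actual Fluss API), converting raw partition-value strings into typed fields before predicate evaluation might look like:

```java
import java.util.ArrayList;
import java.util.List;

public class TypedPartitionRowSketch {
    enum SimpleType { INT, BIGINT, STRING }

    // Hypothetical stand-in for PartitionUtils.toPartitionRow: parse each raw
    // partition value according to its column type instead of leaving
    // everything as a String.
    static List<Object> toPartitionRow(List<String> rawValues, List<SimpleType> types) {
        List<Object> row = new ArrayList<>();
        for (int i = 0; i < rawValues.size(); i++) {
            String raw = rawValues.get(i);
            switch (types.get(i)) {
                case INT:    row.add(Integer.parseInt(raw)); break;
                case BIGINT: row.add(Long.parseLong(raw));   break;
                default:     row.add(raw);
            }
        }
        return row;
    }

    public static void main(String[] args) {
        List<Object> row = toPartitionRow(List.of("10"), List.of(SimpleType.INT));
        // A predicate like pt > 2 now sees an Integer, so 10 > 2 holds.
        System.out.println(((Integer) row.get(0)) > 2); // prints "true"
    }
}
```

With typed rows in place, the existing `LeafPredicate` evaluation logic applies its native per-type comparisons, so no stringification step is needed on either connector.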
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| fluss-common/src/main/java/org/apache/fluss/utils/PartitionUtils.java | Adds shared partitionRowType and toPartitionRow helpers used by both connectors. |
| fluss-common/src/main/java/org/apache/fluss/predicate/LeafPredicate.java | Fixes LTZ extraction and adds BYTES case so typed partition rows evaluate correctly. |
| fluss-flink/.../source/FlinkTableSource.java | Stops stringifying pushed-down partition predicate literals. |
| fluss-flink/.../source/enumerator/FlinkSourceEnumerator.java | Builds typed partition row using new helpers when applying the partition filter. |
| fluss-flink/.../utils/StringifyPredicateVisitor.java | Removed; no longer needed after typed-row evaluation. |
| fluss-flink/.../source/FlinkTableSourceITCase.java | Adds INT partition range-predicate pushdown integration test. |
| fluss-spark/.../utils/SparkPartitionPredicate.scala | Drops local helpers/stringifier and delegates to PartitionUtils; threads tableInfo through filterPartitions. |
| fluss-spark/.../read/FlussBatch.scala | Passes tableInfo into the new filterPartitions signature. |
| fluss-spark/.../SparkLogTableReadTest.scala | Adds Spark INT partition range-predicate pushdown test. |
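The `ClassCastException` that the stringifier was masking in `LeafPredicate.get` follows a common pattern: an internal row stores field values as objects, and calling the NTZ getter on an LTZ column casts to the wrong class. Fluss uses its own internal timestamp classes; this sketch uses `java.time` types purely to illustrate the failure mode:

```java
import java.time.Instant;
import java.time.LocalDateTime;

public class GetterMismatchDemo {
    // Minimal stand-in for an internal row storing field values as Object.
    static Object[] fields = { Instant.parse("2024-01-01T00:00:00Z") };

    // Analogous to calling getTimestampNtz on an LTZ column: the stored
    // value is an Instant, not a LocalDateTime, so the cast fails at runtime.
    static LocalDateTime getTimestampNtz(int pos) {
        return (LocalDateTime) fields[pos];
    }

    public static void main(String[] args) {
        try {
            getTimestampNtz(0);
        } catch (ClassCastException e) {
            System.out.println("ClassCastException: wrong getter for LTZ value");
        }
    }
}
```

Switching to the LTZ getter (the `getTimestampLtz` fix in this PR) makes the extraction match the stored representation, and adding a `BYTES` case removes the remaining `UnsupportedOperationException` path.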
force-pushed from b379de7 to b646912
force-pushed from b646912 to abeeff4
@luoyuxia addressed comments, PTAL 🙏
Summary
Closes #3292