feat[arrow-ord]: suppport REE comparisons by asubiotto · Pull Request #9621 · apache/arrow-rs

asubiotto · 2026-03-27T10:48:40Z

This commit implements native comparisons on REE-encoded (run-end encoded) arrays, which are treated similarly to dictionary indirection.

This commit implements REE-to-scalar comparisons by operating on the physical values only and then bulk-expanding the boolean result.
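A minimal sketch of the scalar path, not the PR's actual code: each physical value is compared against the scalar once, and the result is then repeated across the run. The function name `eq_ree_scalar` and the plain `run_ends` / `values` slices are assumptions made for illustration; the real kernel works on Arrow buffers and produces a packed boolean buffer.

```rust
/// Compare a run-end encoded column against a scalar.
/// `run_ends[i]` is the exclusive logical end of run `i` (strictly increasing);
/// `values[i]` is the value repeated across that run.
/// (Hypothetical helper for illustration only.)
fn eq_ree_scalar(run_ends: &[usize], values: &[i32], scalar: i32) -> Vec<bool> {
    assert_eq!(run_ends.len(), values.len());
    let logical_len = run_ends.last().copied().unwrap_or(0);
    let mut out = Vec::with_capacity(logical_len);
    let mut start = 0;
    for (&end, &v) in run_ends.iter().zip(values) {
        // One comparison per physical value...
        let hit = v == scalar;
        // ...then bulk-expand the result over the whole run.
        out.extend(std::iter::repeat(hit).take(end - start));
        start = end;
    }
    out
}
```

With run ends `[2, 5, 6]` and values `[7, 3, 7]`, only three comparisons are performed to produce six logical results.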

REE-to-REE comparisons are also optimized by computing aligned physical value runs to minimize comparisons.
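The idea of aligned runs can be sketched roughly as follows, assuming unsliced inputs of equal logical length. The name `eq_ree_ree` and the slice-based layout are illustrative assumptions, not the PR's API: both run lists are walked together, and one comparison is made per segment over which neither side changes value.

```rust
/// Compare two run-end encoded columns of equal logical length.
/// `*_ends[i]` is the exclusive logical end of run `i` on that side.
/// (Hypothetical helper for illustration only.)
fn eq_ree_ree(
    l_ends: &[usize],
    l_vals: &[i32],
    r_ends: &[usize],
    r_vals: &[i32],
) -> Vec<bool> {
    let len = l_ends.last().copied().unwrap_or(0);
    assert_eq!(len, r_ends.last().copied().unwrap_or(0), "logical lengths differ");
    let mut out = Vec::with_capacity(len);
    let (mut li, mut ri, mut pos) = (0, 0, 0);
    while pos < len {
        // The aligned segment ends where the shorter of the two current runs ends.
        let seg_end = l_ends[li].min(r_ends[ri]);
        let hit = l_vals[li] == r_vals[ri];
        out.extend(std::iter::repeat(hit).take(seg_end - pos));
        pos = seg_end;
        // Advance whichever side(s) finished a run at this boundary.
        if l_ends[li] == seg_end {
            li += 1;
        }
        if r_ends[ri] == seg_end {
            ri += 1;
        }
    }
    out
}
```

The number of comparisons is bounded by the sum of the two physical lengths rather than by the logical length.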

Mixed cases (REE vs flat) materialize a logical index mapping similar to dictionaries.
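For the mixed case, a logical index mapping can be pictured as below. This is a sketch under the assumption of an unsliced REE input; `logical_to_physical` and `eq_ree_flat` are hypothetical names, and the real code dispatches on Arrow types rather than `i32` slices.

```rust
/// Expand run ends into a logical-to-physical index map: entry `i` is the
/// position in the REE values array that logical row `i` reads from.
/// (Hypothetical helper for illustration only.)
fn logical_to_physical(run_ends: &[usize]) -> Vec<usize> {
    let mut map = Vec::with_capacity(run_ends.last().copied().unwrap_or(0));
    let mut start = 0;
    for (phys, &end) in run_ends.iter().enumerate() {
        map.extend(std::iter::repeat(phys).take(end - start));
        start = end;
    }
    map
}

/// Compare an REE column against a flat column by looking each logical row
/// up through the index map, much like reading through dictionary keys.
fn eq_ree_flat(run_ends: &[usize], values: &[i32], flat: &[i32]) -> Vec<bool> {
    let map = logical_to_physical(run_ends);
    assert_eq!(map.len(), flat.len(), "logical lengths differ");
    map.iter().zip(flat).map(|(&p, &f)| values[p] == f).collect()
}
```

Unlike the REE-vs-REE and REE-vs-scalar paths, this does one comparison per logical row, so it cannot benefit from long runs to the same degree.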

This commit also supports REE<Dict>.

For comparison, here are the benchmark results with flat arrays as a reference on my local machine:

eq Int32                time:   [14.955 µs 15.162 µs 15.396 µs]
eq scalar Int32         time:   [11.379 µs 11.418 µs 11.459 µs]

ree_comparison/eq_ree_scalar(phys=64,log=65536)     time:   [453.31 ns 454.88 ns 456.43 ns]
ree_comparison/eq_ree_scalar(phys=1024,log=65536)   time:   [4.1224 µs 4.1298 µs 4.1368 µs]
ree_comparison/eq_ree_scalar(phys=32768,log=65536)  time:   [93.506 µs 94.085 µs 94.993 µs]
ree_comparison/eq_ree_ree(phys=64,log=65536)        time:   [413.96 ns 414.82 ns 415.87 ns]
ree_comparison/eq_ree_ree(phys=1024,log=65536)      time:   [4.1597 µs 4.1660 µs 4.1749 µs]
ree_comparison/eq_ree_ree(phys=32768,log=65536)     time:   [128.74 µs 144.40 µs 161.53 µs]

As is expected, the more we take advantage of REE encoding, the faster the comparisons are.

Which issue does this PR close?

Rationale for this change

Feature enhancement for comparisons

What changes are included in this PR?

Support for REE comparisons, including tests and benchmarks.

Are these changes tested?

Yes.

Are there any user-facing changes?

REE comparisons are now supported.

asubiotto · 2026-03-27T11:01:36Z

Unfortunately, I did not see that there is an open PR #9448 before working on this. However, I think this PR is more complete, since #9448 does not seem to support REE with scalar or mixed comparisons. This PR is also more performant:

PR #9448 (REE-vs-REE only, panics on REE-vs-scalar):
  ree_comparison/eq_ree_ree(phys=64,log=65536)      time:   [2.5053 µs 2.5281 µs 2.5516 µs]
  ree_comparison/lt_ree_ree(phys=64,log=65536)       time:   [2.5559 µs 2.5783 µs 2.6024 µs]
  ree_comparison/eq_ree_ree(phys=1024,log=65536)     time:   [9.6708 µs 9.7004 µs 9.7291 µs]
  ree_comparison/lt_ree_ree(phys=1024,log=65536)     time:   [9.4187 µs 9.4457 µs 9.4717 µs]
  ree_comparison/eq_ree_ree(phys=32768,log=65536)    time:   [225.23 µs 228.23 µs 232.46 µs]
  ree_comparison/lt_ree_ree(phys=32768,log=65536)    time:   [200.61 µs 201.16 µs 201.81 µs]

  Ours:
  ree_comparison/eq_ree_scalar(phys=64,log=65536)    time:   [449.04 ns 456.42 ns 464.89 ns]
  ree_comparison/lt_ree_scalar(phys=64,log=65536)    time:   [447.29 ns 449.15 ns 451.32 ns]
  ree_comparison/eq_ree_ree(phys=64,log=65536)       time:   [425.39 ns 426.91 ns 428.48 ns]
  ree_comparison/lt_ree_ree(phys=64,log=65536)       time:   [439.47 ns 440.59 ns 441.69 ns]
  ree_comparison/eq_ree_flat(phys=64,log=65536)      time:   [46.541 µs 46.659 µs 46.783 µs]
  ree_comparison/eq_ree_scalar(phys=1024,log=65536)  time:   [4.4344 µs 4.4435 µs 4.4543 µs]
  ree_comparison/lt_ree_scalar(phys=1024,log=65536)  time:   [5.3170 µs 5.6797 µs 6.1374 µs]
  ree_comparison/eq_ree_ree(phys=1024,log=65536)     time:   [4.7932 µs 4.8049 µs 4.8183 µs]
  ree_comparison/lt_ree_ree(phys=1024,log=65536)     time:   [5.7911 µs 6.7388 µs 7.8086 µs]
  ree_comparison/eq_ree_flat(phys=1024,log=65536)    time:   [51.404 µs 52.651 µs 54.233 µs]
  ree_comparison/eq_ree_scalar(phys=32768,log=65536) time:   [94.964 µs 95.714 µs 96.556 µs]
  ree_comparison/lt_ree_scalar(phys=32768,log=65536) time:   [95.557 µs 96.428 µs 97.527 µs]
  ree_comparison/eq_ree_ree(phys=32768,log=65536)    time:   [137.23 µs 139.33 µs 141.49 µs]
  ree_comparison/lt_ree_ree(phys=32768,log=65536)    time:   [105.79 µs 106.37 µs 107.20 µs]
  ree_comparison/eq_ree_flat(phys=32768,log=65536)   time:   [143.00 µs 215.04 µs 340.63 µs]

cc @alamb @Jefffrey @brunal

brunal · 2026-03-27T11:30:47Z

Super interesting. I suspect that the perf gain might shrink as the number of physical indices grows, since I'm avoiding copying them, but this definitely seems like a better implementation.

brunal · 2026-03-27T11:33:28Z

 fn apply<T: ArrayOrd>(
    op: Op,
    l: T,
    l_s: bool,


Unless I'm mistaken, you can have only one of l_s == true / l_v.is_some() / l_ree.is_some() at a time. How about introducing an enum? It will reduce the arg # and make them clearer

I don't think that's right. Scalars can have REE and/or dictionary types, and l_v and l_ree are both Some for REE.

Ah, right! Still, I think that the too_many_arguments lint is legit, and I'd consider grouping the args for each side -- either an anonymous 3-tuple or a real struct. It would make the call site much cleaner, e.g.

struct ArrayInfo<'a> {
    is_scalar: bool,
    dict_values: &'a dyn AnyDictionaryArray,
    run_array: Option<&'a ReeInfo>,
}

asubiotto · 2026-03-27T13:54:58Z

Thanks for the review @brunal. I addressed all your comments.

brunal · 2026-03-28T21:06:36Z

+    }
+}
+
+fn ree_physical_indices(info: &ReeInfo) -> Vec<usize> {


You can turn this into a .skip().map_while(...) to return an Iterator<Item = usize>. Then, in logical_indices's Some-Some branch, you can map directly on it. This avoids materializing the intermediate physical indices vector.

This is very nitpicky, as the one branch that benefits from this, REE<dict>, seems like quite a niche use case :)

I think I'm going to keep this as is currently written to mirror the dict normalized_keys materialization. We can do this as a follow up improvement if we see it's worth it.
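For readers following the suggestion, a lazy variant could look roughly like the sketch below. The signature is hypothetical: it takes raw run ends plus a slice `offset` and `len` instead of the PR's `ReeInfo`, whose fields are not shown in this thread, and it assumes `offset + len` does not exceed the last run end.

```rust
/// Lazily yield, for each logical row in `offset..offset + len`, the index of
/// the run (physical value) that row reads from.
/// (Hypothetical signature; the PR's `ree_physical_indices` takes `&ReeInfo`.)
fn physical_indices<'a>(
    run_ends: &'a [usize],
    offset: usize,
    len: usize,
) -> impl Iterator<Item = usize> + 'a {
    let stop = offset + len;
    // First run whose end lies past the slice offset.
    let first = run_ends.partition_point(|&end| end <= offset);
    let mut start = offset;
    run_ends
        .iter()
        .enumerate()
        .skip(first)
        .map_while(move |(phys, &end)| {
            if start >= stop {
                return None; // Past the end of the slice: stop iterating runs.
            }
            let seg_end = end.min(stop);
            let run = std::iter::repeat(phys).take(seg_end - start);
            start = seg_end;
            Some(run)
        })
        .flatten()
}
```

A caller that needs dictionary keys could then write something like `physical_indices(ends, off, len).map(|p| keys[p])` without building the intermediate vector, while `.collect::<Vec<_>>()` recovers the materialized form.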

asubiotto · 2026-04-02T21:33:20Z

Updated, apologies for the delay.

brunal

Looks excellent to me, now you just need someone with repo rights :-)

asubiotto · 2026-04-03T21:43:13Z

Thanks again for your review. cc @alamb @Jefffrey this PR is ready for a final review

asubiotto · 2026-04-13T07:09:55Z

Friendly ping for a review or triage @alamb (not sure if there's another code owner)

alamb · 2026-04-13T20:24:40Z

Thanks for the ping @asubiotto -- I'll try and find time to review this, but I am having a hard time finding time

alamb · 2026-04-13T20:24:57Z

But now I see @brunal has already done it, so that will make it easier

asubiotto · 2026-04-14T07:21:48Z

Thanks! Appreciate the time. Yes, @brunal conducted an in-depth review so I only need a final look from a maintainer.

alamb · 2026-04-14T20:32:41Z

run benchmarks comparison_kernels

adriangbot · 2026-04-14T20:34:47Z

🤖 Arrow criterion benchmark running (GKE) | trigger
Instance: c4a-highmem-16 (12 vCPU / 65 GiB) | Linux bench-c4246944340-1260-n77dd 6.12.55+ #1 SMP Sun Feb 1 08:59:41 UTC 2026 aarch64 GNU/Linux

CPU Details (lscpu)

Architecture:                            aarch64
CPU op-mode(s):                          64-bit
Byte Order:                              Little Endian
CPU(s):                                  16
On-line CPU(s) list:                     0-15
Vendor ID:                               ARM
Model name:                              Neoverse-V2
Model:                                   1
Thread(s) per core:                      1
Core(s) per cluster:                     16
Socket(s):                               -
Cluster(s):                              1
Stepping:                                r0p1
BogoMIPS:                                2000.00
Flags:                                   fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma lrcpc dcpop sha3 sm3 sm4 asimddp sha512 sve asimdfhm dit uscat ilrcpc flagm sb paca pacg dcpodp sve2 sveaes svepmull svebitperm svesha3 svesm4 flagm2 frint svei8mm svebf16 i8mm bf16 dgh rng bti
L1d cache:                               1 MiB (16 instances)
L1i cache:                               1 MiB (16 instances)
L2 cache:                                32 MiB (16 instances)
L3 cache:                                80 MiB (1 instance)
NUMA node(s):                            1
NUMA node0 CPU(s):                       0-15
Vulnerability Gather data sampling:      Not affected
Vulnerability Indirect target selection: Not affected
Vulnerability Itlb multihit:             Not affected
Vulnerability L1tf:                      Not affected
Vulnerability Mds:                       Not affected
Vulnerability Meltdown:                  Not affected
Vulnerability Mmio stale data:           Not affected
Vulnerability Reg file data sampling:    Not affected
Vulnerability Retbleed:                  Not affected
Vulnerability Spec rstack overflow:      Not affected
Vulnerability Spec store bypass:         Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:                Mitigation; __user pointer sanitization
Vulnerability Spectre v2:                Mitigation; CSV2, BHB
Vulnerability Srbds:                     Not affected
Vulnerability Tsa:                       Not affected
Vulnerability Tsx async abort:           Not affected
Vulnerability Vmscape:                   Not affected

Comparing asubiotto/reecmp (08e2e39) to ec771cc (merge-base) diff
BENCH_NAME=comparison_kernels
BENCH_COMMAND=cargo bench --features=arrow,async,test_common,experimental,object_store --bench comparison_kernels
BENCH_FILTER=
Results will be posted here when complete

File an issue against this benchmark runner

alamb · 2026-04-14T20:56:54Z

I think there is a bug in this code with sliced arrays. See this PR for a reproducer:

[arrow-ord]: add REE slice offset regression test polarsignals/arrow-rs#5

adriangbot · 2026-04-14T21:11:35Z

🤖 Arrow criterion benchmark completed (GKE) | trigger

Instance: c4a-highmem-16 (12 vCPU / 65 GiB)

Details

group                                                                                                    asubiotto_reecmp                       main
-----                                                                                                    ----------------                       ----
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar complex                    1.00   1972.5±9.90µs        ? ?/sec    1.00  1975.7±13.99µs        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar contains                   1.00      2.1±0.01ms        ? ?/sec    1.01      2.1±0.01ms        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar ends with                  1.00  1263.3±19.78µs        ? ?/sec    1.00  1259.5±21.53µs        ? ?/sec
StringArray: regexp_matches_utf8 scalar benchmarks/regexp_matches_utf8 scalar starts with                1.00  1161.3±20.33µs        ? ?/sec    1.00  1156.7±19.80µs        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar complex        1.00  1993.0±10.14µs        ? ?/sec    1.00  2000.0±11.82µs        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar contains       1.00      2.1±0.01ms        ? ?/sec    1.00      2.1±0.01ms        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar ends with      1.00  1286.9±15.60µs        ? ?/sec    1.00  1287.2±19.07µs        ? ?/sec
StringViewArray: regexp_matches_utf8view scalar benchmarks/regexp_matches_utf8view scalar starts with    1.00  1183.4±20.31µs        ? ?/sec    1.00  1179.1±23.46µs        ? ?/sec
eq Float32                                                                                               1.00     12.5±0.02µs        ? ?/sec    1.00     12.5±0.15µs        ? ?/sec
eq Int32                                                                                                 1.00     12.5±0.02µs        ? ?/sec    1.00     12.5±0.04µs        ? ?/sec
eq MonthDayNano                                                                                          1.00     65.5±0.17µs        ? ?/sec    1.01     66.3±0.20µs        ? ?/sec
eq StringArray StringArray                                                                               1.01     10.4±0.10ms        ? ?/sec    1.00     10.3±0.09ms        ? ?/sec
eq StringViewArray StringViewArray                                                                       1.00     10.0±0.03ms        ? ?/sec    1.03     10.2±0.06ms        ? ?/sec
eq StringViewArray StringViewArray inlined bytes                                                         1.01      7.4±0.02ms        ? ?/sec    1.00      7.4±0.02ms        ? ?/sec
eq dictionary[10] string[4])                                                                             1.01   1350.2±1.88µs        ? ?/sec    1.00   1331.2±1.73µs        ? ?/sec
eq long same prefix strings StringArray                                                                  1.00    533.1±0.41µs        ? ?/sec    1.00    535.6±0.84µs        ? ?/sec
eq long same prefix strings StringViewArray                                                              1.00    654.5±0.64µs        ? ?/sec    1.00    653.4±0.88µs        ? ?/sec
eq scalar Float32                                                                                        1.00     11.4±0.00µs        ? ?/sec    1.00     11.4±0.03µs        ? ?/sec
eq scalar Int32                                                                                          1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
eq scalar MonthDayNano                                                                                   1.00     34.6±0.03µs        ? ?/sec    1.02     35.3±0.06µs        ? ?/sec
eq scalar StringArray                                                                                    1.13      7.6±0.27ms        ? ?/sec    1.00      6.7±0.06ms        ? ?/sec
eq scalar StringViewArray 13 bytes                                                                       1.00      7.0±0.03ms        ? ?/sec    1.00      7.0±0.04ms        ? ?/sec
eq scalar StringViewArray 4 bytes                                                                        1.00      6.3±0.04ms        ? ?/sec    1.00      6.3±0.02ms        ? ?/sec
eq scalar StringViewArray 6 bytes                                                                        1.00      6.3±0.02ms        ? ?/sec    1.00      6.3±0.03ms        ? ?/sec
eq_dyn_utf8_scalar dictionary[10] string[4])                                                             1.00     60.0±1.20µs        ? ?/sec    1.04     62.6±3.23µs        ? ?/sec
gt Float32                                                                                               1.00     21.5±0.02µs        ? ?/sec    1.00     21.5±0.02µs        ? ?/sec
gt Int32                                                                                                 1.00     12.5±0.01µs        ? ?/sec    1.00     12.5±0.03µs        ? ?/sec
gt scalar Float32                                                                                        1.00     16.3±0.00µs        ? ?/sec    1.00     16.3±0.01µs        ? ?/sec
gt scalar Int32                                                                                          1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
gt_eq Float32                                                                                            1.00     21.5±0.02µs        ? ?/sec    1.00     21.5±0.02µs        ? ?/sec
gt_eq Int32                                                                                              1.00     12.5±0.01µs        ? ?/sec    1.00     12.5±0.03µs        ? ?/sec
gt_eq scalar Float32                                                                                     1.00     16.3±0.01µs        ? ?/sec    1.00     16.3±0.01µs        ? ?/sec
gt_eq scalar Int32                                                                                       1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
gt_eq_dyn_utf8_scalar scalar dictionary[10] string[4])                                                   1.00     60.0±1.50µs        ? ?/sec    1.01     60.3±1.52µs        ? ?/sec
ilike_utf8 scalar complex                                                                                1.00      2.3±0.04ms        ? ?/sec    1.00      2.3±0.03ms        ? ?/sec
ilike_utf8 scalar contains                                                                               1.00      3.9±0.02ms        ? ?/sec    1.00      3.9±0.03ms        ? ?/sec
ilike_utf8 scalar ends with                                                                              1.00    766.9±3.29µs        ? ?/sec    1.00    767.2±6.76µs        ? ?/sec
ilike_utf8 scalar equals                                                                                 1.00    611.5±5.66µs        ? ?/sec    1.00    609.2±3.72µs        ? ?/sec
ilike_utf8 scalar starts with                                                                            1.00    775.6±5.25µs        ? ?/sec    1.00    776.2±5.60µs        ? ?/sec
ilike_utf8_scalar_dyn dictionary[10] string[4])                                                          1.00     59.6±0.06µs        ? ?/sec    1.00     59.6±0.09µs        ? ?/sec
like_utf8 scalar complex                                                                                 1.00  1572.8±31.05µs        ? ?/sec    1.00  1574.8±29.50µs        ? ?/sec
like_utf8 scalar contains                                                                                1.00   1319.5±4.85µs        ? ?/sec    1.00   1316.2±2.94µs        ? ?/sec
like_utf8 scalar ends with                                                                               1.00    242.4±0.30µs        ? ?/sec    1.00   241.6±17.38µs        ? ?/sec
like_utf8 scalar equals                                                                                  1.00     47.3±0.03µs        ? ?/sec    1.00     47.3±0.05µs        ? ?/sec
like_utf8 scalar starts with                                                                             1.00    231.7±0.30µs        ? ?/sec    1.00    231.3±0.17µs        ? ?/sec
like_utf8_scalar_dyn dictionary[10] string[4])                                                           1.03     61.4±1.83µs        ? ?/sec    1.00     59.4±0.04µs        ? ?/sec
like_utf8view scalar complex                                                                             1.00    169.8±2.30ms        ? ?/sec    1.01    171.2±2.38ms        ? ?/sec
like_utf8view scalar contains                                                                            1.00    137.8±0.72ms        ? ?/sec    1.00    137.6±0.73ms        ? ?/sec
like_utf8view scalar ends with 13 bytes                                                                  1.00     27.9±0.07ms        ? ?/sec    1.01     28.1±0.09ms        ? ?/sec
like_utf8view scalar ends with 4 bytes                                                                   1.00     27.7±0.07ms        ? ?/sec    1.00     27.7±0.08ms        ? ?/sec
like_utf8view scalar ends with 6 bytes                                                                   1.00     27.7±0.07ms        ? ?/sec    1.00     27.8±0.12ms        ? ?/sec
like_utf8view scalar equals                                                                              1.00     19.0±0.03ms        ? ?/sec    1.00     19.0±0.04ms        ? ?/sec
like_utf8view scalar starts with 13 bytes                                                                1.00     25.2±0.12ms        ? ?/sec    1.00     25.2±0.11ms        ? ?/sec
like_utf8view scalar starts with 4 bytes                                                                 1.00     16.2±0.06ms        ? ?/sec    1.02     16.5±0.06ms        ? ?/sec
like_utf8view scalar starts with 6 bytes                                                                 1.00     24.7±0.06ms        ? ?/sec    1.00     24.7±0.06ms        ? ?/sec
long same prefix strings like_utf8 scalar complex                                                        1.00  1131.9±18.84µs        ? ?/sec    1.01  1141.0±17.65µs        ? ?/sec
long same prefix strings like_utf8 scalar contains                                                       1.00      3.4±0.04ms        ? ?/sec    1.00      3.4±0.03ms        ? ?/sec
long same prefix strings like_utf8 scalar ends with                                                      1.00  1448.8±30.08µs        ? ?/sec    1.01  1467.5±26.94µs        ? ?/sec
long same prefix strings like_utf8 scalar equals                                                         1.00   479.2±10.01µs        ? ?/sec    1.00    480.1±8.33µs        ? ?/sec
long same prefix strings like_utf8 scalar starts with                                                    1.00  1635.1±33.07µs        ? ?/sec    1.00  1635.2±28.10µs        ? ?/sec
long same prefix strings like_utf8view scalar complex                                                    1.00  1156.6±12.62µs        ? ?/sec    1.00  1157.9±13.09µs        ? ?/sec
long same prefix strings like_utf8view scalar contains                                                   1.00      3.4±0.03ms        ? ?/sec    1.00      3.4±0.03ms        ? ?/sec
long same prefix strings like_utf8view scalar ends with                                                  1.00  1447.4±17.44µs        ? ?/sec    1.00  1452.0±20.73µs        ? ?/sec
long same prefix strings like_utf8view scalar equals                                                     1.00    495.4±5.08µs        ? ?/sec    1.00    496.6±6.02µs        ? ?/sec
long same prefix strings like_utf8view scalar starts with                                                1.01  1655.1±17.68µs        ? ?/sec    1.00  1642.7±14.16µs        ? ?/sec
lt Float32                                                                                               1.00     21.5±0.02µs        ? ?/sec    1.00     21.5±0.02µs        ? ?/sec
lt Int32                                                                                                 1.00     12.5±0.01µs        ? ?/sec    1.00     12.5±0.02µs        ? ?/sec
lt StringViewArray StringViewArray inlined bytes                                                         1.00     11.4±0.14ms        ? ?/sec    1.00     11.4±0.11ms        ? ?/sec
lt long same prefix strings StringArray                                                                  1.00    596.8±0.58µs        ? ?/sec    1.00    594.3±0.79µs        ? ?/sec
lt long same prefix strings StringViewArray                                                              1.00    688.5±0.61µs        ? ?/sec    1.00    686.9±0.86µs        ? ?/sec
lt scalar Float32                                                                                        1.00     16.3±0.01µs        ? ?/sec    1.00     16.3±0.02µs        ? ?/sec
lt scalar Int32                                                                                          1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
lt scalar StringArray                                                                                    1.04     26.6±0.02ms        ? ?/sec    1.00     25.6±0.02ms        ? ?/sec
lt scalar StringViewArray                                                                                1.00     18.8±0.12ms        ? ?/sec    1.00     18.7±0.06ms        ? ?/sec
lt_eq Float32                                                                                            1.00     21.5±0.02µs        ? ?/sec    1.00     21.5±0.03µs        ? ?/sec
lt_eq Int32                                                                                              1.00     12.5±0.01µs        ? ?/sec    1.00     12.5±0.04µs        ? ?/sec
lt_eq scalar Float32                                                                                     1.00     16.3±0.01µs        ? ?/sec    1.02     16.6±0.30µs        ? ?/sec
lt_eq scalar Int32                                                                                       1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
neq Float32                                                                                              1.00     12.5±0.02µs        ? ?/sec    1.01     12.6±0.02µs        ? ?/sec
neq Int32                                                                                                1.00     12.5±0.01µs        ? ?/sec    1.00     12.5±0.03µs        ? ?/sec
neq long same prefix strings StringArray                                                                 1.00    532.9±0.70µs        ? ?/sec    1.01    535.9±1.05µs        ? ?/sec
neq long same prefix strings StringViewArray                                                             1.00    654.8±0.77µs        ? ?/sec    1.00    653.0±0.70µs        ? ?/sec
neq scalar Float32                                                                                       1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
neq scalar Int32                                                                                         1.00     11.4±0.01µs        ? ?/sec    1.00     11.4±0.01µs        ? ?/sec
nilike_utf8 scalar complex                                                                               1.00      2.2±0.03ms        ? ?/sec    1.03      2.3±0.03ms        ? ?/sec
nilike_utf8 scalar contains                                                                              1.00      3.9±0.02ms        ? ?/sec    1.01      3.9±0.02ms        ? ?/sec
nilike_utf8 scalar ends with                                                                             1.00    768.0±5.25µs        ? ?/sec    1.04    796.9±8.68µs        ? ?/sec
nilike_utf8 scalar equals                                                                                1.00    609.3±3.69µs        ? ?/sec    1.02    622.4±7.30µs        ? ?/sec
nilike_utf8 scalar starts with                                                                           1.00    774.8±4.02µs        ? ?/sec    1.05   809.8±16.44µs        ? ?/sec
nlike_utf8 scalar complex                                                                                1.00  1572.9±30.22µs        ? ?/sec    1.00  1574.7±28.68µs        ? ?/sec
nlike_utf8 scalar contains                                                                               1.00   1319.6±4.09µs        ? ?/sec    1.00   1319.0±3.32µs        ? ?/sec
nlike_utf8 scalar ends with                                                                              1.00    241.5±0.72µs        ? ?/sec    1.00    241.6±0.41µs        ? ?/sec
nlike_utf8 scalar equals                                                                                 1.00     47.3±0.02µs        ? ?/sec    1.00     47.3±0.02µs        ? ?/sec
nlike_utf8 scalar starts with                                                                            1.00    231.7±0.26µs        ? ?/sec    1.00    231.2±0.29µs        ? ?/sec
ree_comparison/eq_ree_flat(phys=1024,log=65536)                                                          1.00     59.5±0.12µs        ? ?/sec  
ree_comparison/eq_ree_flat(phys=32768,log=65536)                                                         1.00    107.3±0.38µs        ? ?/sec  
ree_comparison/eq_ree_flat(phys=64,log=65536)                                                            1.00     58.7±0.10µs        ? ?/sec  
ree_comparison/eq_ree_ree(phys=1024,log=65536)                                                           1.00      4.0±0.02µs        ? ?/sec  
ree_comparison/eq_ree_ree(phys=32768,log=65536)                                                          1.00    150.8±0.18µs        ? ?/sec  
ree_comparison/eq_ree_ree(phys=64,log=65536)                                                             1.00    497.7±1.99ns        ? ?/sec  
ree_comparison/eq_ree_scalar(phys=1024,log=65536)                                                        1.00      3.6±0.14µs        ? ?/sec  
ree_comparison/eq_ree_scalar(phys=32768,log=65536)                                                       1.00     92.7±4.73µs        ? ?/sec  
ree_comparison/eq_ree_scalar(phys=64,log=65536)                                                          1.00    483.3±5.91ns        ? ?/sec  
ree_comparison/lt_ree_ree(phys=1024,log=65536)                                                           1.00      3.9±0.04µs        ? ?/sec  
ree_comparison/lt_ree_ree(phys=32768,log=65536)                                                          1.00    100.7±1.11µs        ? ?/sec  
ree_comparison/lt_ree_ree(phys=64,log=65536)                                                             1.00    487.4±1.71ns        ? ?/sec  
ree_comparison/lt_ree_scalar(phys=1024,log=65536)                                                        1.00      3.6±0.13µs        ? ?/sec  
ree_comparison/lt_ree_scalar(phys=32768,log=65536)                                                       1.00     92.7±4.72µs        ? ?/sec  
ree_comparison/lt_ree_scalar(phys=64,log=65536)                                                          1.00    490.3±7.43ns        ? ?/sec

Resource Usage

base (merge-base)

Metric	Value
Wall time	999.0s
Peak memory	3.8 GiB
Avg memory	3.3 GiB
CPU user	997.5s
CPU sys	1.4s
Peak spill	0 B

branch

Metric	Value
Wall time	1147.7s
Peak memory	3.8 GiB
Avg memory	3.4 GiB
CPU user	1146.9s
CPU sys	0.7s
Peak spill	0 B

asubiotto · 2026-04-15T07:31:23Z

Thanks! I will take a look and update you when it's fixed.

asubiotto · 2026-04-15T13:11:59Z

Fixed and added your test. We were incorrectly computing the expansion multiple for the comparison by using only the left offset as pos. We're now using a logical position that starts at 0 and advances to the end of each segment, i.e. the end of the shorter of the two current runs.
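The fix described above can be illustrated with a minimal sketch. It is not the PR's code: `ReeSlice`, `run_at`, and `eq_ree_segments` are hypothetical names, plain slices stand in for arrow arrays, and nulls are ignored. The point it shows is that both sides share one logical position starting at 0, and each side's run end is translated into slice-relative coordinates using its own offset.

```rust
/// Hypothetical, simplified view of a sliced run-end encoded array.
/// `run_ends` holds the cumulative logical end of each run in the unsliced
/// array; `offset` and `len` describe the logical slice.
struct ReeSlice<'a> {
    run_ends: &'a [usize],
    values: &'a [i32],
    offset: usize,
    len: usize,
}

impl ReeSlice<'_> {
    /// Index of the physical run containing slice-relative position `pos`.
    fn run_at(&self, pos: usize) -> usize {
        self.run_ends.partition_point(|&end| end <= pos + self.offset)
    }
}

/// Equality over two sliced REE inputs: one comparison per aligned segment,
/// with the result repeated for the length of that segment.
fn eq_ree_segments(l: &ReeSlice, r: &ReeSlice) -> Vec<bool> {
    assert_eq!(l.len, r.len, "inputs must have the same logical length");
    let mut out = Vec::with_capacity(l.len);
    if l.len == 0 {
        return out;
    }
    let (mut li, mut ri) = (l.run_at(0), r.run_at(0));
    // Logical position shared by both sides; it starts at 0, not at an offset.
    let mut pos = 0;
    while pos < l.len {
        // Each side subtracts its own offset to get a slice-relative run end.
        let l_end = l.run_ends[li] - l.offset;
        let r_end = r.run_ends[ri] - r.offset;
        // The segment ends at the shorter run, clamped to the slice length.
        let seg_end = l_end.min(r_end).min(l.len);
        let result = l.values[li] == r.values[ri];
        out.extend(std::iter::repeat(result).take(seg_end - pos));
        if l_end == seg_end {
            li += 1;
        }
        if r_end == seg_end {
            ri += 1;
        }
        pos = seg_end;
    }
    out
}
```

With different offsets on each side, a position derived only from the left offset would give the wrong repeat count for the right side's runs, which is the kind of bug the regression test targets.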

vegarsti · 2026-04-16T12:24:24Z

FYI there's a p too many in the title: "suppport"!

alamb

I think this PR looks good to me in general. I would like to see the REE code isolated a little more before merging to make understanding the comparison logic easier, but otherwise I think it is good to go

Thank you for bearing with me @asubiotto and @vegarsti

alamb · 2026-04-16T12:57:27Z

        )));
    }

+    let l_side = SideInfo {


If we are going to introduce a struct like this, perhaps as a follow-on PR we could refactor all the arguments to apply into SideInfo and then make apply a function of SideInfo.

Possibly, but I'm not sure this will make the code clearer. As currently written, SideInfo can be constructed only once, before the type dispatch. I do think we could probably clean up this code as a whole, but I would need to think more deeply about how. I can create an issue, but I don't like doing so if there isn't a clear actionable item.

alamb · 2026-04-16T12:59:39Z

+    {
+        // Both non-scalar with indirection. Pure REE-vs-REE uses segment-based
+        // bulk comparison; other combinations fall back to index vectors.
+        if let (Some(li), None, Some(ri), None) = (l_info.ree, l_info.dict, r_info.ree, r_info.dict)


Can we please refactor this into its own function, as it is REE specific and I think it makes cmp harder to read?
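For context on the "fall back to index vectors" path mentioned in the diff comment, here is a minimal sketch of what such a fallback could look like for an REE-vs-flat comparison. It assumes plain slices instead of arrow arrays, no nulls, and no slice offset; `logical_to_physical` and `lt_ree_flat` are hypothetical names, not functions from the PR.

```rust
/// Expand cumulative `run_ends` into one physical index per logical row,
/// playing the same role as dictionary keys. Hypothetical helper.
fn logical_to_physical(run_ends: &[usize]) -> Vec<usize> {
    let mut indices = Vec::with_capacity(run_ends.last().copied().unwrap_or(0));
    let mut start = 0;
    for (phys, &end) in run_ends.iter().enumerate() {
        // Every logical row in [start, end) maps to physical value `phys`.
        indices.extend(std::iter::repeat(phys).take(end - start));
        start = end;
    }
    indices
}

/// Less-than between an REE-encoded column and a flat column, resolving the
/// REE side through the materialized index vector.
fn lt_ree_flat(run_ends: &[usize], values: &[i32], flat: &[i32]) -> Vec<bool> {
    let indices = logical_to_physical(run_ends);
    assert_eq!(indices.len(), flat.len(), "logical lengths must match");
    indices
        .iter()
        .zip(flat)
        .map(|(&i, &rhs)| values[i] < rhs)
        .collect()
}
```

This path does one comparison per logical row, so it gives up the per-run savings of the segment-based approach, which fits the slower eq_ree_flat numbers in the benchmark output.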

alamb · 2026-04-16T13:00:58Z


+/// Compare two REE arrays by walking both run_ends simultaneously, comparing
+/// once per aligned segment and bulk-filling the result.
+fn apply_op_segments<T: ArrayOrd>(


Can we please name this something with ree in the name so it is clearer it is ree specific?

asubiotto · 2026-04-17T07:52:24Z

Addressed your comments @alamb, this should be good to go

github-actions bot added the arrow (Changes to the arrow crate) label Mar 27, 2026

asubiotto force-pushed the asubiotto/reecmp branch from 79d81af to eee179c Compare March 27, 2026 11:06

brunal reviewed Mar 27, 2026

asubiotto force-pushed the asubiotto/reecmp branch 2 times, most recently from 85034d0 to 791ec8c Compare March 27, 2026 13:48

brunal reviewed Mar 28, 2026

brunal mentioned this pull request Mar 31, 2026

Implement comparisons for RunArray. #9448

Closed

asubiotto force-pushed the asubiotto/reecmp branch from 791ec8c to a3669d9 Compare April 2, 2026 21:18

brunal approved these changes Apr 3, 2026

asubiotto force-pushed the asubiotto/reecmp branch from a3669d9 to 08e2e39 Compare April 3, 2026 21:42

alamb mentioned this pull request Apr 14, 2026

[arrow-ord]: add REE slice offset regression test polarsignals/arrow-rs#5

Draft

asubiotto force-pushed the asubiotto/reecmp branch from 08e2e39 to 1ad1b90 Compare April 15, 2026 12:11

alamb approved these changes Apr 16, 2026

asubiotto force-pushed the asubiotto/reecmp branch from 1ad1b90 to cfc2a0a Compare April 17, 2026 07:34

asubiotto force-pushed the asubiotto/reecmp branch from cfc2a0a to a0a7521 Compare April 17, 2026 10:54
