Describe the enhancement requested
Currently, trying to read a bloom filter from an encrypted Parquet file raises an exception.
It would be nice to implement this at some point. Two pieces of data need to be decrypted separately: the Thrift-serialized bloom filter header ("BloomFilter Header" with module id 8), and the bloom filter data that follows it ("BloomFilter Bitset" with module id 9).
Some inspiration can be found in the PageIndex implementation: see arrow/cpp/src/parquet/page_index.cc (lines 259 to 269 in b2e8f25):

    // Get decryptor of column index if encrypted.
    std::unique_ptr<Decryptor> decryptor =
        InternalFileDecryptor::GetColumnMetaDecryptorFactory(
            file_decryptor_, col_chunk->crypto_metadata().get())();
    if (decryptor != nullptr) {
      UpdateDecryptor(decryptor.get(), row_group_ordinal_, /*column_ordinal=*/i,
                      encryption::kColumnIndex);
    }

    return ColumnIndex::Make(*descr, column_index_buffer_->data() + buffer_offset, length,
                             properties_, decryptor.get());

and arrow/cpp/src/parquet/page_index.cc (lines 970 to 973 in b2e8f25):

    format::ColumnIndex column_index;
    ThriftDeserializer deserializer(properties);
    deserializer.DeserializeMessage(reinterpret_cast<const uint8_t*>(serialized_index),
                                    &index_len, &column_index, decryptor);
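
To make the "decrypted separately" point concrete, here is a minimal standalone sketch of how the two bloom filter modules would get different AADs. It assumes the module AAD layout from the Parquet modular encryption specification as I understand it: the file AAD, then the module type (1 byte), the row group ordinal and the column ordinal (2 bytes each, little endian), with no page ordinal for bloom filter modules. `MakeBloomFilterAad` and the two constants are hypothetical names for illustration, not existing Arrow APIs; a real implementation would presumably go through the existing decryptor helpers shown above.

```cpp
#include <cstdint>
#include <string>

// Module type ids as quoted in this issue (hypothetical constant names).
constexpr int8_t kBloomFilterHeaderModule = 8;
constexpr int8_t kBloomFilterBitsetModule = 9;

// Hypothetical helper: builds the AAD for one bloom filter module.
// Assumed layout: file AAD | module type (1 byte) |
// row group ordinal (int16, little endian) | column ordinal (int16, little endian).
std::string MakeBloomFilterAad(const std::string& file_aad, int8_t module_type,
                               int16_t row_group_ordinal, int16_t column_ordinal) {
  std::string aad = file_aad;
  aad.push_back(static_cast<char>(module_type));
  // Append both ordinals low byte first (little endian).
  aad.push_back(static_cast<char>(row_group_ordinal & 0xFF));
  aad.push_back(static_cast<char>((row_group_ordinal >> 8) & 0xFF));
  aad.push_back(static_cast<char>(column_ordinal & 0xFF));
  aad.push_back(static_cast<char>((column_ordinal >> 8) & 0xFF));
  return aad;
}
```

Under these assumptions the header and the bitset of the same column chunk share every AAD byte except the module type, so the decryptor has to be updated once with module id 8 before deserializing the Thrift header and again with module id 9 before decrypting the bitset.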
Component(s)
C++, Parquet