Is your feature request related to a problem or challenge?
Right now, PartitionedFile has an extensions field that holds an optional, type-erased value:
pub struct PartitionedFile {
// other fields
pub extensions: Option<Arc<dyn std::any::Any + Send + Sync>>,
}
This means you can store only one value in extensions.
Other parts of DataFusion that use this pattern use a type map, like SessionConfig:
#[derive(Clone, Debug)]
pub struct SessionConfig {
extensions: AnyMap,
}
In addition, if we want to provide an access plan, the machinery in the Parquet opener needs this value to be a ParquetAccessPlan:
if let Some(access_plan) = extensions.downcast_ref::<ParquetAccessPlan>() {
let plan_len = access_plan.len();
if plan_len != row_group_count {
return exec_err!(
"Invalid ParquetAccessPlan for {file_name}. Specified {plan_len} row groups, but file has {row_group_count}"
);
}
// check row group count matches the plan
return Ok(access_plan.clone());
}
As a result, if you want to pre-provide your access plans, you can't store any other value in the extensions field.
Describe the solution you'd like
PartitionedFile should include a type map in it:
pub struct PartitionedFile {
// other fields
pub extensions: AnyMap,
}
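To illustrate what a type map would allow, here is a minimal sketch of one keyed by TypeId. The names Extensions, AccessPlan and CacheHint are hypothetical stand-ins made up for this example; this is not DataFusion's actual AnyMap or ParquetAccessPlan, just an illustration of storing several independently typed values side by side.

```rust
use std::any::{Any, TypeId};
use std::collections::HashMap;
use std::sync::Arc;

/// Hypothetical type map: holds at most one value per concrete type.
#[derive(Clone, Default)]
pub struct Extensions {
    inner: HashMap<TypeId, Arc<dyn Any + Send + Sync>>,
}

impl Extensions {
    /// Store `value`, replacing any earlier value of the same type.
    pub fn insert<T: Any + Send + Sync>(&mut self, value: Arc<T>) {
        // Erase the concrete type before storing it in the map.
        let erased: Arc<dyn Any + Send + Sync> = value;
        self.inner.insert(TypeId::of::<T>(), erased);
    }

    /// Look up the value stored for type `T`, if there is one.
    pub fn get<T: Any + Send + Sync>(&self) -> Option<&T> {
        self.inner.get(&TypeId::of::<T>())?.downcast_ref::<T>()
    }
}

/// Hypothetical stand-in for an access plan (one flag per row group).
pub struct AccessPlan(pub Vec<bool>);

/// Hypothetical second extension that a user might want to attach.
pub struct CacheHint(pub String);
```

With a structure like this, a caller could attach both an access plan and an unrelated value to the same file, and the opener could still look up only the type it cares about, for example with ext.get::<AccessPlan>().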
Describe alternatives you've considered
No response
Additional context
No response