Extract name mapping from table properties #2519

Open: prakharjain09 wants to merge 4 commits into apache:main from prakharjain09:extract-name-mapping-from-table-properties

@prakharjain09
Which issue does this PR close?

Closes #2518

What changes are included in this PR?

Parses the schema.name-mapping.default table property once during
TableScanBuilder::build and threads the resulting Arc<NameMapping>
through PlanContext -> ManifestFileContext -> ManifestEntryContext so it
lands on FileScanTask instead of always being None.
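
A minimal sketch of the parse-once-and-thread pattern described above. Every type here is a simplified, hypothetical stand-in for the iceberg-rust type of the same name, and parse_name_mapping is a placeholder check rather than real JSON deserialization; the helper names (build_plan_context, manifest_file_context, entry_context, into_file_scan_task, plan_task) are assumptions made for illustration, not the crate's API.

```rust
use std::collections::HashMap;
use std::sync::Arc;

// Hypothetical stand-ins: the real iceberg-rust types carry far more state.
#[derive(Debug, PartialEq)]
struct NameMapping(String); // placeholder: just keeps the raw property text

#[derive(Debug, PartialEq)]
enum ErrorKind {
    DataInvalid,
}

struct PlanContext {
    name_mapping: Option<Arc<NameMapping>>,
}
struct ManifestFileContext {
    name_mapping: Option<Arc<NameMapping>>,
}
struct ManifestEntryContext {
    name_mapping: Option<Arc<NameMapping>>,
}
struct FileScanTask {
    name_mapping: Option<Arc<NameMapping>>,
}

const NAME_MAPPING_PROP: &str = "schema.name-mapping.default";

// Placeholder parser (assumption): a real implementation would deserialize
// the JSON. Here anything that is not bracketed like a JSON array is rejected.
fn parse_name_mapping(raw: &str) -> Result<NameMapping, ErrorKind> {
    let trimmed = raw.trim();
    if trimmed.starts_with('[') && trimmed.ends_with(']') {
        Ok(NameMapping(trimmed.to_string()))
    } else {
        Err(ErrorKind::DataInvalid)
    }
}

// Parse the property exactly once. An absent property yields None; a
// malformed one is returned as an error instead of being dropped.
fn build_plan_context(props: &HashMap<String, String>) -> Result<PlanContext, ErrorKind> {
    let name_mapping = props
        .get(NAME_MAPPING_PROP)
        .map(|raw| parse_name_mapping(raw))
        .transpose()?
        .map(Arc::new);
    Ok(PlanContext { name_mapping })
}

impl PlanContext {
    fn manifest_file_context(&self) -> ManifestFileContext {
        // Cloning Option<Arc<_>> only bumps a reference count.
        ManifestFileContext { name_mapping: self.name_mapping.clone() }
    }
}

impl ManifestFileContext {
    fn entry_context(&self) -> ManifestEntryContext {
        ManifestEntryContext { name_mapping: self.name_mapping.clone() }
    }
}

impl ManifestEntryContext {
    fn into_file_scan_task(self) -> FileScanTask {
        FileScanTask { name_mapping: self.name_mapping }
    }
}

// Walk one task through the whole chain of contexts.
fn plan_task(props: &HashMap<String, String>) -> Result<FileScanTask, ErrorKind> {
    let plan = build_plan_context(props)?;
    Ok(plan.manifest_file_context().entry_context().into_file_scan_task())
}
```

Because the mapping sits behind an Arc, each context and task shares the single parsed value rather than re-parsing or deep-copying it per manifest entry.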

This unblocks readers that rely on the name mapping to resolve field IDs for Parquet files lacking
field IDs (or with conflicting IDs). Malformed JSON in the property surfaces
as ErrorKind::DataInvalid from build() rather than being silently
dropped.
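
To illustrate why a reader needs the mapping, here is a rough sketch of resolving a column name to a field ID when the Parquet file itself carries none. MappedField is a simplified, hypothetical model of one name-mapping entry (a field ID, its accepted names, and nested fields), and resolve_field_id and sample_mapping are illustrative helpers, not functions from iceberg-rust.

```rust
// Hypothetical, simplified model of one entry in a name mapping. In JSON an
// entry looks roughly like {"field-id": 1, "names": ["id", "record_id"]},
// optionally with a nested "fields" list.
#[derive(Debug, Clone)]
struct MappedField {
    field_id: Option<i32>,
    names: Vec<String>,
    fields: Vec<MappedField>,
}

// Resolve a path of column names (outermost first) to a field ID by matching
// each path segment against the accepted names at that nesting level.
fn resolve_field_id(mapping: &[MappedField], path: &[&str]) -> Option<i32> {
    let (head, rest) = path.split_first()?;
    let field = mapping
        .iter()
        .find(|f| f.names.iter().any(|n| n.as_str() == *head))?;
    if rest.is_empty() {
        field.field_id
    } else {
        resolve_field_id(&field.fields, rest)
    }
}

// Example mapping with one aliased top-level column and one nested struct.
fn sample_mapping() -> Vec<MappedField> {
    vec![
        MappedField {
            field_id: Some(1),
            names: vec!["id".into(), "record_id".into()],
            fields: vec![],
        },
        MappedField {
            field_id: Some(2),
            names: vec!["location".into()],
            fields: vec![MappedField {
                field_id: Some(3),
                names: vec!["latitude".into(), "lat".into()],
                fields: vec![],
            }],
        },
    ]
}
```

With this sample, both "id" and its alias "record_id" resolve to field 1, and the nested path location -> lat resolves to field 3; a name absent from the mapping resolves to nothing.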

Removes the corresponding TODO in crates/iceberg/src/scan/context.rs.

Are these changes tested?

Three new unit tests in scan::tests:

  • test_table_scan_without_name_mapping_property — absent property leaves
    plan_context.name_mapping as None.
  • test_table_scan_with_name_mapping_property — valid JSON parses into the
    expected NameMapping fields on PlanContext.
  • test_table_scan_with_malformed_name_mapping_property — invalid JSON
    returns ErrorKind::DataInvalid from build().

All scan:: tests (37) pass.

Parse the schema.name-mapping.default table property once during
TableScanBuilder::build and thread the resulting Arc<NameMapping>
through PlanContext -> ManifestFileContext -> ManifestEntryContext
so it lands on FileScanTask instead of always being None.

This unblocks readers that rely on name mapping to resolve field IDs
for Parquet files lacking field IDs (or with conflicting IDs).
Malformed JSON in the property surfaces as ErrorKind::DataInvalid
rather than silently dropping the mapping.

Adds three tests around the new schema.name-mapping.default plumbing:
- absent property leaves plan_context.name_mapping as None
- well-formed JSON parses into NameMapping fields on PlanContext
- malformed JSON surfaces as ErrorKind::DataInvalid from build()

prakharjain09 force-pushed the extract-name-mapping-from-table-properties branch from f471b08 to 964d427 on May 27, 2026 00:51

Without .runtime(test_runtime()), TableBuilder::build() fails with
"Runtime must be provided with TableBuilder.runtime()", matching the
pattern already used by TableTestFixture::new() and friends.

Merge the new use crate::ErrorKind with use crate::TableIdent into a
single use crate::{ErrorKind, TableIdent} so cargo fmt --check passes.
Development

Successfully merging this pull request may close these issues.

Propagate schema.name-mapping.default from table metadata into FileScanTask