Describe the bug
CommitsImpl in the NoSQL persistence implementation traverses commit history to honor an offset, with no safeguard on how much history it will walk. The code has TODOs about adding such a safeguard, but none is implemented. For large histories or pathological offset values, this can lead to unbounded work and to performance and resource problems.
To Reproduce
Use the NoSQL persistence implementation with a large commit log.
Call an API that ultimately uses CommitsImpl with an offset far back in history.
CommitsImpl walks the commit chain from head toward the offset with no traversal limit, making the operation potentially very expensive.
Expected Behavior
Commit history traversal for an offset should be bounded (e.g., via a configurable maximum traversal limit). If the limit is exceeded before finding the requested offset, the operation should fail fast with a clear error/status instead of continuing unbounded work.
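To make the expected behavior concrete, here is a minimal sketch of a bounded walk. It is not the CommitsImpl code: the class name, the `stepsToOffset` method, the `maxSteps` parameter, and the exception type are all hypothetical, and an in-memory map of commit ID to parent ID stands in for the real persistence layer.

```java
import java.util.Map;
import java.util.OptionalInt;

/** Hypothetical sketch; none of these names come from CommitsImpl. */
public class BoundedOffsetLookup {

  /** Hypothetical error raised when the walk hits the limit before finding the offset. */
  static final class TraversalLimitExceededException extends RuntimeException {
    TraversalLimitExceededException(long offsetId, int maxSteps) {
      super("Commit " + offsetId + " not found within " + maxSteps + " steps from head");
    }
  }

  /**
   * Walks from headId toward older commits until offsetId is found.
   *
   * @param parentOf assumed stand-in for the commit log: commit ID to parent ID;
   *     a commit without an entry has no parent
   * @param maxSteps assumed configurable limit on parent links followed
   * @return the number of parent links followed, or empty if the chain ends first
   */
  static OptionalInt stepsToOffset(
      Map<Long, Long> parentOf, long headId, long offsetId, int maxSteps) {
    Long current = headId;
    int steps = 0;
    while (current != null) {
      if (current == offsetId) {
        return OptionalInt.of(steps);
      }
      // Fail fast instead of following another parent link past the limit.
      if (steps >= maxSteps) {
        throw new TraversalLimitExceededException(offsetId, maxSteps);
      }
      current = parentOf.get(current);
      steps++;
    }
    // Reached the start of the history without seeing the offset.
    return OptionalInt.empty();
  }

  public static void main(String[] args) {
    // Linear history 5 -> 4 -> 3 -> 2 -> 1, with 5 as head.
    Map<Long, Long> parentOf = Map.of(5L, 4L, 4L, 3L, 3L, 2L, 2L, 1L);
    System.out.println(stepsToOffset(parentOf, 5L, 3L, 10));
    try {
      stepsToOffset(parentOf, 5L, 1L, 2);
    } catch (TraversalLimitExceededException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Whether "limit exceeded" should surface as an exception or as a status value, and whether an offset that is simply absent from the history should be treated differently from one beyond the limit, are design choices left open here.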
Additional context
File: persistence/nosql/persistence/impl/src/main/java/org/apache/polaris/persistence/nosql/impl/commits/CommitsImpl.java
There are TODOs: // TODO add safeguard to limit the work done when finding the commit with ID 'offset'.
I’d like to work on this by adding a traversal limit for offset lookup and tests for both normal and “limit exceeded” behavior.
System information
OS: macOS
Polaris Catalog Version: current main
Object storage & setup: Not specific; this concerns the NoSQL commit traversal logic.