Add DoFnRunner::finishKey() method #38454

Draft
arunpandianp wants to merge 8 commits into apache:master from arunpandianp:multikey2

@arunpandianp
Contributor

In upcoming changes, a single Beam bundle will contain multiple Dataflow streaming work items.
With multiple work items, we have to process the elements and timers of each work item
before moving on to the next work item.

The new finishKey() method allows DoFnRunners to clean up or persist state
that should not be carried over before switching work items
in multi-key bundles.

Dataflow streaming SideInputDoFnRunners are the only classes using
finishKey() right now.

finishKey() is not exposed to DoFns and is not visible in user APIs.
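
A minimal sketch of the lifecycle described above, assuming a simplified runner interface. `SketchRunner`, `BufferingRunner`, and `runBundle` are hypothetical names used only for illustration; they are not the Beam `DoFnRunner` API, and the real finishKey() signature may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FinishKeySketch {
  /** Hypothetical, simplified stand-in for a DoFnRunner; not Beam's real interface. */
  public interface SketchRunner<T> {
    void startBundle();

    void processElement(T element);

    /** Assumed hook: called once per work item before the next one starts. No-op by default. */
    default void finishKey() {}

    void finishBundle();
  }

  /** Hypothetical runner holding per-key state that must not leak into the next key. */
  public static class BufferingRunner implements SketchRunner<String> {
    final List<String> log = new ArrayList<>();
    private final List<String> perKeyBuffer = new ArrayList<>();

    @Override
    public void startBundle() {
      log.add("startBundle");
    }

    @Override
    public void processElement(String element) {
      perKeyBuffer.add(element);
      log.add("process " + element);
    }

    @Override
    public void finishKey() {
      // Persist (here: just log) the per-key state, then clear it before the next work item.
      log.add("finishKey persisted " + perKeyBuffer);
      perKeyBuffer.clear();
    }

    @Override
    public void finishBundle() {
      log.add("finishBundle");
    }
  }

  /** Drives one bundle made of several work items, calling finishKey() between them. */
  public static List<String> runBundle(Map<String, List<String>> workItems) {
    BufferingRunner runner = new BufferingRunner();
    runner.startBundle();
    for (List<String> elements : workItems.values()) {
      for (String element : elements) {
        runner.processElement(element);
      }
      runner.finishKey();
    }
    runner.finishBundle();
    return runner.log;
  }

  public static void main(String[] args) {
    Map<String, List<String>> workItems = new LinkedHashMap<>();
    workItems.put("k1", Arrays.asList("a", "b"));
    workItems.put("k2", Arrays.asList("c"));
    runBundle(workItems).forEach(System.out::println);
  }
}
```

Because the hook defaults to a no-op in this sketch, runners without per-key state would not need to change; only runners that buffer state per work item override it.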

This is on top of #38430

Move the processTimers() call from finish() to finishKey().

In upcoming changes, a single Beam bundle will contain multiple
streaming work items. With multiple work items, we have to process
the elements and timers of each work item before moving on to the next
work item.

finishKey() will be called by the NativeIterator classes after iterating
through all elements from a work item.

Batch mode processes timers in BatchModeUngroupingParDoFn and does not rely
on the processTimers() call in ParDoOperation::finish(), so removing that
call from ParDoOperation::finish() is safe. Batch mode also does not use the
new finishKey() method.
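
The ordering this commit aims for can be sketched as follows, assuming a simplified operation that fires timers per work item. `WorkItem`, `SketchOperation`, `startKey`, and `run` are hypothetical names for illustration only; they are not the Dataflow worker's ParDoOperation or NativeIterator classes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class TimerOrderingSketch {
  /** Hypothetical work item: the elements and pending timers of one key. */
  public static class WorkItem {
    final List<String> elements;
    final List<String> timers;

    public WorkItem(List<String> elements, List<String> timers) {
      this.elements = elements;
      this.timers = timers;
    }
  }

  /** Hypothetical stand-in for a ParDo-style operation. */
  public static class SketchOperation {
    final List<String> log = new ArrayList<>();
    private final List<String> pendingTimers = new ArrayList<>();

    /** Assumed setup step: load the timers that belong to the current work item. */
    void startKey(WorkItem item) {
      pendingTimers.addAll(item.timers);
    }

    void process(String element) {
      log.add("element " + element);
    }

    /** Fires the current work item's timers; in this sketch finish() no longer does so. */
    void finishKey() {
      for (String timer : pendingTimers) {
        log.add("timer " + timer);
      }
      pendingTimers.clear();
    }

    void finish() {
      log.add("finish");
    }
  }

  /** Plays the role of the iterator: finishKey() after each work item's elements. */
  public static List<String> run(List<WorkItem> bundle) {
    SketchOperation op = new SketchOperation();
    for (WorkItem item : bundle) {
      op.startKey(item);
      for (String element : item.elements) {
        op.process(element);
      }
      op.finishKey();
    }
    op.finish();
    return op.log;
  }

  public static void main(String[] args) {
    List<WorkItem> bundle = Arrays.asList(
        new WorkItem(Arrays.asList("a"), Arrays.asList("t1")),
        new WorkItem(Arrays.asList("b"), Arrays.asList("t2")));
    run(bundle).forEach(System.out::println);
  }
}
```

In this sketch each work item's timers fire before the next work item's elements are processed, which would not hold if timers were only processed once in finish().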