Split task adoption into two transactions in task adoption #64076

Open
jscheffl wants to merge 1 commit into apache:main from jscheffl:bugfix/split-work-in-two-transactions-for-task-adoption

@jscheffl
Contributor

We see long-lasting locks in our production environment. Following the (still somewhat open) discussion in Slack (https://apache-airflow.slack.com/archives/CCQ7EGB1P/p1774011646300199), we suspect that the two steps of adopt_or_reset_orphaned_tasks(), (1) marking tasks and (2) processing tasks, may be causing a long-lasting lock on the job table.
The first statement "just" marks all tasks that are missing heartbeats; the tasks without heartbeats are then looped through to handle the failure. Because the second step can take a long time, rows in the job table stay locked and other processes may queue up behind them, especially if multiple schedulers run in parallel.
I therefore propose adding a commit() between the two steps, because (1) the two parts are logically independent and (2) any lock taken by the first step is released immediately.

However, this might conflict with the existing handling and retry of PSQL OperationalError...
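
To illustrate the idea outside of Airflow, here is a minimal sketch of the "commit between the two steps" pattern. It is not the code in this PR: it uses the standard-library `sqlite3` module instead of Airflow's SQLAlchemy session, and the `job` and `task` tables, their columns, and the helper names `mark_stale_jobs`, `reset_orphaned_tasks`, and `build_demo_db` are all hypothetical.

```python
import sqlite3


def build_demo_db():
    """Create a tiny in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE job (id INTEGER PRIMARY KEY, state TEXT, latest_heartbeat INTEGER);
        CREATE TABLE task (id INTEGER PRIMARY KEY, job_id INTEGER, state TEXT);
        INSERT INTO job VALUES (1, 'running', 50), (2, 'running', 150);
        INSERT INTO task VALUES (10, 1, 'running'), (11, 2, 'running');
        """
    )
    return conn


def mark_stale_jobs(conn, cutoff):
    """Step 1: mark rows with a stale heartbeat, then commit right away."""
    cur = conn.execute(
        "UPDATE job SET state = 'failed' "
        "WHERE state = 'running' AND latest_heartbeat < ?",
        (cutoff,),
    )
    # Ending the transaction here means the potentially slow second step
    # does not run while the rows touched above are still locked.
    conn.commit()
    return cur.rowcount


def reset_orphaned_tasks(conn):
    """Step 2: process the affected tasks in a second, separate transaction."""
    rows = conn.execute(
        "SELECT t.id FROM task t JOIN job j ON j.id = t.job_id "
        "WHERE j.state = 'failed' AND t.state = 'running'"
    ).fetchall()
    for (task_id,) in rows:
        # Stand-in for the per-task handling, which may take a long time.
        conn.execute("UPDATE task SET state = NULL WHERE id = ?", (task_id,))
    conn.commit()
    return [task_id for (task_id,) in rows]
```

Calling `mark_stale_jobs(conn, cutoff=100)` followed by `reset_orphaned_tasks(conn)` runs the two steps as two transactions. One consequence of the split is that a failure in the second step no longer rolls back the first, which is where the interaction with retry handling mentioned above would need a closer look.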


Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

  • Read the Pull Request Guidelines for more information. Note: commit author/co-author name and email in commits become permanently public when merged.
  • For fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
  • When adding dependency, check compliance with the ASF 3rd Party License Policy.
  • For significant user-facing changes create newsfragment: {pr_number}.significant.rst, in airflow-core/newsfragments. You can add this file in a follow-up commit after the PR is created so you know the PR number.

@jscheffl jscheffl requested review from XD-DENG and ashb as code owners March 22, 2026 19:08
@boring-cyborg boring-cyborg bot added the area:Scheduler including HA (high availability) scheduler label Mar 22, 2026
@jscheffl jscheffl requested review from kaxil and uranusjr March 22, 2026 21:36