Skip to content

feat: Add dagrun status progress to backfills#64089

Closed
shivaam wants to merge 2 commits intoapache:mainfrom
shivaam:claude/backfill-dagrun-status-TpU3W
Closed

feat: Add dagrun status progress to backfills#64089
shivaam wants to merge 2 commits intoapache:mainfrom
shivaam:claude/backfill-dagrun-status-TpU3W

Conversation

@shivaam
Copy link
Copy Markdown
Contributor

@shivaam shivaam commented Mar 23, 2026

Adds progress tracking to the backfill UI — the banner and Backfills list page now
show a progress bar (green for success, red for failed, gray for remaining) with a
completion count (e.g., "3/6"). Completed backfills show "Completed" text.

image

Changes

Backend:

  • Added num_runs and dag_run_state_counts to BackfillResponse (Open question: I think it might be better to create a new API instead of changing the existing API. See open questions below as well.
  • Single enrichment query (JOIN backfill_dag_run + dag_run, GROUP BY state)
  • num_runs derived by summing state counts (inner join excludes NULL dag_run_id rows)
  • All backfill endpoints enriched (list, get, pause, unpause, cancel)

Frontend:

  • New BackfillProgressBar shared component (used by both banner and table)
  • Completed backfills → "Completed" text, active → progress bar

Tests:

  • test_get_backfill_with_dag_run_state_counts covering mixed states
  • All existing assertions updated for new fields

Open design questions — requesting reviewer input

Should this be a new API endpoint instead of enriching BackfillResponse?

I considered a new APUGET /backfills/{id}/dagRuns returning the linked dag runs with state,
letting the UI compute counts client-side. Reasons it may be better:

  • Keeps BackfillResponse stable (no new fields on the public v2 API)
  • Enables linking individual dag runs to backfills in the UI (not possible today)
  • Supports future CLI airflow backfill status use case
  • UI can poll progress independently of backfill metadata

The current approach (enriching BackfillResponse) was simpler to ship but couples
progress data to every backfill fetch. Happy to refactor if reviewers prefer a
separate endpoint.

Other questions for reviewers:

  1. Should num_runs include skipped BackfillDagRun rows? Currently it only
    counts rows with an actual DagRun (inner join). Skipped slots are excluded as they will never run so counting them in completion does not make sense.

  2. Should completed backfills show final stats? Currently shows "Completed" text
    and hides the bar. A backfill that finished 95 success + 5 failed looks the same
    as 100 success. Showing the final breakdown would be more informative but the
    underlying data has a limitation: when a newer backfill reprocesses the same dates,
    DagRun.backfill_id gets reassigned, so old backfills lose their state counts.
    "Completed" text avoids exposing this stale data.

  3. Should running/queued be visually distinct? The bar currently has three
    segments: green (success), red (failed), gray (remaining). Running and queued are
    lumped into gray. Splitting them out adds information but also visual complexity. I tried that but the UI becomes harder to read.


Was generative AI tooling used to co-author this PR?
  • Yes — Claude Code (claude-opus-4-6)

Generated-by: Claude Code (claude-opus-4-6) following the guidelines

@boring-cyborg boring-cyborg bot added area:airflow-ctl area:API Airflow's REST/HTTP API area:translations area:UI Related to UI/UX. For Frontend Developers. translation:default labels Mar 23, 2026
updated_at: datetime
dag_display_name: str = Field(validation_alias=AliasPath("dag_model", "dag_display_name"))
num_runs: int = 0
dag_run_state_counts: dict[str, int] = {}
Copy link
Copy Markdown
Member

@uranusjr uranusjr Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This probably should not be = {}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! I used = {} to match the existing pattern in the same file Happy to change if you'd prefer a different approach, did you have something specific in mind?

Also curious about your thoughts on the broader design question: would you prefer these fields on a separate endpoint (similar to dagStats) rather than enriching BackfillResponse? I am thinking I should create a new API that lists all the dagRuns associated with a backfill instead

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess it depends on whether other information (on dag runs) would be potentially useful. Attaching them on the same model is better if it’s just the count and state. (Not necessarily enriching BackfillResponse directly though; arguably this should be a different Pydantic model.)

@kaxil kaxil requested a review from Copilot April 2, 2026 00:43
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds aggregated DagRun-state progress data to backfills and displays it in the UI (banner and backfills list) as a segmented progress bar / completion indicator.

Changes:

  • Extend BackfillResponse with num_runs and dag_run_state_counts and enrich multiple backfill endpoints via a grouped aggregate query.
  • Add a reusable BackfillProgressBar component and wire it into the Dag Backfills page and the backfill banner.
  • Update OpenAPI-generated artifacts and adjust/add unit tests (including query-count assertions).

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

Show a summary per file
File Description
airflow-ctl/src/airflowctl/api/datamodels/generated.py Updates generated client-side Pydantic models to include new backfill fields.
airflow-core/src/airflow/api_fastapi/core_api/datamodels/backfills.py Adds new response fields to the Backfill Pydantic model.
airflow-core/src/airflow/api_fastapi/core_api/routes/public/backfills.py Adds _enrich_backfill_responses() and applies it to public backfill endpoints.
airflow-core/src/airflow/api_fastapi/core_api/routes/ui/backfills.py Applies enrichment to UI backfill listing endpoint.
airflow-core/src/airflow/api_fastapi/core_api/openapi/v2-rest-api-generated.yaml Updates generated OpenAPI schema for BackfillResponse.
airflow-core/src/airflow/api_fastapi/core_api/openapi/_private_ui.yaml Updates generated UI OpenAPI schema for BackfillResponse.
airflow-core/src/airflow/ui/openapi-gen/requests/types.gen.ts Updates generated TS types for the new backfill response fields.
airflow-core/src/airflow/ui/openapi-gen/requests/schemas.gen.ts Updates generated TS schema metadata for the new fields.
airflow-core/src/airflow/ui/src/components/BackfillProgressBar.tsx Introduces the segmented progress bar UI component.
airflow-core/src/airflow/ui/src/components/Banner/BackfillBanner.tsx Replaces indeterminate progress with backfill progress segments/count when available.
airflow-core/src/airflow/ui/src/pages/Dag/Backfills/Backfills.tsx Adds a “Progress” column rendering progress or “Completed”.
airflow-core/src/airflow/ui/public/i18n/locales/en/common.json Adds completed and table.progress i18n strings.
airflow-core/tests/unit/api_fastapi/core_api/routes/public/test_backfills.py Updates expected responses; adds a test for mixed DagRun state counts.
airflow-core/tests/unit/api_fastapi/core_api/routes/ui/test_backfills.py Updates expected responses and query-count assertions for UI backfills listing.

from airflow.api_fastapi.core_api.openapi.exceptions import (
create_openapi_http_exception_doc,
)
from airflow.api_fastapi.core_api.routes.public.backfills import _enrich_backfill_responses
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The UI router is importing _enrich_backfill_responses from the public router module. Sharing logic via a private helper in another router creates a brittle cross-route dependency (and makes refactors/routing changes riskier). Consider moving the enrichment function into a shared module (e.g. a services/backfills.py or api_fastapi/core_api/common/backfills.py) and importing it from both routers.

Copilot uses AI. Check for mistakes.
Comment on lines +27 to +33
export const BackfillProgressBar = ({ stateCounts, total, trackColor = "bg.emphasized" }: Props) => {
const successCount = stateCounts.success ?? 0;
const failedCount = stateCounts.failed ?? 0;
const successPct = (successCount / total) * 100;
const failedPct = (failedCount / total) * 100;
const remainingPct = 100 - successPct - failedPct;

Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

successPct/failedPct divide by total without guarding against total === 0. Current call sites check for total === 0, but this component is exported and could be reused elsewhere; with total=0 it will yield Infinity/NaN widths. Add an internal guard (e.g., treat total <= 0 as 0%/0%/100% and render 0/0).

Copilot uses AI. Check for mistakes.
type: integer
type: object
title: Dag Run State Counts
default: {}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR updates v2-rest-api-generated.yaml directly. In Airflow these OpenAPI specs are generated artifacts; please ensure they’re produced via the repo’s OpenAPI generation workflow rather than hand-edited, so the change is reproducible and stays in sync with the source models.

Suggested change
default: {}

Copilot uses AI. Check for mistakes.
dag_display_name: string;
num_runs?: number;
dag_run_state_counts?: {
[key: string]: (number);
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file is under openapi-gen/ and appears to be generated. Please confirm it was regenerated using the project’s OpenAPI client generation tooling (not manually edited), and consider documenting the regen command in the PR description if it’s not already included.

Suggested change
[key: string]: (number);
[key: string]: number;

Copilot uses AI. Check for mistakes.
Comment on lines +1486 to +1495
num_runs:
type: integer
title: Num Runs
default: 0
dag_run_state_counts:
additionalProperties:
type: integer
type: object
title: Dag Run State Counts
default: {}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR updates _private_ui.yaml (an OpenAPI artifact) directly. Please ensure this file is regenerated via the repo’s OpenAPI generation process rather than hand-edited, to keep generated specs consistent and reproducible.

Copilot uses AI. Check for mistakes.
Comment on lines +1138 to +1139
num_runs: Annotated[int | None, Field(title="Num Runs")] = 0
dag_run_state_counts: Annotated[dict[str, int] | None, Field(title="Dag Run State Counts")] = {}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this generated model, num_runs/dag_run_state_counts are typed as optional but default to non-None values, and dag_run_state_counts uses a mutable default {}. This pattern is repeated elsewhere in this file (e.g. dag_run_conf = {}), and can lead to confusing typing and shared mutable state. If the generator can be configured, prefer generating non-optional fields with default_factory for dict/list defaults to avoid these issues.

Copilot uses AI. Check for mistakes.
Comment on lines +56 to +58
dag_display_name: str = Field(validation_alias=AliasPath("dag_model", "dag_display_name"))
num_runs: int = 0
dag_run_state_counts: dict[str, int] = {}
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dag_run_state_counts is initialized with a mutable default {}. This can lead to shared state between BackfillResponse instances if the dict is ever mutated. Use Field(default_factory=dict) for this field (and similarly consider BackfillPostBody.dag_run_conf: dict = {} which has the same issue).

Copilot uses AI. Check for mistakes.
Comment on lines +75 to +92
rows = session.execute(
select(
BackfillDagRun.backfill_id,
DagRun.state,
func.count().label("count"),
)
.join(DagRun, BackfillDagRun.dag_run_id == DagRun.id)
.where(
BackfillDagRun.backfill_id.in_(ids),
DagRun.backfill_id == BackfillDagRun.backfill_id,
)
.group_by(BackfillDagRun.backfill_id, DagRun.state)
).all()
counts: dict[int, dict[str, int]] = defaultdict(dict)
num_runs: dict[int, int] = defaultdict(int)
for backfill_id, state, count in rows:
counts[backfill_id][state] = count
num_runs[backfill_id] += count
Copy link

Copilot AI Apr 2, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DagRun.state is nullable (stored in _state with nullable=True), so this aggregation can produce a NULL state group. That would populate dag_run_state_counts with a None key (and potentially break JSON serialization or UI expectations). Consider filtering out NULL states (DagRun.state.is_not(None)) or mapping them to a stable string using func.coalesce in the SELECT.

Copilot uses AI. Check for mistakes.
@shivaam
Copy link
Copy Markdown
Contributor Author

shivaam commented Apr 9, 2026

Closing in favor of a new approach: a dedicated GET /backfills/{backfill_id}/dag_runs endpoint (see #46250 for discussion). The enrichment-on-BackfillResponse approach worked but both @uranusjr and @dstandish's feedback pointed toward a separate endpoint as the cleaner design.

Backup of the full work (including UI progress bar) is at shivaam/airflow:backup/backfill-dagrun-status-full.

@shivaam shivaam closed this Apr 9, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:airflow-ctl area:API Airflow's REST/HTTP API area:translations area:UI Related to UI/UX. For Frontend Developers. translation:default

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants