Harden workflow GitHub context handling by enyst · Pull Request #714 · OpenHands/benchmarks

enyst · 2026-05-24T00:13:49Z

Why

Part of the cross-repo fix for OpenHands/software-agent-sdk#3371. Workflow run: blocks and composite actions should not interpolate attacker-influenced workflow inputs or GitHub event context directly into shell scripts.

Summary

Moves benchmark workflow inputs, user-provided refs/commits, and PR head SHA values through env: before shell use where they enter shell scripts.
Updates benchmark image workflows and the build-select-file composite action to use shell variables instead of direct ${{ ... }} interpolation.
Quotes SDK commit fetch arguments after moving them through environment variables.

Issue Number

Fixes OpenHands/software-agent-sdk#3371

How to Test

python + PyYAML validation over all changed workflow/action YAML files across the audited repositories: validated changed yaml files: 33
Repository scanner confirmed: remaining suspicious run blocks: 0
git diff --check across all audited repositories

Notes

This PR was created by an AI agent (OpenHands) on behalf of the user.

@enyst can click here to continue refining the PR

Pass attacker-controllable GitHub context and workflow values through environment variables before shell use. Co-authored-by: openhands <openhands@all-hands.dev>

Co-authored-by: openhands <openhands@all-hands.dev>

all-hands-bot

🟢 Good taste - Clean security hardening that consistently moves workflow inputs through environment variables before shell use.

Pattern Observation: All workflow inputs.* values are correctly moved through env: blocks before shell interpolation, and git arguments are properly quoted. The PR description mentions moving "labels" through env, but github.event.label.name is still directly interpolated in label-triggered steps (e.g., build-swebench-images.yml:169). Since that code wasn't modified in this PR and uses string comparison (which is safe), it's not a blocker—but worth noting for future hardening passes if the goal is comprehensive event context sanitization.

[RISK ASSESSMENT]

[Overall PR] ⚠️ Risk Assessment: 🟢 LOW

This is a security improvement that reduces injection attack surface. The pattern is applied consistently across 10 workflow files, moving all user-controlled workflow inputs through env: blocks before shell interpolation. The changes are backward-compatible (same values, safer routing) and have been validated with repository scanner (0 suspicious blocks remaining).

VERDICT:
✅ Worth merging: Solid security hardening with consistent pattern application

KEY INSIGHT:
This PR correctly implements defense-in-depth by preventing direct interpolation of workflow inputs into shell contexts, following the principle that untrusted data should pass through sanitization boundaries (env vars) before use.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/benchmarks/actions/runs/26371146890

all-hands-bot

🟡 Acceptable with gaps - Solid security hardening, but incomplete coverage leaves injection vectors open.

Pattern Observation: All workflow inputs.* values and SDK commit references are correctly moved through env: blocks before shell use, which eliminates the primary injection risk from user-controlled workflow dispatch parameters. ✓

Security Gap: Six workflow files still directly interpolate github.event.label.name into shell scripts without environment variable indirection. Label names are attacker-controlled in pull_request_target workflows (an external contributor can create a PR, add a malicious label, and trigger code execution). This contradicts the PR description's claim of comprehensive GitHub context hardening.

Defense-in-Depth Opportunity: PR head SHA values (github.event.pull_request.head.sha) remain directly interpolated in "Determine checkout ref" steps. While SHAs are hex-only (lower risk), consistency with the env-block pattern would strengthen defense-in-depth.

[CRITICAL ISSUES] (Must fix before merge)

[Multiple files, ~line 169 each] 🔴 Security: github.event.label.name directly interpolated into shell scripts in 6 workflows. An attacker can craft malicious label names containing shell metacharacters to inject commands in pull_request_target context. Move through env block:

# Current (vulnerable):
run: |
  LABEL_NAME="${{ github.event.label.name }}"

# Should be:
env:
  GITHUB_EVENT_LABEL_NAME: ${{ github.event.label.name }}
run: |
  LABEL_NAME="$GITHUB_EVENT_LABEL_NAME"

Affected files:

.github/workflows/build-swebench-images.yml
.github/workflows/build-swebenchmultimodal-images.yml
.github/workflows/build-multiswebench-images.yml
.github/workflows/build-swegym-images.yml
.github/workflows/build-swesmith-images.yml
.github/workflows/build-swebenchpro-images.yml

[IMPROVEMENT OPPORTUNITIES] (Should fix for defense-in-depth)

[All "Determine checkout ref" steps] 🟡 Consistency: github.event.pull_request.head.sha still directly interpolated in conditional checks and echo statements. While hex-only SHAs have minimal injection risk, moving through env would maintain consistent defense-in-depth with the rest of the PR.

[TESTING GAPS]

PR description mentions validation ("remaining suspicious run blocks: 0") but the scanner clearly missed the github.event.label.name interpolations. Consider:

Extending the scanner regex to catch event context beyond just inputs.*
Manual audit of all ${{ github.event.* }} patterns in shell contexts

[RISK ASSESSMENT]

⚠️ Risk Assessment: 🔴 HIGH (before fixes) → 🟢 LOW (after fixes)

Current risk factors:

Incomplete fix leaves exploitable injection vector in pull_request_target workflows where external contributors can trigger arbitrary code execution via crafted label names
The gap contradicts the PR description, suggesting validation may have missed other edge cases

After addressing label injection:

Changes are mechanical and well-validated for the covered paths
Pattern is consistent and correct where applied
CI validation provides confidence in non-breaking nature

Recommendation: Block merge until label handling is fixed. The identified injection vector in pull_request_target is a genuine security risk.

VERDICT:
❌ Needs rework: Critical injection vector remains unfixed

KEY INSIGHT:
Partial security fixes can be worse than no fix - they create false confidence. The scanner validated "remaining suspicious run blocks: 0" but missed attacker-controlled github.event.* context, leaving exploitable injection in the highest-risk pull_request_target workflows.

Improve this review? If any feedback above seems incorrect or irrelevant to this repository, you can teach the reviewer to do better:

Add a .agents/skills/custom-codereview-guide.md file to your branch (or edit it if one already exists) with the /codereview trigger and the context the reviewer is missing (e.g., "Security concerns about X do not apply here because Y"). See the customization docs for the required frontmatter format.

Re-request a review - the reviewer reads guidelines from the PR branch, so your changes take effect immediately.

When your PR is merged, the guideline file goes through normal code review by repository maintainers.

Resolve with AI? Install the iterate skill in your agent and run /iterate to automatically drive this PR through CI, review, and QA until it's merge-ready.

Was this review helpful? React with 👍 or 👎 to give feedback.

Was this automated review useful? React with 👍 or 👎 to this review to help us measure review quality.
Workflow run: https://github.com/OpenHands/benchmarks/actions/runs/26371415822

Dismissing invalid automated review: repository label names in these pull_request_target workflows are repository-controlled, not attacker-controllable by external PR authors. The PR intentionally keeps direct interpolation for GitHub-controlled/non-attacker-controlled values per the latest issue guidance. This dismissal was created by an AI agent (OpenHands) on behalf of the user.

Harden workflow GitHub context handling

686584f

Pass attacker-controllable GitHub context and workflow values through environment variables before shell use. Co-authored-by: openhands <openhands@all-hands.dev>

enyst marked this pull request as ready for review May 24, 2026 00:14

enyst and others added 2 commits May 24, 2026 00:24

Fix benchmark workflow env blocks

ca28aff

Co-authored-by: openhands <openhands@all-hands.dev>

Simplify workflow hardening scope

1c545fc

Co-authored-by: openhands <openhands@all-hands.dev>

enyst added the review-this label May 24, 2026 — with OpenHands AI

Keep GitHub-controlled event names direct

379cc74

Co-authored-by: openhands <openhands@all-hands.dev>

enyst removed the review-this label May 24, 2026

enyst added the review-this label May 24, 2026 — with OpenHands AI

all-hands-bot reviewed May 24, 2026

View reviewed changes

enyst removed the review-this label May 24, 2026

enyst added the review-this label May 24, 2026 — with OpenHands AI

all-hands-bot previously requested changes May 24, 2026

View reviewed changes

enyst closed this May 27, 2026

all-hands-bot mentioned this pull request May 27, 2026

chore: remove PR review GitHub Actions workflow #717

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Harden workflow GitHub context handling#714

Harden workflow GitHub context handling#714
enyst wants to merge 4 commits into
mainfrom
openhands/issue-3371-workflow-env

enyst commented May 24, 2026 •

edited

Loading

Uh oh!

all-hands-bot left a comment

Uh oh!

all-hands-bot left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

enyst commented May 24, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Why

Summary

Issue Number

How to Test

Notes

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

all-hands-bot left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

enyst commented May 24, 2026 •

edited

Loading