Python: Add Hyperlight CodeAct package and docs#5185

Open
eavanvalkenburg wants to merge 39 commits into microsoft:main from eavanvalkenburg:code_mode

Conversation

@eavanvalkenburg
Member

@eavanvalkenburg eavanvalkenburg commented Apr 9, 2026

Motivation and Context

Add a concrete, optional CodeAct implementation for Python and capture the cross-SDK design for CodeAct with Hyperlight. This provides a reusable path for long-running agents to execute sandboxed code with provider-owned tools, file mounts, and network allow-lists without baking CodeAct into core.

Description

  • add ADR 0024 plus Python feature design notes for the CodeAct and Hyperlight design
  • introduce the alpha agent-framework-hyperlight package with HyperlightCodeActProvider and HyperlightExecuteCodeTool
  • add provider-managed tool, file, and network CRUD; derived approval behavior; serializable provider state; and Hyperlight-backed execution results
  • move the CodeAct samples into the new package and update workspace/package metadata
  • add unit coverage, a guarded real-sandbox integration test, and wire Hyperlight into the Python misc integration workflow
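
To make "provider-managed CRUD" and "serializable provider state" concrete, here is a minimal standalone sketch. `SandboxRegistry` and all of its method names are hypothetical stand-ins invented for this illustration; they are not the API of `HyperlightCodeActProvider`, and the real package may structure this differently.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field


@dataclass
class SandboxRegistry:
    """Hypothetical stand-in for provider-owned sandbox configuration."""

    file_mounts: dict[str, str] = field(default_factory=dict)  # guest path -> host path
    allowed_domains: set[str] = field(default_factory=set)

    def add_file_mount(self, guest_path: str, host_path: str) -> None:
        self.file_mounts[guest_path] = host_path

    def remove_file_mount(self, guest_path: str) -> None:
        self.file_mounts.pop(guest_path, None)

    def allow_domain(self, domain: str) -> None:
        self.allowed_domains.add(domain.lower())

    def revoke_domain(self, domain: str) -> None:
        self.allowed_domains.discard(domain.lower())

    def to_json(self) -> str:
        # Sort the set so the serialized form is deterministic.
        return json.dumps(
            {
                "file_mounts": self.file_mounts,
                "allowed_domains": sorted(self.allowed_domains),
            }
        )

    @classmethod
    def from_json(cls, raw: str) -> SandboxRegistry:
        data = json.loads(raw)
        return cls(
            file_mounts=dict(data["file_mounts"]),
            allowed_domains=set(data["allowed_domains"]),
        )


registry = SandboxRegistry()
registry.add_file_mount("/input", "./data")
registry.allow_domain("Example.com")

# Round-trip through JSON, as a session store might do between runs.
restored = SandboxRegistry.from_json(registry.to_json())
```

The point of the sketch is only that the registry lives on the provider and can be persisted, so the agent's direct tool list never has to encode sandbox settings.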

Closes: #5187

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings April 9, 2026 14:00
@moonbox3 moonbox3 added the documentation and python labels Apr 9, 2026
Contributor

Copilot AI left a comment

Pull request overview

Adds an optional Python Hyperlight-backed CodeAct implementation plus cross-SDK design documentation, and wires the new package into the Python workspace and CI.

Changes:

  • Introduces the new agent-framework-hyperlight alpha package (provider + execute_code tool), including samples and tests.
  • Updates agent-framework-core to let context providers inspect/override per-run runtime tools via SessionContext.options["tools"].
  • Adds ADR/design docs for CodeAct and updates Python CI workflows to include Hyperlight integration coverage.
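
As a rough illustration of the second change, the sketch below shows a provider-style hook inspecting and overriding the per-run tool list through an `options["tools"]` entry. `FakeSessionContext` and `remove_tools_named` are hypothetical names made up for this example; the only detail taken from the PR is that runtime tools are exposed under `options["tools"]`.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class FakeSessionContext:
    """Hypothetical stand-in for a per-run context exposing mutable options."""

    options: dict[str, Any] = field(default_factory=dict)


def remove_tools_named(context: FakeSessionContext, blocked: set[str]) -> list[str]:
    """Drop runtime tools whose function name is blocked; return removed names."""
    tools: list[Callable[..., Any]] = list(context.options.get("tools", []))
    kept = [t for t in tools if t.__name__ not in blocked]
    removed = [t.__name__ for t in tools if t.__name__ in blocked]
    # Write the filtered list back; in this sketch the agent is assumed to
    # resolve its tools from the mutated options after providers have run.
    context.options["tools"] = kept
    return removed


def send_email(to: str) -> str:
    return f"sent to {to}"


def fetch_docs(query: str) -> str:
    return f"docs for {query}"


ctx = FakeSessionContext(options={"tools": [send_email, fetch_docs]})
removed = remove_tools_named(ctx, {"fetch_docs"})
```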

Reviewed changes

Copilot reviewed 27 out of 28 changed files in this pull request and generated 6 comments.

| File | Description |
|---|---|
| python/uv.lock | Adds Hyperlight package + Hyperlight sandbox deps; updates a few dependency markers. |
| python/pyproject.toml | Registers agent-framework-hyperlight in the Python workspace. |
| python/packages/hyperlight/tests/hyperlight/test_hyperlight_codeact.py | Adds unit coverage + guarded real-sandbox integration test. |
| python/packages/hyperlight/samples/README.md | Documents how to run the new Hyperlight samples. |
| python/packages/hyperlight/samples/codeact_tool.py | Standalone HyperlightExecuteCodeTool sample. |
| python/packages/hyperlight/samples/codeact_context_provider.py | Provider-owned CodeAct sample using HyperlightCodeActProvider. |
| python/packages/hyperlight/README.md | Package-level README for installation and public API. |
| python/packages/hyperlight/pyproject.toml | New package metadata, deps, and tooling config. |
| python/packages/hyperlight/LICENSE | Adds MIT license for the new package. |
| python/packages/hyperlight/agent_framework_hyperlight/_types.py | Adds public types (FileMount, FilesystemMode, NetworkMode). |
| python/packages/hyperlight/agent_framework_hyperlight/_provider.py | Implements HyperlightCodeActProvider context provider. |
| python/packages/hyperlight/agent_framework_hyperlight/_instructions.py | Builds dynamic CodeAct instructions and tool descriptions. |
| python/packages/hyperlight/agent_framework_hyperlight/_execute_code_tool.py | Implements sandbox execution, caching, CRUD registries for tools/files/network. |
| python/packages/hyperlight/agent_framework_hyperlight/__init__.py | Exposes public API + version metadata. |
| python/packages/core/tests/core/test_agents.py | Adds tests validating providers can inspect/remove runtime tools. |
| python/packages/core/agent_framework/_tools.py | Introduces ApprovalMode type alias and updates signatures. |
| python/packages/core/agent_framework/_sessions.py | Updates docs to reflect provider mutability of options["tools"]. |
| python/packages/core/agent_framework/_agents.py | Passes runtime tools via SessionContext.options and resolves tools from provider-mutated options. |
| python/PACKAGE_STATUS.md | Adds agent-framework-hyperlight as alpha. |
| python/.cspell.json | Adds codeact and hyperlight to dictionary. |
| docs/features/code_act/python-implementation.md | Adds Python-specific CodeAct design notes and API contract. |
| docs/features/code_act/dotnet-implementation.md | Adds placeholder for .NET CodeAct implementation notes. |
| docs/decisions/0024-codeact-integration.md | Adds ADR covering cross-SDK CodeAct integration approach and approval model. |
| .github/workflows/python-merge-tests.yml | Includes Hyperlight tests in “misc integration” selection. |
| .github/workflows/python-integration-tests.yml | Includes Hyperlight tests in “misc integration” job. |

@moonbox3
Contributor

moonbox3 commented Apr 9, 2026

Python Test Coverage

Python Test Coverage Report

| File | Stmts | Miss | Cover | Missing |
|---|---|---|---|---|
| **packages/core/agent_framework** | | | | |
| _agents.py | 415 | 52 | 87% | 461, 470, 525, 1020, 1065, 1138–1142, 1205, 1233, 1270, 1291, 1311–1312, 1317, 1364, 1406, 1428, 1430, 1443, 1449, 1494, 1496, 1505–1510, 1515, 1517, 1523–1524, 1531, 1533–1534, 1542–1543, 1546–1548, 1558–1563, 1567, 1572, 1574 |
| _sessions.py | 273 | 30 | 89% | 82–84, 86–89, 106–107, 109–113, 192–193, 283, 544–548, 590, 593, 627, 676, 680, 690, 823, 839 |
| _tools.py | 948 | 87 | 90% | 191–192, 365, 367, 380, 405–407, 415, 433, 447, 454, 461, 484, 486, 493, 501, 540, 584, 588, 620–622, 630, 675–677, 679, 702, 728, 732, 770–772, 776, 798, 910–916, 952, 964, 966, 968, 971–974, 995, 999, 1003, 1017–1019, 1360, 1382, 1469–1475, 1604, 1608, 1654, 1715–1716, 1831, 1851, 1853, 1909, 1972, 2144–2145, 2165, 2221–2222, 2282, 2360–2361, 2428, 2433, 2440 |
| **packages/hyperlight/agent_framework_hyperlight** | | | | |
| _execute_code_tool.py | 449 | 75 | 83% | 63, 99–100, 119, 121, 134, 152, 162, 187, 192, 199, 205, 213, 221–223, 225–230, 270, 275, 277, 279, 296–297, 306–309, 336–339, 345, 347, 357–358, 390–391, 394–395, 402, 430, 458–459, 462, 466, 512–513, 539, 602, 638, 644–646, 675–679, 683–684, 689, 706–710, 714–715, 771–772 |
| _instructions.py | 44 | 5 | 88% | 14, 33, 45–46, 56 |
| _provider.py | 42 | 11 | 73% | 53, 57, 65, 69, 73, 77, 81, 85, 89, 93, 97 |
| _types.py | 13 | 0 | 100% | |
| **TOTAL** | 28096 | 3289 | 88% | |

Python Unit Test Overview

| Tests | Skipped | Failures | Errors | Time |
|---|---|---|---|---|
| 5588 | 22 💤 | 0 ❌ | 0 🔥 | 1m 33s ⏱️ |

Contributor

@moonbox3 moonbox3 left a comment

This is a nice and complete ADR - well done. A lot here to unpack so doing a first pass with some questions.

@eavanvalkenburg eavanvalkenburg force-pushed the code_mode branch 2 times, most recently from b309856 to fe24c4f Compare April 14, 2026 07:27
eavanvalkenburg and others added 20 commits April 14, 2026 13:36
eavanvalkenburg and others added 7 commits April 14, 2026 14:02
Enable the sandbox filesystem by providing a workspace_root so
/output is mounted. Remove os.path.exists assertion (unsupported
in WASM guest) and fix Content data assertion to use .uri.
Skip the network integration test on Windows where the WASM
sandbox lacks the encodings.idna codec.

Co-authored-by: Copilot <[email protected]>

## Context and Problem Statement

We need an architecture design that supports CodeAct in both Python and .NET. This is a necessary capability for the current generation of long-running agents, which need to plan, iterate, transform tool outputs, and execute bounded code inside a controlled runtime instead of pushing every intermediate step back through the model. The design should preserve the same behavioral contract across SDKs, but it does not need to use the same internal extension point in each runtime. We also want to standardize on Hyperlight as the initial backend, using the existing Python package and an anticipated .NET binding package once it is available.
Contributor

instead of pushing every intermediate step back through the model

Could we elaborate on what this means?

Also, an introduction to CodeAct would greatly help readers of this doc.

- Good, because a provider-owned CodeAct tool registry avoids mutating or inferring the agent's direct tool surface and can work consistently in both SDKs.
- Good, because the same conceptual design can remain open to `HyperlightCodeActProvider`, a future `MontyCodeActProvider`, and other backend-specific providers over time.
- Good, because `execute_code` can evolve into multiple backend-specific runtime modes rather than being hard-wired to one Python-plus-tools mode.
- Bad, because it is a bolt-on, which might make it less runtime efficient.
Contributor

make it less runtime efficient
Why will a bolt-on make it less efficient?


## What is the problem being solved?

- Today, the easiest way to prototype CodeAct is to infer or reshape the agent's direct tool surface, which is fragile and hard to reason about.
Contributor

Should this be part of the ADR instead?

Comment on lines +226 to +230
- snapshotting the current CodeAct-managed tool registry and capability settings for the run,
- computing the effective approval requirement for `execute_code` from the provider default and the snapshotted tool registry,
- adding a short CodeAct guidance block,
- adding `execute_code` to the run through `SessionContext.extend_tools(...)`,
- and wiring any backend-specific execution state needed for the run.
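
The first two steps in the list above (snapshotting the registry, then deriving the approval requirement for `execute_code`) can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `ToolEntry`, `snapshot`, and `effective_approval` are hypothetical names, the two approval literals are an assumed spelling, and the rule "require approval if the default or any snapshotted tool requires it" is one plausible reading of "derived approval behavior".

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Literal

# Assumed spelling of the two modes; treat these literals as illustrative.
Approval = Literal["always_require", "never_require"]


@dataclass(frozen=True)
class ToolEntry:
    """Hypothetical registry entry for a sandbox-callable tool."""

    name: str
    approval_mode: Approval


def snapshot(registry: dict[str, ToolEntry]) -> tuple[ToolEntry, ...]:
    """Freeze the registry so later CRUD calls cannot affect the current run."""
    return tuple(registry.values())


def effective_approval(default: Approval, tools: tuple[ToolEntry, ...]) -> Approval:
    """Require approval if the default does, or if any snapshotted tool does."""
    if default == "always_require":
        return default
    if any(t.approval_mode == "always_require" for t in tools):
        return "always_require"
    return "never_require"


registry = {
    "lookup": ToolEntry("lookup", "never_require"),
    "delete_file": ToolEntry("delete_file", "always_require"),
}
run_tools = snapshot(registry)
del registry["delete_file"]  # mutation after the snapshot does not change this run
mode = effective_approval("never_require", run_tools)
```

In this sketch the derived mode depends on whatever the registry holds when a run starts, so a value computed once at construction time could go stale after a later CRUD call.
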
Contributor

Are these required for each run? Could they be done once at construction time, injecting the available tools into the agent's tool list?

    client=client,
    name="assistant",
    tools=[send_email],  # direct-only tool
    context_providers=[codeact],
Contributor

Looking at this, a question users may have is: what is the difference between tools and contexts?

Just an idea: is it possible to do the following

agent = Agent(
    client=client,
    name="assistant",
    tools=[send_email, *codeact.get_tools()],
)

where the returned tools hold a reference to the provider so that they can access the file mounts, allowed domains, etc.?
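
For illustration only, the `get_tools()` idea could be sketched as below. `SketchCodeActProvider` is a hypothetical class written for this example and does not exist in the PR; the tool body just formats a string instead of running anything in a sandbox.

```python
from __future__ import annotations

from typing import Callable


class SketchCodeActProvider:
    """Hypothetical provider; not the real HyperlightCodeActProvider."""

    def __init__(self, allowed_domains: list[str]) -> None:
        self.allowed_domains = list(allowed_domains)

    def get_tools(self) -> list[Callable[[str], str]]:
        def execute_code(code: str) -> str:
            # The closure reads provider state at call time, so later changes
            # to allowed_domains are visible without rebuilding the tool.
            domains = ",".join(self.allowed_domains) or "none"
            return f"would run {len(code)} chars with network access to: {domains}"

        return [execute_code]


codeact = SketchCodeActProvider(allowed_domains=["example.com"])
tools = [*codeact.get_tools()]
codeact.allowed_domains.append("contoso.com")
result = tools[0]("print(1)")
```

One trade-off visible in this shape: the tool sees live provider state rather than a per-run snapshot, so anything derived from that state (such as an approval requirement) would need to be recomputed at call time rather than once per run.
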

agent = Agent(
    client=client,
    name="interpreter",
    context_providers=[code_interpreter],
Contributor

Adding to the previous comment, code_interpreter is reusing an existing concept whose usage is very different.

Comment on lines +5 to +9
- `codeact_context_provider.py` shows the provider-owned CodeAct model where the
agent only sees `execute_code` and sandbox tools are owned by
`HyperlightCodeActProvider`.
- `codeact_tool.py` shows the standalone `HyperlightExecuteCodeTool` surface
where `execute_code` is added directly to the agent tool list.
Contributor

A short paragraph on when to use which would be helpful for customers.

Labels

documentation, python

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Python: [Feature]: CodeAct python implementation

4 participants