feat(par): add default restricted shell paths with container-aware prefixing#48236
Conversation
Files inventory check summaryFile checks results against ancestor 1548dc21: Results for datadog-agent_7.79.0~devel.git.206.acdc417.pipeline.104533204-1_amd64.deb:No change detected |
Static quality checks✅ Please find below the results from static quality gates Successful checksInfo
22 successful checks with minimal change (< 2 KiB)
On-wire sizes (compressed)
|
Regression DetectorRegression Detector ResultsMetrics dashboard Baseline: 712311d Optimization Goals: ✅ No significant changes detected
|
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | docker_containers_cpu | % cpu utilization | -0.16 | [-3.11, +2.79] | 1 | Logs |
Fine details of change detection per experiment
| perf | experiment | goal | Δ mean % | Δ mean % CI | trials | links |
|---|---|---|---|---|---|---|
| ➖ | quality_gate_logs | % cpu utilization | +2.81 | [+1.15, +4.48] | 1 | Logs bounds checks dashboard |
| ➖ | quality_gate_metrics_logs | memory utilization | +1.45 | [+1.21, +1.70] | 1 | Logs bounds checks dashboard |
| ➖ | tcp_syslog_to_blackhole | ingress throughput | +1.24 | [+1.08, +1.39] | 1 | Logs |
| ➖ | otlp_ingest_logs | memory utilization | +0.62 | [+0.51, +0.72] | 1 | Logs |
| ➖ | ddot_metrics | memory utilization | +0.38 | [+0.20, +0.55] | 1 | Logs |
| ➖ | quality_gate_idle_all_features | memory utilization | +0.35 | [+0.31, +0.38] | 1 | Logs bounds checks dashboard |
| ➖ | file_to_blackhole_100ms_latency | egress throughput | +0.03 | [-0.05, +0.10] | 1 | Logs |
| ➖ | ddot_metrics_sum_delta | memory utilization | +0.02 | [-0.14, +0.19] | 1 | Logs |
| ➖ | uds_dogstatsd_to_api | ingress throughput | +0.02 | [-0.17, +0.20] | 1 | Logs |
| ➖ | uds_dogstatsd_to_api_v3 | ingress throughput | +0.02 | [-0.18, +0.21] | 1 | Logs |
| ➖ | file_to_blackhole_1000ms_latency | egress throughput | +0.01 | [-0.41, +0.44] | 1 | Logs |
| ➖ | file_to_blackhole_0ms_latency | egress throughput | +0.00 | [-0.52, +0.53] | 1 | Logs |
| ➖ | file_to_blackhole_500ms_latency | egress throughput | +0.00 | [-0.39, +0.39] | 1 | Logs |
| ➖ | tcp_dd_logs_filter_exclude | ingress throughput | -0.00 | [-0.11, +0.11] | 1 | Logs |
| ➖ | quality_gate_idle | memory utilization | -0.03 | [-0.08, +0.01] | 1 | Logs bounds checks dashboard |
| ➖ | ddot_metrics_sum_cumulativetodelta_exporter | memory utilization | -0.04 | [-0.26, +0.18] | 1 | Logs |
| ➖ | docker_containers_cpu | % cpu utilization | -0.16 | [-3.11, +2.79] | 1 | Logs |
| ➖ | docker_containers_memory | memory utilization | -0.23 | [-0.30, -0.16] | 1 | Logs |
| ➖ | file_tree | memory utilization | -0.24 | [-0.30, -0.18] | 1 | Logs |
| ➖ | ddot_metrics_sum_cumulative | memory utilization | -0.29 | [-0.43, -0.15] | 1 | Logs |
| ➖ | uds_dogstatsd_20mb_12k_contexts_20_senders | memory utilization | -0.38 | [-0.44, -0.33] | 1 | Logs |
| ➖ | otlp_ingest_metrics | memory utilization | -0.44 | [-0.60, -0.27] | 1 | Logs |
| ➖ | ddot_logs | memory utilization | -0.62 | [-0.69, -0.55] | 1 | Logs |
Bounds Checks: ❌ Failed
| perf | experiment | bounds_check_name | replicates_passed | observed_value | links |
|---|---|---|---|---|---|
| ✅ | docker_containers_cpu | simple_check_run | 10/10 | 720 ≥ 26 | |
| ✅ | docker_containers_memory | memory_usage | 10/10 | 269.54MiB ≤ 370MiB | |
| ✅ | docker_containers_memory | simple_check_run | 10/10 | 708 ≥ 26 | |
| ✅ | file_to_blackhole_0ms_latency | memory_usage | 10/10 | 0.19GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_0ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | file_to_blackhole_1000ms_latency | memory_usage | 10/10 | 0.23GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_1000ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | file_to_blackhole_100ms_latency | memory_usage | 10/10 | 0.20GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_100ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | file_to_blackhole_500ms_latency | memory_usage | 10/10 | 0.21GiB ≤ 1.20GiB | |
| ✅ | file_to_blackhole_500ms_latency | missed_bytes | 10/10 | 0B = 0B | |
| ✅ | quality_gate_idle | intake_connections | 10/10 | 3 = 3 | bounds checks dashboard |
| ❌ | quality_gate_idle | memory_usage | 9/10 | 175.05MiB > 175MiB | bounds checks dashboard |
| ✅ | quality_gate_idle_all_features | intake_connections | 10/10 | 3 = 3 | bounds checks dashboard |
| ✅ | quality_gate_idle_all_features | memory_usage | 10/10 | 502.75MiB ≤ 550MiB | bounds checks dashboard |
| ✅ | quality_gate_logs | intake_connections | 10/10 | 4 ≤ 6 | bounds checks dashboard |
| ✅ | quality_gate_logs | memory_usage | 10/10 | 208.50MiB ≤ 220MiB | bounds checks dashboard |
| ✅ | quality_gate_logs | missed_bytes | 10/10 | 0B = 0B | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | cpu_usage | 10/10 | 355.55 ≤ 2000 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | intake_connections | 10/10 | 4 ≤ 6 | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | memory_usage | 10/10 | 414.55MiB ≤ 475MiB | bounds checks dashboard |
| ✅ | quality_gate_metrics_logs | missed_bytes | 10/10 | 0B = 0B | bounds checks dashboard |
Explanation
Confidence level: 90.00%
Effect size tolerance: |Δ mean %| ≥ 5.00%
Performance changes are noted in the perf column of each table:
- ✅ = significantly better comparison variant performance
- ❌ = significantly worse comparison variant performance
- ➖ = no significant change in performance
A regression test is an A/B test of target performance in a repeatable rig, where "performance" is measured as "comparison variant minus baseline variant" for an optimization goal (e.g., ingress throughput). Due to intrinsic variability in measuring that goal, we can only estimate its mean value for each experiment; we report uncertainty in that value as a 90.00% confidence interval denoted "Δ mean % CI".
For each experiment, we decide whether a change in performance is a "regression" -- a change worth investigating further -- if all of the following criteria are true:
-
Its estimated |Δ mean %| ≥ 5.00%, indicating the change is big enough to merit a closer look.
-
Its 90.00% confidence interval "Δ mean % CI" does not contain zero, indicating that if our statistical model is accurate, there is at least a 90.00% chance there is a difference in performance between baseline and comparison variants.
-
Its configuration does not mark it "erratic".
CI Pass/Fail Decision
❌ Failed. Some Quality Gates were violated.
- quality_gate_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle, bounds check memory_usage: 9/10 replicas passed. Failed 1 which is > 0. Gate FAILED.
- quality_gate_metrics_logs, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check cpu_usage: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check missed_bytes: 10/10 replicas passed. Gate passed.
- quality_gate_metrics_logs, bounds check intake_connections: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check memory_usage: 10/10 replicas passed. Gate passed.
- quality_gate_idle_all_features, bounds check intake_connections: 10/10 replicas passed. Gate passed.
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: c57c57d1b2
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex conduct a comprehensive security and code review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 595028d70b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex conduct a comprehensive security and code review. keep in mind that |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 595028d70b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| if _, err := os.Stat(hostProc); err == nil { | ||
| return hostProc | ||
| } |
There was a problem hiding this comment.
Would it be better to be more deterministic?
| if _, err := os.Stat(hostProc); err == nil { | |
| return hostProc | |
| } | |
| return hostProc |
There was a problem hiding this comment.
/host/proc will likely exist in k8s environments with the changes we're making to the helm charts and datadog operator. however, in docker environments the same may not be true.
always returning /host/proc in a containerized environment would make it easier to test and behavior more predictable but i don't think we can always guarantee it to be there
…efixing Add /proc and /etc/os-release to the default allowed paths for the restricted shell, and automatically prepend /host when running in a containerized environment. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Serverless-containerized environments (e.g. Fargate) cannot mount host volumes, so unconditionally prefixing with /host produces unusable defaults. Check each path with pathExists() before prefixing, falling back to container-local paths when mounts are absent. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The validation filter was checking IsDir(), silently dropping file paths like /etc/os-release from the allowed paths list. Change to only check existence so both files and directories are preserved. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Remove the PARRestrictedShellProcPath config option and its plumbing through the config adapter, transform, registries, and entrypoint. The proc path is always /proc or /host/proc based on the environment, so determine it at runtime in run_command.go using IsContainerized() gated on path existence. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Test that restricted shell allowed paths and proc path are correctly computed based on whether the agent is containerized and whether host mounts exist. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Change containerizedPathPrefix from const to var so tests can override it to point at temp directories. Tests now create simulated host mount structures and assert exact paths instead of conditionally checking based on the test runner's filesystem. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Revert containerizedPathPrefix to a const. Introduce injectable statFn and parPathExists vars that default to os.Stat/pathExists and can be swapped in tests to simulate host mount presence without touching the filesystem. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…st mounts Always use /host-prefixed paths when containerized instead of silently falling back to container-local paths. Log a warning when the host mount doesn't exist and let rshell handle the missing path at runtime. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
c6c86bd to
ec9e614
Compare
273e223
into
main
|
This seems to be breaking many windows tests. The conversation in slack #windows-products is inconclusive about roll back or fix forward. So, I'm preparing a rollback |
…efixing (#48236) ## Summary - Add `/proc` and `/etc/os-release` to the default allowed paths for the PAR restricted shell - Automatically prepend `/host` to all default paths when running in a containerized environment (detected via `env.IsContainerized()`) - Paths remain user-overridable via config or `DD_PRIVATE_ACTION_RUNNER_RESTRICTED_SHELL_ALLOWED_PATHS` ## Test plan - [ ] Verify default paths are `/var/log`, `/proc`, `/etc/os-release` on bare metal - [ ] Verify default paths are `/host/var/log`, `/host/proc`, `/host/etc/os-release` in a containerized environment - [ ] Verify env var override still works as expected 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-authored-by: matthew.deguzman <matthew.deguzman@datadoghq.com>
Summary
/procand/etc/os-releaseto the default allowed paths for the PAR restricted shell/hostto all default paths when running in a containerized environment (detected viaenv.IsContainerized())DD_PRIVATE_ACTION_RUNNER_RESTRICTED_SHELL_ALLOWED_PATHSTest plan
/var/log,/proc,/etc/os-releaseon bare metal/host/var/log,/host/proc,/host/etc/os-releasein a containerized environment🤖 Generated with Claude Code