Add Phase 5 cross-adapter verification suite (PR 18) by prk-Jr · Pull Request #725 · IABTechLab/trusted-server

prk-Jr · 2026-05-20T16:06:56Z

Summary

Implements the Phase 5 verification gate suite from issue Verification gates #499: route parity, auth parity, cross-adapter behavior, auction error-correlation, HTML golden tests, and Criterion benchmarks
Adds a dedicated parity test binary in crates/integration-tests that drives the Axum and Cloudflare adapters with identical requests in-process, proving behavioral equivalence before cutover
Establishes CI gates for both the parity suite and a benchmark smoke-run so regressions are caught on every PR

Changes

File	Change
`crates/trusted-server-adapter-cloudflare/tests/routes.rs`	10 route smoke tests + 5 basic-auth parity tests + 4 admin key path tests
`crates/trusted-server-adapter-axum/tests/routes.rs`	5 basic-auth parity tests + 3 admin key path tests
`crates/trusted-server-adapter-cloudflare/Cargo.toml`	Added `base64` dev-dependency
`crates/trusted-server-adapter-axum/Cargo.toml`	Added `base64` dev-dependency
`crates/integration-tests/Cargo.toml`	New `[[test]] parity` binary + adapter path deps + edgezero git deps
`crates/integration-tests/tests/parity.rs`	New — 8 cross-adapter in-process parity tests (Axum vs Cloudflare)
`crates/trusted-server-core/src/platform/http.rs`	`PlatformResponse::backend_name` unit tests (error-correlation, pre-EdgeZero #213)
`crates/trusted-server-core/src/html_processor.rs`	4 golden regression tests: injection position, URL rewriting, no double-inject, size bounds
`crates/trusted-server-core/Cargo.toml`	Added `[[bench]] html_processor_bench` entry
`crates/trusted-server-core/benches/html_processor_bench.rs`	New — Criterion benchmarks for 10 KB and 100 KB HTML processing
`.github/workflows/test.yml`	New `test-parity` CI job + benchmark smoke step in `test-axum` job
`docs/superpowers/plans/2026-05-20-pr18-phase5-verification.md`	Implementation plan

Closes

Closes #499

Test plan

cargo fmt --all -- --check
cargo clippy-axum
cargo test-axum (16 tests pass)
cargo test-cloudflare (22 tests pass)
cargo test --manifest-path crates/integration-tests/Cargo.toml --test parity (8 tests pass)
cargo test --lib -p trusted-server-core (956 tests pass)
cargo bench -p trusted-server-core --bench html_processor_bench -- --test (smoke passes)
cargo test-fastly (Viceroy-based — not run locally; covered by existing CI job)
JS tests / JS format / Docs format (no JS or docs changes)

Checklist

Changes follow CLAUDE.md conventions
No unwrap() in production code — use expect("should ...")
Uses tracing macros (not println!)
New code has tests
No secrets or credentials committed

Adds 10 tests to tests/routes.rs covering every explicitly registered route plus the tsjs catch-all wildcard. Assertions are scoped to routing only (not 404) for handlers that require live settings or outbound connections, matching the pattern established by the existing auction test.

Verifies that /admin/* routes return 401 without credentials, include a WWW-Authenticate: Basic realm=... header, and reject wrong credentials; also confirms /.well-known and /auction are not gated by admin auth.

… base64 dep

…enchmarks - Add golden_script_tag_injected_at_head_start: verifies script tag is the first child of <head> with nothing between the opening tag and the injected <script>. - Add golden_url_rewriting_replaces_origin_in_href: verifies origin host is fully replaced by proxy host in href/src attributes. - Add golden_integration_script_is_not_double_injected: verifies the /static/tsjs= script tag appears exactly once. - Add response_size_does_not_grow_disproportionately: verifies processed HTML stays within 2× of input size to catch buffer/double-processing bugs. - Add Criterion benchmark (html_processor_bench) for process_chunk at 10 KB and 100 KB payload sizes.

The build script writes trusted-server-out.toml to ../../target/ relative to crates/trusted-server-core/. When the test-parity CI job builds this crate as a dependency from crates/integration-tests/ (workspace-excluded), the workspace-root target/ directory may not yet exist, causing a panic. Add fs::create_dir_all for the parent path before the write to handle this case robustly.

The renamed tests duplicated coverage already provided by admin_route_with_wrong_credentials_returns_401. Auth middleware rejects any wrong credentials with 401 regardless of body content, so the extra variants added no unique signal.

The previous comment described the wrong divergence (authenticated path). For unauthenticated requests both adapters return 401. Add the missing assert_eq!(axum_status, 401) and assert_eq!(axum_status, cf_status) so the parity claim is actually verified for both adapters.

crates/integration-tests is workspace-excluded so cargo clippy --workspace is blind to it. Add an explicit step so lint regressions in parity.rs are caught on every PR.

2.0 was a magic number. Named constant with comment makes the bound self-documenting: 2× covers injected script tag plus URL rewrites.

The setup-rust-toolchain action does not guarantee clippy is installed when restoring a shared cache. Explicitly request the component so the Clippy (parity test crate) step can find cargo-clippy.

Matches the convention used by the Axum adapter tests and parity tests. Single-threaded tokio can miss races in middleware that spawns tasks.

Matches the five first-party route tests already present in the Cloudflare adapter test suite. A silently removed route in the Axum adapter now fails the test run instead of going undetected.

- Assert Axum also returns WWW-Authenticate header on 401 (was CF-only) - Add admin_deactivate_unauthenticated_parity covering the deactivate path - Rename cookie_behavior_note → publisher_proxy_fallback_parity (name now reflects what the test actually verifies) - Fix expect("collect body") → expect("should collect body") per style guide

Adds inline comment so future maintainers know why the version is pinned with `=` rather than a range constraint.

ChristianPavilonis

I verified the three issues locally and opened stacked fix PR #739 with test/CI-only changes.

Blocking summary:

route smoke tests need to prove explicit registrations, not just non-404 responses behind wildcard fallbacks
cross-adapter parity tests need to compare critical header values, not just presence
CI should run clippy for the integration test crate with --all-targets so the test targets are actually linted

Route smoke tests — false positives via wildcard fallback: - Add registered_routes() helper using RouterService::routes() for introspection - Replace 9 individual assert_ne!(404) tests with all_explicit_routes_are_registered() which checks the route table directly; a removed explicit route now fails even when a wildcard fallback would have returned non-404 Cross-adapter parity tests — value equality not just presence: - admin_rotate_unauthenticated_parity: extract both WWW-Authenticate header values and assert they are equal; assert the shared value starts with Basic - admin_deactivate_unauthenticated_parity: same fix - geo_header_parity_on_all_responses: compare X-Geo-Info-Available values across adapters rather than only checking presence; catches true vs false divergence CI clippy: - Add --all-targets to the integration-tests clippy invocation so test targets (tests/parity.rs, tests/routes.rs) are actually linted

aram356

Summary

Phase 5 verification adds route smoke tests, cross-adapter parity tests, HTML golden tests, error-correlation unit tests, Criterion benchmarks, and CI gates. The architecture is right and the tests cover important contracts, but several assertions are weaker than they read — they pass against the current code yet would also pass against several plausible regressions the suite is supposed to catch. CI is green across all 11 checks; findings below address test rigor, not breakage.

A prior review by ChristianPavilonis (CHANGES_REQUESTED) flagged three issues. A stacked fix PR (#739) addresses them but is not merged into this branch yet, so those findings are still live. I confirm all three and add four more.

Blocking

🔧 wrench

Route smoke tests pass when explicit registration is removed — both adapters end their router with /{*rest} catch-all GET/POST. Smoke tests asserting != 404 cannot distinguish "explicit handler ran" from "wildcard fallback swallowed the request". Inline detail on cloudflare and axum sides. Same root cause for the basic-auth tests on admin routes: AuthMiddleware short-circuits with 401 regardless of whether the explicit handler is registered, so removing .post("/admin/keys/rotate", ...) would not fail any test in this PR.
Parity tests check presence, not value equality — three places use contains_key(...) where parity needs assert_eq!(axum_value, cf_value, ...). Inline detail on geo_header_parity_on_all_responses and the admin 401 parity tests.
CI clippy gate is a no-op — cargo clippy --manifest-path crates/integration-tests/Cargo.toml -- -D warnings misses every [[test]] target (no --all-targets), and even with --all-targets the crate has no [lints] section so workspace lints (unwrap_used = deny, panic = deny) do not apply. The crate is workspace-excluded so lints.workspace = true may not resolve — verify or inline.
auth_middleware_runs_in_chain_for_protected_routes cannot prove what its name claims — the assertions hold even if AuthMiddleware is removed from the chain entirely. Inline detail at cloudflare/tests/routes.rs:43.

Non-blocking

🤔 thinking

Tokio exact pin is redundant with the lockfile and silently drifts — =1.52.3 adds nothing the workspace-excluded Cargo.lock doesn't already enforce, but it WILL break when workspace tokio bumps and nobody remembers to edit this line.
MAX_GROWTH_FACTOR = 2.0 is too loose — observed growth is likely ≤1.05× for the test HTML; a regression that doubles script injection would still pass.
Discovery JSON parity is is_some == is_some — both adapters could return completely different JSON and the test passes.

♻️ refactor

Bench should add no-rewrite and rewrite-dense baselines to separate fixed pipeline overhead from rewrite cost.
Parity helpers rebuild the full router per request — fine now; will dominate test time as the suite grows. Consider a OnceCell<Arc<Router>> per adapter.

📌 out of scope

edgezero git rev is hardcoded in integration-tests/Cargo.toml separately from [workspace.dependencies]. Worth a follow-up CI check or docs note for the dual-update.

🌱 seedling

Parity coverage gaps worth a follow-up issue: response-body equality for /.well-known/trusted-server.json when both adapters succeed; Set-Cookie attribute parity (HttpOnly, Secure, SameSite, Max-Age); Content-Type header parity on success responses. Today's suite covers status + a couple headers; full parity needs the rest.

⛏ nitpick

axum_post and cf_post helpers are asymmetric — chained .expect() vs oneshot() patterns. Optional symmetrize.

CI Status

cargo fmt: PASS
cargo test (axum native): PASS
cargo test (cross-adapter parity): PASS
cargo test (workspace): PASS
cargo check (cloudflare native + wasm32-unknown-unknown): PASS
browser integration tests: PASS
integration tests: PASS
vitest: PASS
format-docs / format-typescript: PASS

All gates green. The findings above are about what the gates would catch next, not what's broken now.

aram356 · 2026-05-28T15:12:21Z

+        .body(edgezero_core::body::Body::from(r#"{"adUnits":[]}"#))
+        .expect("should build request");
+    let resp = router.oneshot(req).await;
+    assert_ne!(resp.status().as_u16(), 404, "/auction must be routed");


🔧 wrench — Route smoke test passes even when the explicit route registration is removed.

Both adapters register a /{*rest} catch-all GET/POST at the end of RouterService::builder() (see cloudflare/src/app.rs:366-367). With this asserting only status != 404, removing the explicit .post("/auction", auction_handler) registration would still pass: the request falls through to post_fallback → handle_publisher_request → 5xx in CI (no origin), which is still != 404. The same false-positive applies to every smoke test in this file that asserts != 404 (verify_signature, admin/keys/, first_party/, /.well-known/trusted-server.json).

Suggested fix: Either introspect the router's registered routes directly, configure settings so the explicit handler returns a distinguishable status (e.g., 200) while the fallback returns 5xx, or inject a per-handler marker header to prove which handler actually ran.

This matches the prior reviewer's finding; PR #739 stacks a fix but is not merged into this branch.

aram356 · 2026-05-28T15:12:21Z

+        resp.status().as_u16(),
+        404,
+        "/first-party/proxy must be routed"
+    );


🔧 wrench — Same wildcard-fallback false positive as the Cloudflare smoke tests.

Axum's router registers /{*rest} at lines 365-366 of src/app.rs. If the explicit .get("/first-party/proxy", fp_proxy_handler) registration were removed, this test still passes because the catch-all routes to get_fallback → handle_publisher_request, returning 5xx (still != 404).

Applies to every != 404 smoke test in this file. Same fix as the Cloudflare side.

aram356 · 2026-05-28T15:12:21Z

+        assert!(
+            cf_headers.contains_key("x-geo-info-available"),
+            "Cloudflare: {method} {path} (status={cf_status}) must have X-Geo-Info-Available"
+        );


🔧 wrench — Header presence check is too weak for a parity assertion.

The module-level contract is that critical header values match across adapters. This test only proves both adapters return some X-Geo-Info-Available header. If Axum returns true and Cloudflare returns false, the test still passes — exactly the kind of regression a parity suite must catch.

Suggested fix:

let axum_value = axum_headers.get("x-geo-info-available").and_then(|v| v.to_str().ok()); let cf_value = cf_headers.get("x-geo-info-available").and_then(|v| v.to_str().ok()); assert_eq!( axum_value, cf_value, "X-Geo-Info-Available value must match across adapters for {method} {path}" );

Matches the prior reviewer's finding; PR #739 has the fix but is not merged here.

aram356 · 2026-05-28T15:12:21Z

+    assert!(
+        cf_www_auth.starts_with("Basic realm="),
+        "Cloudflare 401 WWW-Authenticate must be Basic scheme: {cf_www_auth:?}"
+    );


🔧 wrench — WWW-Authenticate parity is checked unevenly.

Axum side asserts presence only (line 219); Cloudflare side asserts the value starts with "Basic realm=" (line 228). A regression where both adapters return WWW-Authenticate but with different realms (e.g., Basic realm="admin" vs Basic realm="foo") or different schemes (Basic vs Bearer) would not be detected as a parity failure.

Suggested fix: Extract both values, assert axum_www_auth == cf_www_auth to enforce strict parity, OR explicitly document the non-parity dimensions in a comment.

Same issue at admin_deactivate_unauthenticated_parity (both sides presence-only).

aram356 · 2026-05-28T15:12:21Z

+        run: cargo test --manifest-path crates/integration-tests/Cargo.toml --test parity
+
+      - name: Clippy (parity test crate)
+        run: cargo clippy --manifest-path crates/integration-tests/Cargo.toml -- -D warnings


🔧 wrench — Clippy gate runs against nothing.

crates/integration-tests is a publish = false package whose only targets are [[test]] binaries — no [lib], no [[bin]]. Without --all-targets, cargo clippy lints nothing meaningful, so tests/parity.rs and tests/integration.rs are not actually gated.

There is also a second issue: crates/integration-tests/Cargo.toml has no [lints] section. Because the crate is workspace-excluded (root [workspace].exclude), it does not inherit the workspace's unwrap_used = deny / panic = deny / print_stdout = warn lints even with --all-targets. Both gaps must be closed for the gate to mean anything.

Suggested fix:

run: cargo clippy --manifest-path crates/integration-tests/Cargo.toml --all-targets -- -D warnings

Additionally, add [lints] to crates/integration-tests/Cargo.toml (inline the workspace lints — lints.workspace = true may not resolve in an excluded crate).

The prior reviewer flagged the --all-targets half; the [lints] inheritance gap is additional.

aram356 · 2026-05-28T15:12:21Z

    use std::sync::Arc;

+    // 2× accounts for the injected tsjs script tag plus URL attribute rewrites.
+    const MAX_GROWTH_FACTOR: f64 = 2.0;


🤔 thinking — MAX_GROWTH_FACTOR = 2.0 is too loose to catch real regressions.

Typical growth on the included html_processor.test.html is probably ≤1.05× (one injected script tag + a few URL rewrites). A regression that doubles script-tag injection or duplicates URL rewriting would land around 1.02-1.10× — still well under 2× — and this test would pass.

Suggested fix: Either tighten the bound to a value closer to observed real growth (e.g., 1.3×), or assert an absolute byte budget tied to the known script-tag length: assert!(output_size <= input_size + EXPECTED_INJECTION_BYTES + URL_REWRITE_DELTA_BYTES).

aram356 · 2026-05-28T15:12:22Z

+        cf_json.is_some(),
+        "/.well-known/trusted-server.json body JSON-parsability must match across adapters \
+         (axum_status={axum_status} cf_status={cf_status})"
+    );


🤔 thinking — is_some parity is weaker than the parity goal.

If both adapters return JSON, this passes — even if the JSON bodies are completely different (one {"error":...}, one {"keys":[]}). The intent is parity of the discovery payload.

Suggested fix: Either compare top-level key sets (when both parse as JSON objects) or, if you control test settings, set up signing keys so both adapters return the success payload and assert payload equality. The current assertion only catches the case where one adapter returns JSON and the other returns HTML/text, which is unlikely to regress.

aram356 · 2026-05-28T15:12:22Z

+fn bench_html_processor(c: &mut Criterion) {
+    let mut group = c.benchmark_group("html_processor");
+
+    for size_kb in [10usize, 100] {


♻️ refactor — Add baselines that isolate fixed vs variable cost.

Current bench measures one input shape. Two additional inputs help regressions stay visible:

HTML with zero occurrences of origin_host — measures fixed pipeline overhead (rewriter init, lol_html parse).

HTML with rewrite-dense content (many origin-host URLs per KB) — measures rewriting cost per occurrence.

A bug that adds per-element overhead or duplicates rewriter passes shows up clearly in the second baseline; one that adds fixed init cost shows up in the first.

aram356 · 2026-05-28T15:12:22Z

+
+/// Send a GET request to the Axum adapter and return (status, headers).
+async fn axum_get(uri: &str) -> (u16, HeaderMap) {
+    let mut svc = EdgeZeroAxumService::new(AxumApp::routes());


♻️ refactor — Helpers rebuild the full service per request.

Every axum_get / cf_get / axum_post / cf_post invocation calls AxumApp::routes() / CloudflareApp::routes(), which goes through the full settings + integration-registry boot path each time. With 8 tests × ~2 requests each, that's ~16+ router builds.

Not blocking now, but as the parity suite grows this will dominate test time. Consider a OnceCell<Arc<Router>> per adapter, scoped to the test binary.

aram356 · 2026-05-28T15:12:22Z

+}
+
+/// Send a POST request to the Cloudflare adapter and return (status, headers, body bytes).
+async fn cf_post(uri: &str, body: &str) -> (u16, HeaderMap, bytes::Bytes) {


⛏ nitpick — cf_post and axum_post are asymmetric.

axum_post unwraps three layers: .ready().await.expect().call().await.expect() then body collect.
cf_post (here) uses .oneshot() + into_bytes() synchronously.

The asymmetry isn't wrong, but reading the two paths side-by-side requires the reader to mentally translate. A pair of small helper traits or thin wrapper functions could symmetrize the call sites. Optional cleanup.

prk-Jr added 15 commits May 20, 2026 20:32

Add Phase 5 verification plan for PR-18

a499f93

Fix rustfmt formatting in Cloudflare route smoke tests

acd3f5d

Add basic-auth parity tests to Axum and Cloudflare adapters

7efa2ff

Verifies that /admin/* routes return 401 without credentials, include a WWW-Authenticate: Basic realm=... header, and reject wrong credentials; also confirms /.well-known and /auction are not gated by admin auth.

Fix unwrap_or and comment inconsistency in basic-auth parity tests

69d3693

Add admin key route full path coverage to Axum and Cloudflare adapters

75fac49

Tighten storage-fail assertion and add duplicate context comments

e891f5e

Add cross-adapter in-process parity test suite (Axum vs Cloudflare)

72be307

Fix parity test review issues: expect messages, dead if guard, unused…

4f79021

… base64 dep

Add error-correlation unit tests for PlatformResponse backend_name

040a77b

Add criterion::black_box to benchmark input for measurement purity

0e4d2ea

Add cross-adapter parity and benchmark CI gates for Phase 5 verification

24f122f

Add inline comment explaining -- --test flag in benchmark CI step

99967c8

Update Cargo.lock files after adding parity test dependencies

8a509ac

prk-Jr self-assigned this May 20, 2026

prk-Jr added 13 commits May 20, 2026 21:43

Pin axum/tower/tokio versions in integration-tests to match workspace

c114209

Pin tokio to exact workspace version =1.52.3 in integration-tests

84cbdfe

Add clippy gate for integration-tests crate in test-parity CI job

5123bd4

crates/integration-tests is workspace-excluded so cargo clippy --workspace is blind to it. Add an explicit step so lint regressions in parity.rs are caught on every PR.

Extract MAX_GROWTH_FACTOR constant in html_processor growth test

641bfde

2.0 was a magic number. Named constant with comment makes the bound self-documenting: 2× covers injected script tag plus URL rewrites.

Fix trailing blank line in Cloudflare routes test file

d7670aa

Add clippy component to test-parity toolchain setup

b6a8894

The setup-rust-toolchain action does not guarantee clippy is installed when restoring a shared cache. Explicitly request the component so the Clippy (parity test crate) step can find cargo-clippy.

Use multi_thread tokio flavor for all CF adapter route tests

a521479

Matches the convention used by the Axum adapter tests and parity tests. Single-threaded tokio can miss races in middleware that spawns tasks.

Add first-party route smoke tests to Axum adapter

5f91da2

Matches the five first-party route tests already present in the Cloudflare adapter test suite. A silently removed route in the Axum adapter now fails the test run instead of going undetected.

Document tokio exact pin in integration-tests Cargo.toml

161c41e

Adds inline comment so future maintainers know why the version is pinned with `=` rather than a range constraint.

prk-Jr requested a review from aram356 May 20, 2026 17:15

prk-Jr requested a review from ChristianPavilonis May 20, 2026 17:15

ChristianPavilonis requested changes May 27, 2026

View reviewed changes

Comment thread crates/trusted-server-adapter-cloudflare/tests/routes.rs Outdated

Comment thread crates/integration-tests/tests/parity.rs Outdated

Comment thread crates/integration-tests/tests/parity.rs Outdated

Comment thread .github/workflows/test.yml Outdated

prk-Jr linked an issue May 28, 2026 that may be closed by this pull request

Verification gates #499

Open

aram356 requested changes May 28, 2026

View reviewed changes

Conversation

prk-Jr commented May 20, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes

Closes

Test plan

Checklist

Uh oh!

ChristianPavilonis left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

aram356 left a comment

Choose a reason for hiding this comment

Summary

Blocking

🔧 wrench

Non-blocking

🤔 thinking

♻️ refactor

📌 out of scope

🌱 seedling

⛏ nitpick

CI Status

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

prk-Jr commented May 20, 2026 •

edited

Loading