Fix CDP proxy reconnect after Chromium restart #189

Merged
sjmiller609 merged 4 commits into main from
steven/cus-109-cdp-proxy-doesnt-reconnect-to-chromium-after-supervisorctl
Mar 26, 2026

@sjmiller609 sjmiller609 commented Mar 25, 2026

Summary

Fixes CUS-109: CDP proxy doesn't reconnect to Chromium after supervisorctl restart chromium.

The WebSocketProxyHandler now wires into UpstreamManager.Subscribe() to handle Chromium restarts:

  1. Proactive teardown — Each proxy session subscribes to upstream URL changes. When Chromium restarts and emits a new DevTools URL, the pump context is cancelled, closing both sides and forcing clients to reconnect with the fresh upstream.

  2. Stale URL retry — If dialing the current upstream fails (race where supervisorctl restart cycles faster than the tail picks up the new URL), the handler waits up to 5s for a new URL via Subscribe and retries the dial once.
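
The stale URL retry in item 2 can be sketched roughly as follows. This is a minimal illustration, not the code from proxy.go: `dialFunc`, `dialWithOneRetry`, and `retryExample` are hypothetical names, the dial is a stub rather than a real WebSocket dial, and the `updates` channel stands in for whatever `UpstreamManager.Subscribe()` actually returns.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// dialFunc is a hypothetical stand-in for a WebSocket dial.
type dialFunc func(ctx context.Context, url string) error

// dialWithOneRetry dials current. If that fails, it waits up to wait for a
// different URL on updates and retries the dial exactly once.
func dialWithOneRetry(ctx context.Context, dial dialFunc, current string,
	updates <-chan string, wait time.Duration) (string, error) {
	firstErr := dial(ctx, current)
	if firstErr == nil {
		return current, nil
	}
	timer := time.NewTimer(wait)
	defer timer.Stop()
	for {
		select {
		case next, ok := <-updates:
			if !ok {
				return "", fmt.Errorf("updates closed: %w", firstErr)
			}
			if next == "" || next == current {
				continue // not a new URL, keep waiting
			}
			if err := dial(ctx, next); err != nil {
				return "", fmt.Errorf("retry dial failed: %w", err)
			}
			return next, nil
		case <-timer.C:
			return "", fmt.Errorf("no new upstream within %s: %w", wait, firstErr)
		case <-ctx.Done():
			return "", ctx.Err()
		}
	}
}

// retryExample simulates a restart: the old URL refuses connections and a
// new URL is already queued on the updates channel.
func retryExample() (string, error) {
	const oldURL = "ws://127.0.0.1:9222/devtools/browser/old"
	const newURL = "ws://127.0.0.1:9222/devtools/browser/new"
	updates := make(chan string, 1)
	updates <- newURL
	dial := func(_ context.Context, url string) error {
		if url == oldURL {
			return errors.New("connection refused")
		}
		return nil
	}
	return dialWithOneRetry(context.Background(), dial, oldURL, updates, 5*time.Second)
}

func main() {
	url, err := retryExample()
	fmt.Println(url, err)
}
```

Bounding the wait with a timer and retrying only once keeps a client from hanging indefinitely when Chromium never comes back.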

Only server/lib/devtoolsproxy/proxy.go is changed. wsproxy.go is untouched — the handler now calls websocket.Accept, websocket.Dial, and wsproxy.Pump directly instead of going through wsproxy.Proxy, giving it the control needed to wire in the subscribe/cancel logic.
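
The subscribe/cancel wiring can be pictured with a small sketch. It assumes a channel-based subscription and uses context cancellation as a stand-in for `wsproxy.Pump` returning; `watchUpstream`, `runSession`, and `tornDownOnRestart` are hypothetical names and do not mirror the real handler.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// watchUpstream cancels the session as soon as a URL different from the one
// that was dialed arrives on updates.
func watchUpstream(ctx context.Context, cancel context.CancelFunc,
	dialed string, updates <-chan string) {
	for {
		select {
		case next, ok := <-updates:
			if !ok {
				return
			}
			if next != dialed {
				cancel() // upstream changed: tear down both sides
				return
			}
		case <-ctx.Done():
			return
		}
	}
}

// runSession simulates one proxy session. Waiting on ctx.Done() stands in
// for the pump, which would return once its context is cancelled. It reports
// whether the session was torn down before the timeout elapsed.
func runSession(dialed string, updates <-chan string, timeout time.Duration) bool {
	ctx, cancel := context.WithCancel(context.Background())
	defer cancel()
	go watchUpstream(ctx, cancel, dialed, updates)
	select {
	case <-ctx.Done():
		return true
	case <-time.After(timeout):
		return false
	}
}

// tornDownOnRestart feeds the session a new URL, as a Chromium restart would.
func tornDownOnRestart() bool {
	updates := make(chan string, 1)
	updates <- "ws://127.0.0.1:9222/devtools/browser/new"
	return runSession("ws://127.0.0.1:9222/devtools/browser/old", updates, 2*time.Second)
}

func main() {
	fmt.Println("torn down:", tornDownOnRestart())
}
```

Cancelling a shared context lets one signal close both the client and upstream sides, which is why the client then has to reconnect and pick up the fresh upstream.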

Testing

Manual CDP Reconnect Validation

Ran a live container from this branch and exercised the CDP proxy on :9222 from the host with raw CDP commands.

Test flow

  • Container startup to healthy API/CDP: about 4-5s
  • Pre-restart CDP session:
    • connected to browser WebSocket
    • created/attached to a target
    • navigated, evaluated JS, captured screenshot
    • completed in under 5s
  • Chromium restart:
    • supervisorctl restart chromium returned immediately with stopped/started
    • proxy returned 502 on /json/version for about 6s
    • new DevTools endpoint appeared about 6s after restart
  • Post-restart CDP session:
    • reconnected to new browser WebSocket
    • repeated navigate/evaluate/screenshot successfully
    • completed in under 5s
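
A client driving this flow needs to wait out the 502 window before reconnecting. One way to do that is to poll /json/version until it returns 200, sketched below. This is an illustration only: `waitForVersion` and `simulateRestart` are hypothetical helpers, and the test server merely imitates a proxy that answers 502 twice before recovering.

```go
package main

import (
	"context"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// waitForVersion polls url until it answers 200 OK or ctx expires.
// It returns the number of requests it made.
func waitForVersion(ctx context.Context, url string, interval time.Duration) (int, error) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	attempts := 0
	for {
		attempts++
		req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)
		if err != nil {
			return attempts, err
		}
		resp, err := http.DefaultClient.Do(req)
		if err == nil {
			resp.Body.Close()
			if resp.StatusCode == http.StatusOK {
				return attempts, nil
			}
		}
		select {
		case <-ctx.Done():
			return attempts, ctx.Err()
		case <-ticker.C:
		}
	}
}

// simulateRestart runs waitForVersion against a fake endpoint that returns
// 502 for the first two requests and 200 afterwards.
func simulateRestart() (int, error) {
	var calls atomic.Int32
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, _ *http.Request) {
		if calls.Add(1) <= 2 {
			w.WriteHeader(http.StatusBadGateway)
			return
		}
		w.WriteHeader(http.StatusOK)
	}))
	defer srv.Close()
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	defer cancel()
	return waitForVersion(ctx, srv.URL+"/json/version", 10*time.Millisecond)
}

func main() {
	attempts, err := simulateRestart()
	fmt.Println("attempts:", attempts, "err:", err)
}
```

Against the live container, the URL would instead point at the proxy on :9222, with a timeout comfortably above the roughly 6s gap observed in this run.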

Observed behavior

  • Before restart:
    • browser: Chrome/146.0.7680.164
    • JS evaluation succeeded (sum=21, expected title/DOM state)
    • screenshot captured (5539 bytes)
  • During restart:
    • old CDP socket closed with 1006
    • stale socket rejected further commands
    • /json/version briefly returned 502
  • After restart:
    • proxy updated to a new browser WebSocket
    • JS evaluation succeeded again (sum=21, expected title/DOM state)
    • screenshot captured (5210 bytes)

Result

  • Pass: real end-to-end CDP reconnect works after supervisorctl restart chromium.

Note

Medium Risk
Changes WebSocket proxy connection lifecycle and retry behavior around upstream restarts, which could affect stability for all DevTools clients if edge cases or timing assumptions are wrong.

Overview
Fixes CDP proxy behavior when Chromium restarts by wiring WebSocketProxyHandler into UpstreamManager.Subscribe() so active proxy sessions are cancelled when the upstream DevTools URL changes, forcing clients to reconnect.

Adds a bounded dial retry path (dialUpstreamWithRetry) that re-checks mgr.Current() after a failed dial and waits briefly for a new upstream URL, plus a test-only hook (devtoolsProxyTestHook + env vars) to reliably reproduce the reconnect race.

Extends coverage with a new e2e test that restarts Chromium mid-connection and verifies that stale sockets close and reject further commands and that a new session can re-establish and run CDP commands. Also adds a unit test for the missed-update retry scenario.

Written by Cursor Bugbot for commit b00410f. This will update automatically on new commits.

…arts

When Chromium restarts via supervisorctl, the CDP WebSocket proxy now:

1. Subscribes to upstream URL changes per proxy session, so active
   connections are proactively closed when the upstream URL changes
   (forcing clients to reconnect with the fresh URL).

2. If dialing the current upstream URL fails (stale URL from a fast
   restart cycle), waits up to 5s for a new URL from Subscribe and
   retries the dial once before giving up.

Fixes CUS-109.
@sjmiller609 sjmiller609 marked this pull request as ready for review March 26, 2026 16:06
@sjmiller609 sjmiller609 requested a review from rgarcia March 26, 2026 16:09

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@sjmiller609 sjmiller609 merged commit 79a089d into main Mar 26, 2026
5 checks passed
@sjmiller609 sjmiller609 deleted the steven/cus-109-cdp-proxy-doesnt-reconnect-to-chromium-after-supervisorctl branch March 26, 2026 18:42
AbdulRashidReshamwala added a commit to reclaimprotocol/popcorn-images that referenced this pull request Mar 27, 2026
Resolved conflicts keeping both Reclaim-specific features and upstream improvements:

- config.go: Kept TEE configuration (TEEKUrl, TEETUrl, AttestorUrl)
- openapi.yaml: Kept /reclaim/prove endpoint and ReclaimProve* schemas
- go.mod: Kept all Reclaim TEE dependencies and updated to latest versions
- log.ts: Kept OpenTelemetry logging wrapper around base loggers
- proxy.go: Adopted upstream's improved CDP reconnection logic with dialUpstreamWithRetry
- Dockerfiles: Hybrid approach - auto-detect ChromeDriver version with override option

Key upstream improvements integrated:
- CDP proxy reconnection on Chromium restart (kernel#189, kernel#191)
- Improved fonts support (kernel#165)
- Better error handling in devtools proxy
- Updated dependency versions

All conflicts resolved, go.sum regenerated, oapi.go regenerated.
2 participants