Refactor/better outbound request types by Stevenjin8 · Pull Request #1772 · istio/ztunnel

Stevenjin8 · 2026-03-04T17:51:32Z

The way I kinda want to see outbound.rs change

…eEntry - Proto: Add ConnectStrategy enum and field to LoadBalancing message - Rust model: Add ConnectStrategy enum and field to LoadBalancer struct - Proto→Rust conversion: Map connect_strategy in TryFrom<&XdsService> - Add resolve_all_on_demand_dns method for returning all DNS IPs - Thread race_candidates through Upstream and Request structs - Implement proxy_to_race using FuturesUnordered for connection racing - Dispatch to proxy_to_race when race_candidates is non-empty - Add rst_close helper for RST-closing on total failure - Fix all test struct literals for new fields Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

…placeholder Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

1. Explicitly drop FuturesUnordered after finding winner to cancel remaining in-progress connection attempts. 2. Replace separate actual_destination + race_candidates fields with UpstreamTarget enum (Single/Race) for clearer type-level modeling. 3. Guard against racing for in-mesh (HBONE) destinations - only race for TCP (non-mesh) destinations. Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

… fix error log level Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Instead of spawning all connection attempts simultaneously, use a timer-based approach inspired by Happy Eyeballs (RFC 8305): 1. Start with the first candidate 2. If an attempt fails immediately (e.g. RST), start the next candidate right away without waiting 3. If neither success nor failure occurs within 300ms, start the next candidate in parallel while keeping current attempts going 4. First successful TCP handshake wins; remaining attempts are cancelled This reduces connection amplification: with 10 DNS entries and a 300ms delay, typically only 2 connections are made (assuming ~500ms handshake time) instead of 10. Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Remove the representative field from UpstreamTarget::Race and instead pass the original destination address (orig_dst) to proxy_to_race. ConnectionResultBuilder is now constructed after racing determines the actual winner via peer_addr(), ensuring metrics report the real destination rather than a pre-selected representative. Move connection tracking (track_outbound) into proxy_to_race so it also uses the actual winning address.

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

…1758)

* prioritize canonical services on inbound * delete MatchReason * fix itertools thing * continue not return * drop continue for into_iter * as deref * gencheck

* dry-run no allow policies match Signed-off-by: Ian Rudie <ian.rudie@solo.io> * authpol logging macro + env var to allow info log level if desired Signed-off-by: Ian Rudie <ian.rudie@solo.io> * move macro to better location Signed-off-by: Ian Rudie <ian.rudie@solo.io> * clean up Signed-off-by: Ian Rudie <ian.rudie@solo.io> * update tests Signed-off-by: Ian Rudie <ian.rudie@solo.io> * fmt Signed-off-by: Ian Rudie <ian.rudie@solo.io> --------- Signed-off-by: Ian Rudie <ian.rudie@solo.io>

…1763)

* proxy: handle peer_addr() failure gracefully instead of panicking Replace .expect("must receive peer addr") with proper error handling in outbound and socks5 proxy paths. A client can send RST immediately after the TCP handshake completes, causing getpeername(2) to return ENOTCONN on the already-queued socket. The previous .expect() converted this transient OS-level error into a panic that killed the Tokio task. - outbound: match on peer_addr(), log debug and return early on error - socks5/handle_socks_connection: same pattern - socks5/negotiate_socks_connection: use ? operator (From<io::Error> for SocksError already exists) Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com> * i socket: avoid panic in orig_dst_addr diagnostic log peer_addr() and local_addr() inside the warn!() in orig_dst_addr() were called with .unwrap(), which could panic if the socket became unavailable before the log statement was reached. Replace with .map(...).unwrap_or_else() so the warning logs "N.A." instead of crashing when the address cannot be retrieved. Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com> --------- Signed-off-by: Jose Luis Ojosnegros Manchón <jojosneg@redhat.com>

…1765)

…1770)

Replace the Request struct + UpstreamTarget enum with a Request enum having four variants: Tcp, RaceTcp, Hbone, DoubleHbone. Each variant is a separate struct carrying only the fields relevant to that protocol mode: - TcpRequest: direct TCP to single address - RaceTcpRequest: race TCP connections to multiple candidates - HboneRequest: single HBONE tunnel (hbone_target_destination is non-optional) - DoubleHboneRequest: nested HBONE via E/W gateway (includes final_sans) Key changes: - Remove UpstreamTarget enum (subsumed into Request variants) - Each proxy_to_* handler takes its specific variant type and owns its own connection tracking and telemetry setup - proxy_to() is now a clean match dispatch with no pre-dispatch setup - send_hbone_request and create_hbone_request take individual fields instead of &Request - baggage() takes &Workload directly - conn_metrics() takes shared fields as parameters Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

istio-testing · 2026-03-04T17:51:35Z

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

…d-request-types

Copilot AI and others added 27 commits February 11, 2026 05:20

Initial plan

df8d920

Address review feedback: add comment explaining selected_workload_ip …

d747592

…placeholder Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

Simplify UpstreamTarget: merge tracking_addr/single_addr into addr(),…

a12917a

… fix error log level Co-authored-by: keithmattix <1531662+keithmattix@users.noreply.github.com>

Fix benchmark compilation

30dff93

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Apply suggestion from @keithmattix

5acebe4

Cargo fmt

f744d59

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Rename to DnsConnectStrategy

d97653d

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

More rename

d5c0a49

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Cargo fmt

71d05c7

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Fix connection reporting

f3c4d08

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Add test to ensure connection reporting works

ce128df

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

Pin the proxy_to_race param on the heap

ba49bf8

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

cargo fmt

b27e47e

Signed-off-by: Keith Mattix II <keithmattix@microsoft.com>

example localhost

951f749

Automator: update common-files@master in istio/ztunnel@master (istio#…

68086bb

…1758)

prioritize canonical services on inbound (istio#1746)

f04a15a

* prioritize canonical services on inbound * delete MatchReason * fix itertools thing * continue not return * drop continue for into_iter * as deref * gencheck

Automator: update common-files@master in istio/ztunnel@master (istio#…

6e207a3

…1763)

Automator: update common-files@master in istio/ztunnel@master (istio#…

efa1db1

…1765)

Automator: update common-files@master in istio/ztunnel@master (istio#…

fad0e39

…1770)

final diff

74b64bc

istio-testing added do-not-merge/work-in-progress Block merging of a PR because it isn't ready yet. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 4, 2026

Merge remote-tracking branch 'up/master' into refactor/better-outboun…

391551b

…d-request-types

istio-testing added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. labels Mar 4, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor/better outbound request types#1772

Refactor/better outbound request types#1772
Stevenjin8 wants to merge 28 commits intoistio:masterfrom
Stevenjin8:refactor/better-outbound-request-types

Stevenjin8 commented Mar 4, 2026

Uh oh!

istio-testing commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants

Conversation

Stevenjin8 commented Mar 4, 2026

Uh oh!

istio-testing commented Mar 4, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

7 participants