Provide Spire Workload Attestation Support to ztunnel #1676
MikeZappa87 wants to merge 14 commits into istio:master
Hey @MikeZappa87, we (RH) are interested in this feature. Can I help push it forward somehow?
@Dimss feel free to message me and Arndt on the Istio Slack to discuss. We went to the Istio community sync before the holidays and came away with a couple of action items. I removed the selector approach to reduce friction with this PR: selector mode is a SPIRE-specific implementation and does not exist in the SPIFFE specification. Right now, Arndt is working on the SPIFFE broker API spec, which I believe is what the Istio community wants, since the current implementation is SPIRE-specific and a SPIFFE API does not exist yet. Slack thread: https://istio.slack.com/archives/C049TCZMPCP/p1765304313250799
@MikeZappa87 hey Mike! This is going to be super helpful... Anything I can do to help move this along? I've got a few commits on SPIRE, do you need help pushing anything on that side forward?
The Istio community doesn't like the SPIRE-specific Delegated Identity API and wants the SPIFFE broker endpoint API instead. We are working with the SPIFFE community to get that moving. Reach out to me on the Istio Slack and I can add you to the chat.
Document that this fork adds SPIRE as an alternative Certificate Authority for workload identity, and explain the relationship with Cilium as a temporary sub-project until upstream merge.

Co-authored-by: Bill Mulligan <billmulligan516@gmail.com>
Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
Change `readiness_addr` from binding to `0.0.0.0` (all interfaces) to `127.0.0.1` (localhost only).

Since ztunnel runs with `hostNetwork: true`, binding to all interfaces unnecessarily exposes the readiness endpoint to the network. The kubelet runs on the same node and can reach localhost for health probes. This reduces attack surface for hostNetwork pods.

Update `malicious_calls_inpod` test expectations for port 15021: captured clients now get a connection reset (Request) and uncaptured clients get connection refused (Connection) since readiness no longer listens on the node IP.

Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
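A minimal sketch of the bind-address change described in this commit message, assuming the readiness address is a plain `std::net::SocketAddr`. The helper names `readiness_addr_for` and `legacy_readiness_addr_for` are hypothetical and only illustrate the before/after values; port 15021 is taken from the test description above.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Hypothetical helper: builds the readiness address on loopback only,
/// so the endpoint is reachable from the node itself but not from the network.
pub fn readiness_addr_for(port: u16) -> SocketAddr {
    SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), port)
}

/// Hypothetical helper showing the previous behavior for comparison:
/// binding on all interfaces (0.0.0.0).
pub fn legacy_readiness_addr_for(port: u16) -> SocketAddr {
    SocketAddr::new(IpAddr::V4(Ipv4Addr::UNSPECIFIED), port)
}
```

With `hostNetwork: true`, the second form would listen on the node IP as well, which is the exposure the commit removes.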
Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
Add CI and release workflows for building multi-arch container images.

CI workflow (build-images-ci.yaml):
- Triggers on push to master and PRs
- Builds binaries for amd64/arm64 with both aws-lc and boring TLS modes
- Pushes multi-arch images to dev registry with SHA tags
- FIPS variant uses tag suffix (-fips) on the same repo

Release workflow (build-images-releases.yaml):
- Triggers on version tags (v*)
- Same binary matrix as CI
- Pushes to release registry with version tags
- Creates GitHub Release with auto-generated changelog
- Attaches binaries and container image tarballs as release assets

Supporting files:
- Dockerfile using distroless base with TARGETARCH for multi-arch
- Makefile.docker with local build and multi-arch push targets
- build-release target added to Makefile.core.mk

Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
Filter artifact download to ztunnel-* pattern to avoid downloading unrelated dockerbuild metadata artifacts.

Add release environment to create-release job so Quay.io secrets are available for skopeo login.

Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
Enable xds_address to use Unix domain sockets via the unix:// URI scheme. This allows ztunnel to connect to local control plane components through Unix sockets instead of only TCP/TLS.

Changes:
- Add unix_socket_path() helper for string-based unix: URI detection (hyper's Uri parser does not support the unix scheme)
- Add UdsConnector and UdsGrpcChannel for Unix socket gRPC connections
- Add GrpcChannel enum to support both TLS and Unix socket channels
- Update grpc_channel() factory to route unix: URIs to UDS path
- Update XDS client to detect Unix scheme and skip TLS config
- Validate unix: URIs in config to reject empty socket paths
- Revert CA client to TLS-only (Unix socket scoped to XDS)
- Add unit tests for URI validation, channel factory, and connector

Unix sockets don't require TLS as they are local trusted paths. Auth headers are still injected for control plane authentication.

Usage:
    export XDS_ADDRESS="unix:///var/run/xds.sock"

Signed-off-by: Quang Nguyen <nguyenquang@microsoft.com>
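The string-based detection and the empty-path validation mentioned in this commit message could look roughly like the sketch below. Only the name `unix_socket_path` comes from the commit message; its signature, the `XdsTarget` enum, and `classify_xds_address` are assumptions made for illustration and are not taken from the PR's code.

```rust
/// Hypothetical classification of an XDS address string.
#[derive(Debug, PartialEq, Eq)]
pub enum XdsTarget<'a> {
    /// Filesystem path of a Unix domain socket (no TLS needed).
    Unix(&'a str),
    /// Anything else is left to the regular TCP/TLS URI handling.
    Other(&'a str),
}

/// Returns the socket path if `addr` uses the `unix:` scheme.
/// Works on the raw string rather than a parsed URI type.
/// The signature is an assumption; only the name appears in the commit message.
pub fn unix_socket_path(addr: &str) -> Option<&str> {
    let rest = addr.strip_prefix("unix:")?;
    // `unix:///var/run/xds.sock` carries an empty authority; drop the `//`.
    Some(rest.strip_prefix("//").unwrap_or(rest))
}

/// Hypothetical config-time validation: rejects `unix:` URIs with no path.
pub fn classify_xds_address(addr: &str) -> Result<XdsTarget<'_>, String> {
    match unix_socket_path(addr) {
        Some("") => Err(format!("unix socket path is empty in {addr:?}")),
        Some(path) => Ok(XdsTarget::Unix(path)),
        None => Ok(XdsTarget::Other(addr)),
    }
}
```

Under these assumptions, `unix:///var/run/xds.sock` resolves to the path `/var/run/xds.sock`, while a bare `unix://` is rejected at configuration time instead of failing later at connect time.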
SPIRE Delegated Identity API Integration for ztunnel
Overview
This document describes the design and implementation of SPIRE integration in ztunnel using the Delegated Identity API. The implementation supports one attestation mode: PID-based.
Background
Current ztunnel Certificate Management
The existing ztunnel certificate management uses `SecretManager` to cache certificates by `Identity` (SPIFFE ID). When multiple pods share the same service account, they share a single cached certificate, reducing CA calls and memory usage.
SPIRE Delegated Identity API
SPIRE's Delegated Identity API allows a trusted delegate (ztunnel) to request certificates on behalf of workloads. The API supports two attestation methods: selector-based and PID-based. This implementation uses only the PID-based method.
Design
Attestation Modes
In PID mode, each workload is attested individually using its container process ID. This approach:
CompositeId Design
Motivation
The original `SecretManager` used `Identity` as the cache key. To support PID-based attestation while maintaining backward compatibility with the existing `CaClientTrait` interface, we introduced `CompositeId<RequestKeyEnum>`.
Structure
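As a rough illustration of the idea, the sketch below pairs an identity with a request key that is either the identity itself or a per-workload UID. The names `CompositeId`, `RequestKeyEnum`, `Identity`, and `WorkloadUid` appear in this document, but every field, variant, constructor, and the `cache_entries` helper here is an assumption and may differ from the PR's actual definitions.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a SPIFFE identity.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Identity(pub String);

/// Simplified stand-in for a per-pod workload UID.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct WorkloadUid(pub String);

/// Assumed shape: the request is keyed by identity or by workload.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub enum RequestKeyEnum {
    Identity(Identity),
    Workload(WorkloadUid),
}

/// Assumed shape: the identity plus the key used for caching and requests.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct CompositeId<K> {
    pub identity: Identity,
    pub key: K,
}

impl CompositeId<RequestKeyEnum> {
    /// Identity-keyed id: pods with the same identity share one entry.
    pub fn shared(identity: Identity) -> Self {
        let key = RequestKeyEnum::Identity(identity.clone());
        Self { identity, key }
    }

    /// Workload-keyed id: each pod gets its own entry.
    pub fn per_workload(identity: Identity, uid: WorkloadUid) -> Self {
        Self { identity, key: RequestKeyEnum::Workload(uid) }
    }
}

/// Counts the distinct cache entries a set of ids would occupy.
pub fn cache_entries(ids: &[CompositeId<RequestKeyEnum>]) -> usize {
    ids.iter().collect::<HashSet<_>>().len()
}
```

Because equality and hashing cover both fields, two pods with the same identity collapse to one entry when keyed by identity but occupy two entries when keyed by workload UID.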
Trade-offs
This design was chosen to maintain backward compatibility with `CaClientTrait`:
Benefits:
- `SecretManager` can track per-workload state when needed
Consequences:
- `SecretManager` caches by `CompositeId`, resulting in one cache entry per pod even if they share the same identity
PID Verification Flow
In PID mode, ztunnel performs the following steps:
`WorkloadUid`
Comparison Summary
`CompositeId` with `Identity` key
`CompositeId` with `WorkloadUid` key
Configuration
Future Considerations
- Certificate Caching with Per-Pod Attestation: In PID mode, we should cache and reuse certificates by `Identity` while still attesting every pod individually. This would reduce SPIRE server load and memory usage: multiple pods with the same identity would share one certificate after each pod passes local PID verification. The first pod triggers a SPIRE call; subsequent pods with the same identity only require local PID verification before reusing the cached certificate.
- Collaborate with SPIRE/SPIFFE Community: Work with the SPIRE and SPIFFE community to improve the Delegated Identity API and related interfaces to better support delegated attestation use cases like ztunnel's.
- Consider a different trait for attested workloads instead of modifying `fetch_certificate`.
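The first consideration, caching by identity while still attesting every pod, might be sketched as follows. This is not code from the PR: `IdentityCache`, `Pod`, `Certificate`, and the two closures standing in for local PID verification and the SPIRE call are all hypothetical, and certificate expiry and renewal are ignored.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a SPIFFE identity.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
pub struct Identity(pub String);

/// Placeholder for an issued certificate.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Certificate(pub String);

/// Hypothetical view of a pod as seen by the delegate.
pub struct Pod {
    pub uid: String,
    pub pid: u32,
    pub identity: Identity,
}

/// Hypothetical cache keyed by identity rather than by pod.
#[derive(Default)]
pub struct IdentityCache {
    certs: HashMap<Identity, Certificate>,
}

impl IdentityCache {
    /// Every pod must pass local PID verification; only the first pod
    /// for a given identity triggers a call to SPIRE.
    pub fn fetch<V, F>(
        &mut self,
        pod: &Pod,
        verify_pid: V,
        fetch_from_spire: F,
    ) -> Result<Certificate, String>
    where
        V: Fn(&Pod) -> bool,
        F: FnOnce(&Pod) -> Result<Certificate, String>,
    {
        if !verify_pid(pod) {
            return Err(format!("PID verification failed for pod {}", pod.uid));
        }
        if let Some(cert) = self.certs.get(&pod.identity) {
            // Cache hit: reuse the certificate without contacting SPIRE.
            return Ok(cert.clone());
        }
        let cert = fetch_from_spire(pod)?;
        self.certs.insert(pod.identity.clone(), cert.clone());
        Ok(cert)
    }
}
```

The verification step runs before the cache lookup, so a pod that fails attestation never receives a certificate even when one is already cached for its identity.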