Skip to content

Add Calico Ingress Gateway to calico system networkpolicy#4581

Draft
alexh-tigera wants to merge 4 commits intotigera:masterfrom
alexh-tigera:ev-6476-add-gateway-to-calico-system-networkpolicy
Draft

Add Calico Ingress Gateway to calico system networkpolicy#4581
alexh-tigera wants to merge 4 commits intotigera:masterfrom
alexh-tigera:ev-6476-add-gateway-to-calico-system-networkpolicy

Conversation

@alexh-tigera
Copy link
Copy Markdown
Member

Description

Fixed an issue where the Calico Ingress Gateway failed to deploy when the recommended default deny policy was applied.

This adds:

  • DNS and api-server egress rules for the certgen job in order to apply the generated certificates.
  • DNS and api-server egress rules for envoy-gateway to manage envoy proxies.

Release Note

* Fixed an issue where Calico Ingress Gateway failed to deploy when default-deny rules were applied.

For PR author

  • Tests for change.
  • ~~ If changing pkg/apis/, run make gen-files~~
  • ~~ If changing versions, run make gen-versions~~

For PR reviewers

A note for code reviewers - all pull requests must have the following:

  • Milestone set according to targeted release.
  • Appropriate labels:
    • kind/bug if this is a bugfix.
    • kind/enhancement if this is a a new feature.
    • enterprise if this PR applies to Calico Enterprise only.

Copy link
Copy Markdown
Member

@nelljerram nelljerram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good, except

  • I don't think operator should unconditionally install default-deny policy for tigera-gateway namespace
  • I do think we should have an FV test that does that, and then checks CIG can be successfully deployed.

Comment thread pkg/render/gatewayapi/gateway_api.go Outdated
objs = append(objs,
pr.gatewayCertgenAllowTigeraPolicy(),
pr.gatewayControllerAllowTigeraPolicy(),
networkpolicy.CalicoSystemDefaultDeny(common.TigeraGatewayNamespace),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the operator should unconditionally install this default deny, right? Were you only doing this for testing? Instead I think we should have an FV test that applies the default-deny policy and then verifies that CIG can still be successfully deployed.

Copy link
Copy Markdown
Member Author

@alexh-tigera alexh-tigera Apr 14, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I missed that after removing the other changes I made. While doing that I also realized that it needs a watch for the network policies, so I've fix the default deny and added the watches in 48a0ab8.

Where do FV tests like that live? I see some similar things in felix/fv/ ? Nevermind, I thought that the test/ dir was unit tests in this repo.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's the one, test/*.go in this repo. Also see the fv target in the Makefile.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do LMK when the PR is ready for another review.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, it took me a while to understand how to implement the FV, but I'm happy with it now. Please have a look.

@alexh-tigera alexh-tigera force-pushed the ev-6476-add-gateway-to-calico-system-networkpolicy branch 9 times, most recently from 2f7336c to 35a7a8f Compare April 15, 2026 21:26
alexh-tigera and others added 4 commits April 16, 2026 11:10
The certgen job in the tigera-gateway namespace needs DNS and
kube-apiserver egress to create TLS secrets. Under a global
default-deny policy, this traffic is blocked causing the job to
fail with i/o timeout reaching the API server.

Add a NetworkPolicy in the calico-system tier that allows the
certgen pods (app == 'certgen') DNS and kube-apiserver egress,
plus a default-deny policy for the tigera-gateway namespace,
matching the pattern used by other components.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The envoy-gateway controller needs DNS and kube-apiserver egress
to function. Under a global default-deny policy, the controller
fails with i/o timeout reaching the API server.

Add a NetworkPolicy in the calico-system tier that allows the
controller pods (k8s-app == 'gateway-api-controller') DNS egress,
kube-apiserver egress, and ingress on its serving ports (xDS gRPC,
ratelimit, wasm, metrics, webhook).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Add WaitToAddTierWatch and WaitToAddNetworkPolicyWatches for the
gateway controller's allow policies, matching the pattern used by
other controllers. Add tier existence check during reconciliation.

Remove the namespace-scoped default deny from tigera-gateway because
Gateway resources dynamically create Envoy proxy pods whose traffic
patterns are determined by the customer's Gateway and HTTPRoute
configuration. Keep the controller-access and certgen-access allow
policies so control plane pods still function under a customer's
global default deny.

Add FV test verifying the allow policies are created.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Covers the scenario the calico-system tier allow policies were added
to support: a default-tier v3 NetworkPolicy with selector all() is
applied to tigera-gateway, then a GatewayAPI CR is created. The test
asserts the envoy-gateway-certgen Job reaches Succeeded and the
envoy-gateway Deployment becomes Ready despite the default-deny,
which is only possible if gateway-api-certgen-access and
gateway-api-controller-access in the higher-precedence calico-system
tier correctly permit the traffic those workloads need.

Also exercises the v3.NetworkPolicy watch by deleting each allow
policy and asserting it's re-created, guarding against regressions
in WaitToAddNetworkPolicyWatches.

Adds calico/envoy-gateway to the FV kind cluster image load list so
the gateway Deployment's image is available.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@alexh-tigera alexh-tigera force-pushed the ev-6476-add-gateway-to-calico-system-networkpolicy branch from 5dbd768 to cba4016 Compare April 16, 2026 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants