Description
After cluster-api-provider-kubevirt began exposing all VMI interface IPs for dual-stack support (kubernetes-sigs/cluster-api-provider-kubevirt#366, synced via openshift/cluster-api-provider-kubevirt#347), all KubeVirt-based HyperShift NodePools report a false-positive ClusterNetworkCIDRConflict condition.
Root Cause
The CAPK Addresses() method now collects IPs from all vmiInstance.Status.Interfaces, including OVN-Kubernetes internal interfaces (e.g. ovn-k8s-mp0). These management port IPs are, by design, within the hosted cluster's own clusterNetwork CIDR.
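For illustration, a minimal sketch of the post-#366 behavior. Names are approximations, not the verbatim CAPK source: collectAddresses is a hypothetical stand-in for the provider's Addresses() method, and the MachineAddress types are assumed from cluster-api conventions.

```go
package main

import (
	kubevirtv1 "kubevirt.io/api/core/v1"
	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// collectAddresses is a hypothetical stand-in for CAPK's Addresses():
// since PR #366, every IP on every VMI interface is reported, so the
// ovn-k8s-mp0 management port IP surfaces alongside the machine's real
// infrastructure address.
func collectAddresses(vmi *kubevirtv1.VirtualMachineInstance) []clusterv1.MachineAddress {
	var addrs []clusterv1.MachineAddress
	for _, iface := range vmi.Status.Interfaces {
		for _, ip := range iface.IPs {
			addrs = append(addrs, clusterv1.MachineAddress{
				Type:    clusterv1.MachineInternalIP,
				Address: ip,
			})
		}
	}
	return addrs
}
```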
The setCIDRConflictCondition function in the NodePool controller (introduced in PR #3880) iterates over every MachineInternalIP and MachineExternalIP address and flags any that falls within the cluster network (see the sketch after this list). It does not distinguish between:
- A machine whose infrastructure IP collides with the pod CIDR (a real problem)
- A machine that has a CNI-internal IP within the pod CIDR (expected, not a problem)
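A minimal sketch of the current check, paraphrasing the logic from PR #3880 (conflictsToday and its signature are illustrative, not the actual HyperShift source):

```go
package main

import (
	"net"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// conflictsToday mirrors the current behavior: a single address inside the
// cluster network is enough to set ClusterNetworkCIDRConflict, even when
// that address is an expected CNI-internal IP.
func conflictsToday(machine *clusterv1.Machine, clusterNet *net.IPNet) bool {
	for _, addr := range machine.Status.Addresses {
		if addr.Type != clusterv1.MachineInternalIP && addr.Type != clusterv1.MachineExternalIP {
			continue
		}
		if ip := net.ParseIP(addr.Address); ip != nil && clusterNet.Contains(ip) {
			return true // flags the OVN management port IP too
		}
	}
	return false
}
```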
Impact
- Affects all KubeVirt NodePools when using MCE 2.11.0+ (which includes the CAPK dual-stack change)
- Does NOT occur on MCE 2.9.2 (which uses the older CAPK that only reported the primary IP)
- The condition is informational only; no functional impact on cluster operations
How to Reproduce
- Deploy a hub cluster with OCP 4.19 and MCE 2.11.0
- Create a KubeVirt-based hosted cluster using OVN-Kubernetes with a secondary network via Multus/bridge
- Scale a NodePool to 1+ replicas
- Observe the ClusterNetworkCIDRConflict condition on the NodePool
Example
type: ClusterNetworkCIDRConflict
status: "True"
reason: InvalidConfiguration
message: "machine [example-node-pool-abcde-x1y2z] with ip [10.128.0.2]
collides with cluster-network cidr [10.128.0.0/14], too many similar errors..."
Here 10.128.0.2 is the OVN management port IP (expected to be within the pod CIDR 10.128.0.0/14), while the machine's actual infrastructure address (e.g. 192.168.1.10 from DHCP) does not overlap the cluster network at all.
Proposed Fix
Modify setCIDRConflictCondition to only report a collision when ALL of a machine's non-link-local addresses fall within the cluster network. When a machine has addresses both inside and outside the cluster network, the in-network addresses are treated as expected CNI-internal IPs.
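A minimal sketch of that rule, using the same illustrative names as the snippet above (a proposal sketch, not the actual patch):

```go
package main

import (
	"net"

	clusterv1 "sigs.k8s.io/cluster-api/api/v1beta1"
)

// conflictsProposed reports a collision only when every non-link-local
// machine address sits inside the cluster network; a single address outside
// the CIDR means the in-network ones are expected CNI-internal IPs.
func conflictsProposed(machine *clusterv1.Machine, clusterNet *net.IPNet) bool {
	sawAddress := false
	for _, addr := range machine.Status.Addresses {
		if addr.Type != clusterv1.MachineInternalIP && addr.Type != clusterv1.MachineExternalIP {
			continue
		}
		ip := net.ParseIP(addr.Address)
		if ip == nil || ip.IsLinkLocalUnicast() {
			continue // skip unparsable and link-local addresses
		}
		sawAddress = true
		if !clusterNet.Contains(ip) {
			// A real infrastructure IP outside the pod CIDR,
			// e.g. 192.168.1.10 alongside 10.128.0.2.
			return false
		}
	}
	return sawAddress // true only if all counted addresses were in-network
}
```

With the example above, this returns false: 192.168.1.10 lies outside 10.128.0.0/14, so the in-network 10.128.0.2 is treated as a CNI-internal IP. A machine whose only addresses fall inside the pod CIDR would still be flagged, preserving the condition's original purpose.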