🐛 OCPBUGS-78455: fix(Boxcutter-Collision): Fix collision detection bypass after ClusterExtension upgrade#2578
Conversation
✅ Deploy Preview for olmv1 ready!
To edit notification comments on pull requests, go to your Netlify project configuration. |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Pull request overview
Fixes a Boxcutter collision-detection bypass that could allow a second ClusterExtension to install the same package after the first ClusterExtension upgraded and bumped managed-object revision numbers.
Changes:
- Treat
ActionProgressedresults as collisions when the reconciled object is controller-owned by a differentClusterExtensionRevision(foreign ownerRef). - Add an E2E scenario covering “upgrade then duplicate install” and a new step to ensure both
ClusterExtensionresources are cleaned up. - Add unit tests validating the new “foreign revision ownerRef” collision behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
internal/operator-controller/controllers/clusterextensionrevision_controller.go |
Adds foreign-ownerRef collision detection for ActionProgressed objects and tracks sibling CER names during reconcile. |
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go |
Adds unit coverage for foreign/sibling/non-CER controller ownerRef cases. |
test/e2e/steps/steps.go |
Adds a step to track the currently-applied ClusterExtension for cleanup when a scenario applies a second CE. |
test/e2e/features/update.feature |
Adds an E2E scenario asserting collisions are detected after an upgrade when installing a duplicate CE. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
a73cf6d to
8b504c8
Compare
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2578 +/- ##
==========================================
+ Coverage 65.84% 67.89% +2.04%
==========================================
Files 137 137
Lines 9560 9586 +26
==========================================
+ Hits 6295 6508 +213
+ Misses 2795 2581 -214
- Partials 470 497 +27
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
| // nil when the controller is a sibling revision (same ClusterExtension) or not a CER at all. | ||
| func getForeignRevisionController(obj metav1.Object, siblingRevisionNames sets.Set[string]) *metav1.OwnerReference { | ||
| for i := range obj.GetOwnerReferences() { | ||
| ref := obj.GetOwnerReferences()[i] |
There was a problem hiding this comment.
why are we avoiding setting this to a local and iterating over it? If we're concerned that the contents can change over the course of this call then couldn't we end up in a situation where the first call determines a bounds which exceeds the returned results later in the loop?
Like if L551 says i := 12, but then obj.GetOwnerReferences() is updated to hold fewer than 12 items.
Why not
refs := obj.GetOwnerReferences()
for i := range refs {
ref := refs[i]
There was a problem hiding this comment.
I think we can. Goog Catcher 🥇
8b504c8 to
36f661c
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go
Outdated
Show resolved
Hide resolved
36f661c to
2b87748
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller.go
Outdated
Show resolved
Hide resolved
internal/operator-controller/controllers/clusterextensionrevision_controller_test.go
Outdated
Show resolved
Hide resolved
6ad05bf to
b7f071a
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| if ores.Action() == machinery.ActionProgressed { | ||
| if controllerRef := getForeignRevisionController(ores.Object(), siblingRevisionNames); controllerRef != nil { | ||
| collidingObjs = append(collidingObjs, ores.String()+fmt.Sprintf("\nConflicting Owner: %s", controllerRef.String())) | ||
| } | ||
| } |
There was a problem hiding this comment.
The new foreign-owner collision check depends on siblingRevisionNames being complete. If labels.OwnerNameKey is missing on the reconciling CER, buildBoxcutterPhases only seeds the set with cer.Name, so an ActionProgressed object still controlled by another revision of the same ClusterExtension will be misclassified as a foreign collision. Consider skipping the foreign-owner collision check (or treating all CER controllers as "sibling") when the owner label is missing, to preserve the current “missing label => no sibling lookup” behavior.
| func (c *ClusterExtensionRevisionReconciler) buildBoxcutterPhases(ctx context.Context, cer *ocv1.ClusterExtensionRevision) ([]boxcutter.Phase, []boxcutter.RevisionReconcileOption, sets.Set[string], error) { | ||
| siblingRevisionNames := sets.New[string](cer.Name) | ||
| var previousObjs []client.Object | ||
|
|
||
| if ownerLabel, ok := cer.Labels[labels.OwnerNameKey]; ok { | ||
| revList := &ocv1.ClusterExtensionRevisionList{} | ||
| if err := c.TrackingCache.List(ctx, revList, client.MatchingLabels{ | ||
| labels.OwnerNameKey: ownerLabel, |
There was a problem hiding this comment.
When labels.OwnerNameKey is missing on the CER, this function returns siblingRevisionNames containing only cer.Name and an empty previousObjs. With the new ActionProgressed foreign-controller detection, that incomplete sibling set can cause false-positive collisions (any CER controller ref other than cer.Name will look “foreign”). Consider returning an explicit hasOwnerLabel flag (or otherwise ensuring the sibling set is complete) so callers can safely decide whether to enforce the foreign-owner collision check.
b7f071a to
4bc4776
Compare
…rExtension upgrade When a ClusterExtension was upgraded (e.g., v1.0.0 to v1.0.1), the upgrade set a higher revision number on managed objects. A second ClusterExtension installing the same package was then allowed because boxcutter's revision linearity check returned ActionProgressed instead of ActionCollision, silently skipping the conflict. Fix: after boxcutter reconcile, check ActionProgressed results for objects whose controller ownerReference belongs to a revision from a different ClusterExtension, and treat those as collisions. Generated-by: Claude/Cursor
4bc4776 to
b3688d3
Compare
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| func (c *ClusterExtensionRevisionReconciler) buildBoxcutterPhases(ctx context.Context, cer *ocv1.ClusterExtensionRevision) ([]boxcutter.Phase, []boxcutter.RevisionReconcileOption, sets.Set[string], error) { | ||
| var siblingRevisionNames sets.Set[string] | ||
| previousObjs := make([]client.Object, 0) | ||
|
|
||
| if ownerLabel, ok := cer.Labels[labels.OwnerNameKey]; ok { | ||
| revList := &ocv1.ClusterExtensionRevisionList{} | ||
| if err := c.TrackingCache.List(ctx, revList, client.MatchingLabels{ | ||
| labels.OwnerNameKey: ownerLabel, | ||
| }); err != nil { | ||
| return nil, nil, nil, fmt.Errorf("listing revisions: %w", err) | ||
| } | ||
|
|
||
| // Convert to []client.Object for boxcutter | ||
| previousObjs := make([]client.Object, len(previous)) | ||
| for i, rev := range previous { | ||
| previousObjs[i] = rev | ||
| siblingRevisionNames = sets.New[string](cer.Name) | ||
| for i := range revList.Items { | ||
| r := &revList.Items[i] | ||
| siblingRevisionNames.Insert(r.Name) | ||
| if r.Name == cer.Name { | ||
| continue | ||
| } | ||
| if r.Spec.LifecycleState == ocv1.ClusterExtensionRevisionLifecycleStateArchived || | ||
| !r.DeletionTimestamp.IsZero() { | ||
| continue | ||
| } | ||
| if r.Spec.Revision >= cer.Spec.Revision { | ||
| continue | ||
| } | ||
| previousObjs = append(previousObjs, r) | ||
| } |
There was a problem hiding this comment.
buildBoxcutterPhases re-implements the same filtering/listing logic as listPreviousRevisions (owner label lookup, TrackingCache.List, and the archived/deleting/revision-number filters). This duplication makes it easy for the two paths (previous owners used for boxcutter vs. previous revisions archived on completion) to drift over time. Consider refactoring so buildBoxcutterPhases reuses listPreviousRevisions for previousObjs and only does the additional sibling-name collection separately (or extend listPreviousRevisions to optionally return both).
When a ClusterExtension was upgraded (e.g., v1.0.0 to v1.0.1), the upgrade set a higher revision number on managed objects. A second
ClusterExtension installing the same package was then allowed because boxcutter's revision linearity check returned ActionProgressed instead of ActionCollision, silently skipping the conflict.
Fix: after boxcutter reconcile, check ActionProgressed results for objects whose controller ownerReference belongs to a revision from a different ClusterExtension, and treat those as collisions.
Motivated by: https://redhat.atlassian.net/browse/OCPBUGS-78455