[POC] test: e2e test for UIPlugin cleanup on operator uninstall (COO-1404)#1083
[POC] test: e2e test for UIPlugin cleanup on operator uninstall (COO-1404)#1083DavidRajnoha wants to merge 1 commit intorhobs:mainfrom
Conversation
Adds TestUIPluginUninstallCleanup to reproduce the bug where UIPlugin operands (Deployments, Services, ServiceAccounts, ClusterRoles, ClusterRoleBindings, pods) are orphaned when COO is uninstalled via OLM without manually deleting UIPlugin CRs first. The test creates a monitoring UIPlugin with health-analyzer enabled, simulates OLM uninstall by deleting the CSV and Subscription, then asserts all child resources are cleaned up. All 9 assertions currently fail, confirming the bug. Does not use Framework.AssertResourceAbsent due to a silent-pass bug (rhobs#1082); uses a custom waitForResourceAbsent with correct polling semantics. Includes self-healing cleanup that reinstalls the Subscription after the test, and a -postpone-restoration flag for manual cluster inspection. Co-authored-by: Cursor <cursoragent@cursor.com>
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: DavidRajnoha The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
|
Important Review skippedAuto reviews are limited based on label configuration. 🚫 Review skipped — only excluded labels are configured. (1)
Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Repository YAML (base), Organization UI (inherited) Review profile: CHILL Plan: Enterprise Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Skipping CI for Draft Pull Request. |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
test/e2e/uiplugin_uninstall_test.go (1)
189-190: 💤 Low value
"--- parallel resource checks done ---"fires before any parallel subtest runsIn Go's testing model, subtests that call
t.Parallel()pause immediately and are released only when the parent function returns. All ninet.Runcalls at lines 144–187 return instantly (the subtests are queued, not executing). Line 190 therefore logsdonebefore any parallel work begins. Consider removing or relabelling it (e.g.,"--- parallel resource checks released ---").🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@test/e2e/uiplugin_uninstall_test.go` around lines 189 - 190, The log message "t.Log(\"--- parallel resource checks done ---\")" fires before any t.Parallel() subtests actually start because parent returns immediately after queuing subtests; update the message or remove it: locate the t.Log call in the test containing the sequence of t.Run(...) subtests that invoke t.Parallel() and either delete that log or change it to a clearer phrase such as "--- parallel resource checks released ---" (or move the log to after waiting for parallel work to complete if you add explicit synchronization).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@test/e2e/uiplugin_uninstall_test.go`:
- Around line 144-187: The tests rely on hardcoded resource names and call
waitForResourceAbsent which treats NotFound as success, so a wrong name silently
passes; modify the test flow to verify the resource actually exists before
waiting for its deletion: add a pre-existence Get (using the same client/kube
context used by waitForResourceAbsent) for each named resource (e.g.,
"monitoring-sa", "components-health-view", "monitoring-components-health-view")
and fail the subtest immediately if Get returns NotFound or another unexpected
error, then call waitForResourceAbsent to poll for eventual deletion; update
waitForResourceAbsent (or its callers in uiplugin_uninstall_test.go) to perform
this two-step check so wrong names no longer produce instant, false-positive
success.
---
Nitpick comments:
In `@test/e2e/uiplugin_uninstall_test.go`:
- Around line 189-190: The log message "t.Log(\"--- parallel resource checks
done ---\")" fires before any t.Parallel() subtests actually start because
parent returns immediately after queuing subtests; update the message or remove
it: locate the t.Log call in the test containing the sequence of t.Run(...)
subtests that invoke t.Parallel() and either delete that log or change it to a
clearer phrase such as "--- parallel resource checks released ---" (or move the
log to after waiting for parallel work to complete if you add explicit
synchronization).
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Organization UI (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: ddaac954-c82b-4a9b-bcff-d26c643d11f9
📒 Files selected for processing (2)
test/e2e/main_test.gotest/e2e/uiplugin_uninstall_test.go
| t.Run("UIPlugin CR is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "monitoring", "", &uiv1.UIPlugin{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("monitoring plugin deployment is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "monitoring", ns, &appsv1.Deployment{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("health-analyzer deployment is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "health-analyzer", ns, &appsv1.Deployment{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("health-analyzer service is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "health-analyzer", ns, &corev1.Service{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("monitoring plugin service is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "monitoring", ns, &corev1.Service{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("monitoring plugin service account is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "monitoring-sa", ns, &corev1.ServiceAccount{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("components-health-view ClusterRole is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "components-health-view", "", &rbacv1.ClusterRole{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("components-health-view ClusterRoleBinding is deleted", func(t *testing.T) { | ||
| t.Parallel() | ||
| waitForResourceAbsent(t, "monitoring-components-health-view", "", &rbacv1.ClusterRoleBinding{}, cleanupTimeout) | ||
| }) | ||
|
|
||
| t.Run("no UIPlugin-managed pods remain in operator namespace", func(t *testing.T) { | ||
| t.Parallel() | ||
| assertNoManagedPodsRemain(t, ctx, ns) | ||
| }) |
There was a problem hiding this comment.
Hardcoded resource names create silent false-pass risk
waitForResourceAbsent polls Get and returns success immediately when apierrors.IsNotFound — which is indistinguishable from a wrong name and an actually-absent resource. If any of the hardcoded names ("monitoring-sa", "components-health-view", "monitoring-components-health-view") don't match what the operator actually creates, the subtests pass instantly regardless of whether orphaned resources remain, defeating the test's purpose.
A lightweight fix is to verify each resource exists before entering the absence poll:
🛡️ Proposed pre-existence check in waitForResourceAbsent
func waitForResourceAbsent(t *testing.T, name, namespace string, obj client.Object, timeout time.Duration) {
t.Helper()
key := client.ObjectKey{Name: name, Namespace: namespace}
+ // Confirm the resource actually exists before waiting for it to disappear,
+ // so a wrong name doesn't silently pass the assertion.
+ if err := f.K8sClient.Get(context.Background(), key, obj); apierrors.IsNotFound(err) {
+ t.Logf("warning: %T %s/%s not found before waiting for absence — verify the resource name is correct", obj, namespace, name)
+ }
err := wait.PollUntilContextTimeout(context.Background(), 5*time.Second, timeout, true, func(ctx context.Context) (bool, error) {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@test/e2e/uiplugin_uninstall_test.go` around lines 144 - 187, The tests rely
on hardcoded resource names and call waitForResourceAbsent which treats NotFound
as success, so a wrong name silently passes; modify the test flow to verify the
resource actually exists before waiting for its deletion: add a pre-existence
Get (using the same client/kube context used by waitForResourceAbsent) for each
named resource (e.g., "monitoring-sa", "components-health-view",
"monitoring-components-health-view") and fail the subtest immediately if Get
returns NotFound or another unexpected error, then call waitForResourceAbsent to
poll for eventual deletion; update waitForResourceAbsent (or its callers in
uiplugin_uninstall_test.go) to perform this two-step check so wrong names no
longer produce instant, false-positive success.
Summary
TestUIPluginUninstallCleanupe2e test that reproduces the bug where UIPlugin operands are orphaned when COO is uninstalled via OLM without manually deleting UIPlugin CRs firstwaitForResourceAbsentinstead ofFramework.AssertResourceAbsentdue to a silent-pass bug in the framework (e2e framework: AssertResourceAbsent silently passes when resource still exists #1082)t.Cleanupthat reinstalls the Subscription after the testRun modes
Example output (all subtests fail as expected)
Related
Test plan
-retainflag skips cleanup for manual inspection-postpone-restoration=10mdelays cleanupgo vetpassesMade with Cursor