OCPBUGS-62629: reflect workload controller changes in the deployment … #2196
tjungblu wants to merge 1 commit into openshift:master from
Conversation
@tjungblu: This pull request references Jira Issue OCPBUGS-62629, which is valid. 3 validation(s) were run on this bug.
Requesting review from QA contact. The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.
/hold
Walkthrough

Reworks deployment degraded-condition computation and diagnostics: exports the container-message builder, refactors container-status helpers, rewrites syncManaged to compute Available/Progressing/Degraded from the applied deployment and matched pods, changes sync wiring for sync-error degraded behavior, updates deletion routing, and expands tests to assert degraded reasons and messages.

Changes: Degraded Condition Handling for Deployments
Sequence Diagram(s)

```mermaid
sequenceDiagram
participant Controller
participant KubeAPI as "Kube API (Deployment)"
participant PodAPI as "Kube API (Pods)"
participant OperatorStatus
Controller->>KubeAPI: Apply deployment (returns deploymentObj)
Controller->>PodAPI: List pods matching deploymentObj.Spec.Selector
PodAPI-->>Controller: Pod list
Controller->>Controller: compute Available / Progressing (hasDeploymentProgressed / timed out)
Controller->>Controller: filter non-deleting pods, detect failing pods, build container messages (ContainerMessagesForPods)
alt progressing timed out
Controller->>OperatorStatus: Set Degraded=True (Reason=ProgressDeadlineExceeded)
else failing/unavailable pods (not mid-rollout)
Controller->>OperatorStatus: Set Degraded=True (Reason=UnavailablePod) with container messages
else no errors
Controller->>OperatorStatus: Set Available/Progressing accordingly (Degraded=False)
end
    Note right of Controller: If sync returns an error and the controller owns Degraded, applySyncErrorDegraded runs to mark Degraded=SyncError
```
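To make the flow above concrete, here is a minimal Go sketch of the condition computation. The `timedOut` and `hasFailing` inputs stand in for the PR's helpers (progress-deadline detection and `hasFailingPods`); this is an illustration of the decision flow, not the PR's exact code.

```go
import appsv1 "k8s.io/api/apps/v1"

// computeConditions mirrors the decision flow in the diagram above.
// Available comes from the applied deployment; Degraded is set either on a
// progress-deadline timeout or on unavailable/failing pods outside a rollout.
func computeConditions(dep *appsv1.Deployment, timedOut, hasFailing bool) (available, degraded bool, reason string) {
	desired := int32(1)
	if dep.Spec.Replicas != nil {
		desired = *dep.Spec.Replicas
	}
	available = dep.Status.AvailableReplicas > 0
	// A rollout is still in flight while the new ReplicaSet has not yet
	// produced every desired replica.
	updating := dep.Status.UpdatedReplicas < desired

	switch {
	case timedOut:
		degraded, reason = true, "ProgressDeadlineExceeded"
	case !updating && dep.Status.AvailableReplicas < desired && (hasFailing || dep.Status.AvailableReplicas == 0):
		degraded, reason = true, "UnavailablePod"
	}
	return available, degraded, reason
}
```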
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 9 | ❌ 3
❌ Failed checks (1 warning, 2 inconclusive)
✅ Passed checks (9 passed)
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: tjungblu
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files. Approvers can indicate their approval by writing `/approve` in a comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
pkg/operator/deploymentcontroller/deployment_controller_test.go (1)
456-671: ⚡ Quick win: Add one regression case that asserts the new reason/message fields.
These table updates prove the boolean transitions, but the behavior added here is really the new UnavailablePod/ProgressDeadlineExceeded/SyncError condition semantics. One focused case that keeps and asserts Reason/Message would catch the exact regressions this PR is trying to prevent.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/deploymentcontroller/deployment_controller_test.go` around lines 456 - 671, Add a single regression test case to the TestSync testCases table that specifically asserts Reason and Message on operator conditions (e.g. UnavailablePod, ProgressDeadlineExceeded, SyncError). Create a case (name like "assert condition reason/message") that sets up initialObjects using makeDeployment with appropriate withDeploymentStatus and withDeploymentConditions to trigger the target condition, and an operator via makeFakeOperatorInstance; then in expectedObjects.operator include an assertion that the condition entries include the exact Reason and Message (extend or use a helper similar to withTrueConditions/withFalseConditions but that checks Reason/Message, or add a new helper e.g. withConditionsIncludingReasonMessage) so the test fails if Reason/Message are not set as intended; insert this new case into the existing testCases slice inside TestSync.
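A sketch of the helper such a case could use, assuming the harness's builder style (the `operatorModifier` type and `fakeDriverInstance` struct are assumptions modeled on the existing withTrueConditions/withFalseConditions helpers; `opv1` is the openshift/api operator/v1 alias):

```go
// operatorModifier matches the assumed builder style of the test harness;
// the real test file may already define an equivalent type.
type operatorModifier func(*fakeDriverInstance) *fakeDriverInstance

// withConditionReasonMessage pins Reason and Message on a condition so a
// regression in either field fails the test (hypothetical helper).
func withConditionReasonMessage(condType, reason, message string) operatorModifier {
	return func(i *fakeDriverInstance) *fakeDriverInstance {
		for idx := range i.Status.Conditions {
			if i.Status.Conditions[idx].Type == condType {
				i.Status.Conditions[idx].Reason = reason
				i.Status.Conditions[idx].Message = message
				return i
			}
		}
		// Condition not present yet: add it with the expected fields.
		i.Status.Conditions = append(i.Status.Conditions, opv1.OperatorCondition{
			Type: condType, Status: opv1.ConditionTrue, Reason: reason, Message: message,
		})
		return i
	}
}
```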
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/apps/deployment/status.go`:
- Line 41: The loop currently appends pods[i].Status.InitContainerStatuses onto
pods[i].Status.ContainerStatuses which can mutate cached pod objects; instead
create a fresh combined slice (e.g., combined := make([]v1.ContainerStatus, 0,
len(pods[i].Status.ContainerStatuses)+len(pods[i].Status.InitContainerStatuses))
and append both slices into it) or iterate pods[i].Status.ContainerStatuses and
pods[i].Status.InitContainerStatuses in two separate loops; update the loop that
references the combined view to use this new local variable (referencing pods,
pods[i].Status.ContainerStatuses, pods[i].Status.InitContainerStatuses) so you
never modify the original pod object.
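A minimal sketch of the non-mutating variant this describes (`corev1` is k8s.io/api/core/v1; the function name is illustrative):

```go
import corev1 "k8s.io/api/core/v1"

// combinedStatuses builds a fresh slice of regular and init container
// statuses so the cached pod object is never modified.
func combinedStatuses(pod *corev1.Pod) []corev1.ContainerStatus {
	out := make([]corev1.ContainerStatus, 0,
		len(pod.Status.ContainerStatuses)+len(pod.Status.InitContainerStatuses))
	out = append(out, pod.Status.ContainerStatuses...)
	out = append(out, pod.Status.InitContainerStatuses...)
	return out
}
```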
In `@pkg/operator/deploymentcontroller/deployment_controller.go`:
- Around line 381-388: The degraded message is built from the full operandPods
slice while hasFailingPods() intentionally ignores terminating/deleting pods,
causing mismatched details; modify the code around the decision branch (when
!workloadIsBeingUpdated && deploymentObj.Status.AvailableReplicas <
desiredReplicas) to first filter operandPods to remove deleting/terminating pods
into a new slice (e.g., livePods), pass that filtered slice into
hasFailingPods(deploymentObj, livePods, now) and into
deployment.ContainerMessagesForPods(deploymentObj, livePods) so both the failure
check and the generated container messages use the identical pod set.
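The filter itself is small; here is a sketch of a `nonDeletingPods` helper both call sites could share (the name matches the later diff in this review, the body is an assumption):

```go
import corev1 "k8s.io/api/core/v1"

// nonDeletingPods drops terminating pods so the failure check and the
// degraded message are computed from the identical pod set.
func nonDeletingPods(pods []corev1.Pod) []corev1.Pod {
	live := make([]corev1.Pod, 0, len(pods))
	for _, p := range pods {
		if p.DeletionTimestamp == nil {
			live = append(live, p)
		}
	}
	return live
}
```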
---
Nitpick comments:
In `@pkg/operator/deploymentcontroller/deployment_controller_test.go`:
- Around line 456-671: Add a single regression test case to the TestSync
testCases table that specifically asserts Reason and Message on operator
conditions (e.g. UnavailablePod, ProgressDeadlineExceeded, SyncError). Create a
case (name like "assert condition reason/message") that sets up initialObjects
using makeDeployment with appropriate withDeploymentStatus and
withDeploymentConditions to trigger the target condition, and an operator via
makeFakeOperatorInstance; then in expectedObjects.operator include an assertion
that the condition entries include the exact Reason and Message (extend or use a
helper similar to withTrueConditions/withFalseConditions but that checks
Reason/Message, or add a new helper e.g. withConditionsIncludingReasonMessage)
so the test fails if Reason/Message are not set as intended; insert this new
case into the existing testCases slice inside TestSync.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 77692bf7-5ee0-4146-af3f-307cedb5c776
📒 Files selected for processing (4)
- pkg/apps/deployment/status.go
- pkg/operator/csi/csidrivercontrollerservicecontroller/csi_driver_controller_service_controller.go
- pkg/operator/deploymentcontroller/deployment_controller.go
- pkg/operator/deploymentcontroller/deployment_controller_test.go
Force-pushed from 07ea5a5 to 012e48a
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/operator/deploymentcontroller/deployment_controller_test.go (1)
673-709: ⚡ Quick win: Add one regression case for the new SyncError degraded path.
The suite now covers timeout and unavailable-pod degradation, but the wiring change in this PR is the deferred applySyncErrorDegraded() path in sync(). Please add a case that forces getDeployment() or the new pod lookup to fail and asserts conditionDegraded=True with Reason="SyncError", so the WithSyncDegradedOnError inversion stays protected.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/deploymentcontroller/deployment_controller_test.go` around lines 673 - 709, Add a regression test case in deployment_controller_test.go that exercises the deferred applySyncErrorDegraded() path by forcing getDeployment() or the pod lookup to return an error so sync() triggers the SyncError degraded branch; create a new test entry (parallel to the existing "ProgressDeadlineExceeded" case) that sets up initialObjects to produce a failure (e.g., mock getDeployment/pod list to error or supply a nil/erroneous object) and in assertOperatorStatus verify that conditionDegraded is present with Status True and Reason "SyncError" (and an appropriate message), ensuring the WithSyncDegradedOnError inversion remains protected and referencing sync(), applySyncErrorDegraded(), and WithSyncDegradedOnError for locating the logic to target.
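One way to force that failure against a client-go fake clientset, as a sketch (assumes the harness's fake `kubeClient`; `clienttesting` is k8s.io/client-go/testing and `runtime` is k8s.io/apimachinery/pkg/runtime):

```go
// Induce a pod-list failure so sync() returns an error and the deferred
// applySyncErrorDegraded() path marks Degraded with Reason "SyncError".
kubeClient.Fake.PrependReactor("list", "pods",
	func(action clienttesting.Action) (handled bool, ret runtime.Object, err error) {
		return true, nil, errors.New("induced pod list failure")
	})
// After sync() runs, look up the condition and assert its fields:
//   cond := v1helpers.FindOperatorCondition(status.Conditions, conditionDegraded)
//   cond.Status == opv1.ConditionTrue && cond.Reason == "SyncError"
```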
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/operator/deploymentcontroller/deployment_controller.go`:
- Around line 296-313: The current reconcile eagerly lists pods (using
c.kubeClient.CoreV1().Pods(...).List) whenever the controller has the Degraded
condition (slices.Contains(c.conditions, opv1.OperatorStatusTypeDegraded)),
causing transient list failures to surface as hard sync errors; move the pod
fetch logic (creation of operandPods from podList.Items and
metav1.LabelSelectorAsSelector on deploymentObj.Spec.Selector) into the specific
branch that handles UnavailablePod/container-failure diagnostics (where
container details are inspected), and change error handling so that pod list
failures are treated as best-effort diagnostics (log/record the lookup failure
but do not return error from syncManaged or the reconcile path). Ensure you
still validate deploymentObj.Spec.Selector before using it and only attempt the
pods.List when you actually need pod/container details.
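A sketch of the best-effort shape this suggests, assuming the controller holds a `kubeClient` and that a nil result simply means "no pod-level detail available" (names follow the review thread, not necessarily the merged code):

```go
// listOperandPodsForDiagnostics returns matching pods, or nil when the
// selector is invalid or the list fails; callers treat nil as missing
// diagnostics instead of failing the sync.
func (c *DeploymentController) listOperandPodsForDiagnostics(ctx context.Context, dep *appsv1.Deployment) []corev1.Pod {
	sel, err := metav1.LabelSelectorAsSelector(dep.Spec.Selector)
	if err != nil {
		klog.V(2).Infof("invalid selector on %s/%s: %v", dep.Namespace, dep.Name, err)
		return nil
	}
	podList, err := c.kubeClient.CoreV1().Pods(dep.Namespace).List(ctx,
		metav1.ListOptions{LabelSelector: sel.String()})
	if err != nil {
		klog.V(2).Infof("pod diagnostics unavailable for %s/%s: %v", dep.Namespace, dep.Name, err)
		return nil
	}
	return podList.Items
}
```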
---
Nitpick comments:
In `@pkg/operator/deploymentcontroller/deployment_controller_test.go`:
- Around line 673-709: Add a regression test case in
deployment_controller_test.go that exercises the deferred
applySyncErrorDegraded() path by forcing getDeployment() or the pod lookup to
return an error so sync() triggers the SyncError degraded branch; create a new
test entry (parallel to the existing "ProgressDeadlineExceeded" case) that sets
up initialObjects to produce a failure (e.g., mock getDeployment/pod list to
error or supply a nil/erroneous object) and in assertOperatorStatus verify that
conditionDegraded is present with Status True and Reason "SyncError" (and an
appropriate message), ensuring the WithSyncDegradedOnError inversion remains
protected and referencing sync(), applySyncErrorDegraded(), and
WithSyncDegradedOnError for locating the logic to target.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 884898b7-e452-48ae-8bbd-18cbb0f259c8
📒 Files selected for processing (5)
- pkg/apps/deployment/status.go
- pkg/operator/csi/csidrivercontrollerservicecontroller/csi_driver_controller_service_controller.go
- pkg/operator/deploymentcontroller/deployment_controller.go
- pkg/operator/deploymentcontroller/deployment_controller_test.go
- pkg/operator/deploymentcontroller/helpers_test.go
✅ Files skipped from review due to trivial changes (2)
- pkg/operator/csi/csidrivercontrollerservicecontroller/csi_driver_controller_service_controller.go
- pkg/operator/deploymentcontroller/helpers_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
- pkg/apps/deployment/status.go
Force-pushed from 012e48a to 7d80de2
…controller

Contains the degraded/progressing condition changes made to the workload controller from openshift#2128.

Signed-off-by: Thomas Jungblut <tjungblu@redhat.com>
Force-pushed from 7d80de2 to c44884e
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/operator/deploymentcontroller/deployment_controller_test.go (1)
676-739: ⚡ Quick win: Add one regression case for pod-diagnostics failure in the unavailable-pod path.
The new assertions cover ProgressDeadlineExceeded and deferred SyncError, but not the branch where Pods().List() fails while the deployment is partially unavailable. That path now affects whether Degraded stays true or gets cleared, so it should be locked down with assertOperatorStatus.
Also applies to: 796-798
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/operator/deploymentcontroller/deployment_controller_test.go` around lines 676 - 739, Add a regression test case that simulates a failure from Pods().List() in the unavailable-pod path and asserts operator Degraded remains set (or not cleared) accordingly; specifically, add a new test entry in the table (similar style to the existing "ProgressDeadlineExceeded" and "sync error" cases) that injects a fake client error for Pods().List() when getDeployment()/the unavailable-pod handling runs, then provide an assertOperatorStatus closure that checks v1helpers.FindOperatorCondition(status.Conditions, conditionDegraded) is non-nil and has the expected Status/Reason/Message (matching the desired behavior for the unavailable-pod error path). Ensure the test uses the same test harness helpers (makeDeployment, makeFakeOperatorInstance, optionalManifestHooks if needed) and names the case to indicate "pod-diagnostics failure in unavailable-pod" so it covers the branch referenced in getDeployment()/applySyncErrorDegraded logic.
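The assertion half of that case could be a small helper, sketched here with library-go's condition lookup (`conditionDegraded` follows the harness naming in the prompt and is assumed; `opv1` is operator/v1 and `v1helpers` is library-go's operator/v1helpers):

```go
// assertDegradedTrue fails the test unless the Degraded condition is set,
// locking down the partial-outage behavior when Pods().List() errors out.
func assertDegradedTrue(t *testing.T, status *opv1.OperatorStatus) {
	t.Helper()
	cond := v1helpers.FindOperatorCondition(status.Conditions, conditionDegraded)
	if cond == nil || cond.Status != opv1.ConditionTrue {
		t.Fatalf("expected Degraded=True after pod-list failure, got %+v", cond)
	}
}
```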
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/operator/deploymentcontroller/deployment_controller.go`:
- Around line 362-367: When AvailableReplicas < desiredReplicas and rollout is
not updating, don't let a nil/failed pod lookup from
listOperandPodsForDiagnostics clear Degraded; treat missing operandPods as a
failure. In the branch around listOperandPodsForDiagnostics -> operandPods, set
hasFailing to true if operandPods is nil or the diagnostics list indicates a
selector/list error (in addition to the existing hasFailingPods(deployment,
livePods, now) check), and still build containerMessages with whatever
diagnostics are available via dpm.ContainerMessagesForPods(deployment, livePods)
so the Degraded condition reflects the deployment shortage rather than
best‑effort pod lookup success. Ensure you reference
listOperandPodsForDiagnostics, nonDeletingPods, hasFailingPods,
dpm.ContainerMessagesForPods, deployment.Status.AvailableReplicas and
desiredReplicas when making the change.
---
Nitpick comments:
In `@pkg/operator/deploymentcontroller/deployment_controller_test.go`:
- Around line 676-739: Add a regression test case that simulates a failure from
Pods().List() in the unavailable-pod path and asserts operator Degraded remains
set (or not cleared) accordingly; specifically, add a new test entry in the
table (similar style to the existing "ProgressDeadlineExceeded" and "sync error"
cases) that injects a fake client error for Pods().List() when
getDeployment()/the unavailable-pod handling runs, then provide an
assertOperatorStatus closure that checks
v1helpers.FindOperatorCondition(status.Conditions, conditionDegraded) is non-nil
and has the expected Status/Reason/Message (matching the desired behavior for
the unavailable-pod error path). Ensure the test uses the same test harness
helpers (makeDeployment, makeFakeOperatorInstance, optionalManifestHooks if
needed) and names the case to indicate "pod-diagnostics failure in
unavailable-pod" so it covers the branch referenced in
getDeployment()/applySyncErrorDegraded logic.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: c402234d-f742-451e-ab6d-12b7d0280738
📒 Files selected for processing (5)
- pkg/apps/deployment/status.go
- pkg/operator/csi/csidrivercontrollerservicecontroller/csi_driver_controller_service_controller.go
- pkg/operator/deploymentcontroller/deployment_controller.go
- pkg/operator/deploymentcontroller/deployment_controller_test.go
- pkg/operator/deploymentcontroller/helpers_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/operator/deploymentcontroller/helpers_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/operator/csi/csidrivercontrollerservicecontroller/csi_driver_controller_service_controller.go
- pkg/apps/deployment/status.go
```go
case !workloadIsBeingUpdated && deployment.Status.AvailableReplicas < desiredReplicas:
	operandPods := c.listOperandPodsForDiagnostics(ctx, deployment, syncContext)
	livePods := nonDeletingPods(operandPods)
	hasFailing := hasFailingPods(deployment, livePods, now)
	if hasFailing || deployment.Status.AvailableReplicas == 0 {
		containerMessages := dpm.ContainerMessagesForPods(deployment, livePods)
```
Don’t let a failed pod lookup clear Degraded for partial outages.
When AvailableReplicas < desiredReplicas and rollout is already done, listOperandPodsForDiagnostics() collapses selector/list failures into nil, so hasFailingPods() becomes false and this branch reports Degraded=False unless availability dropped all the way to zero. That makes operator health depend on best-effort diagnostics instead of the deployment shortage itself.
Suggested direction:

```diff
-	operandPods := c.listOperandPodsForDiagnostics(ctx, deployment, syncContext)
+	operandPods, diagnosticsOK := c.listOperandPodsForDiagnostics(ctx, deployment, syncContext)
 	livePods := nonDeletingPods(operandPods)
-	hasFailing := hasFailingPods(deployment, livePods, now)
-	if hasFailing || deployment.Status.AvailableReplicas == 0 {
-		containerMessages := dpm.ContainerMessagesForPods(deployment, livePods)
+	hasFailing := diagnosticsOK && hasFailingPods(deployment, livePods, now)
+	if deployment.Status.AvailableReplicas < desiredReplicas && (!diagnosticsOK || hasFailing || deployment.Status.AvailableReplicas == 0) {
+		var containerMessages []string
+		if diagnosticsOK {
+			containerMessages = dpm.ContainerMessagesForPods(deployment, livePods)
+		}
```

Also applies to: 401-417
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@pkg/operator/deploymentcontroller/deployment_controller.go` around lines 362
- 367, When AvailableReplicas < desiredReplicas and rollout is not updating,
don't let a nil/failed pod lookup from listOperandPodsForDiagnostics clear
Degraded; treat missing operandPods as a failure. In the branch around
listOperandPodsForDiagnostics -> operandPods, set hasFailing to true if
operandPods is nil or the diagnostics list indicates a selector/list error (in
addition to the existing hasFailingPods(deployment, livePods, now) check), and
still build containerMessages with whatever diagnostics are available via
dpm.ContainerMessagesForPods(deployment, livePods) so the Degraded condition
reflects the deployment shortage rather than best‑effort pod lookup success.
Ensure you reference listOperandPodsForDiagnostics, nonDeletingPods,
hasFailingPods, dpm.ContainerMessagesForPods,
deployment.Status.AvailableReplicas and desiredReplicas when making the change.
@tjungblu: all tests passed! Full PR test history. Your PR dashboard.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
…controller
Contains the degraded/progressing condition changes made to the workload controller from #2128.
Summary by CodeRabbit
Refactor
Documentation
Tests