(Not ready for review) OCPBUGS-57177: OCPBUGS-44290: Fix when machines are considered Degraded in MCP status based on MCN conditions #5110


Draft
wants to merge 1 commit into
base: main
from

Conversation

isabella-janssen
Member

@isabella-janssen isabella-janssen commented Jun 5, 2025

TODO:

  • Test without tech preview (TP) on 4.20, since MCN and PIS should be GA in 4.20, and update the "How to verify it" section accordingly.
  • Flesh out the verification steps.

- What I did

- How to verify it
To verify either bug, first launch a 4.20 cluster with tech preview enabled and this PR included in the build.

launch 4.20,openshift/machine-config-operator#5110 aws,techpreview
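Before starting the steps below, it may help to confirm that the launched cluster really has tech preview enabled. A minimal sketch, assuming the `techpreview` launch option results in the `TechPreviewNoUpgrade` feature set on the cluster-scoped `FeatureGate` object named `cluster`; the function names are hypothetical helpers, not part of this PR.

```shell
# Print the configured feature set of the cluster.
# Prints nothing when no feature set is configured.
cluster_feature_set() {
  oc get featuregate cluster -o jsonpath='{.spec.featureSet}'
}

# Succeed only when the feature set is TechPreviewNoUpgrade
# (assumption: this is what the "techpreview" launch option sets).
tech_preview_enabled() {
  [ "$(cluster_feature_set)" = "TechPreviewNoUpgrade" ]
}
```

Running `tech_preview_enabled && echo "tech preview is on"` against the cluster gives a quick yes/no before applying the PIS.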

To verify OCPBUGS-57177:

  1. Apply an invalid PIS.
Example Invalid PIS
apiVersion: machineconfiguration.openshift.io/v1
kind: PinnedImageSet
metadata:
  name: test-pinned
  labels:
    machineconfiguration.openshift.io/role: "worker"
spec:
  pinnedImages:
   - name: quay.io/rh-ee-ijanssen/machine-config-operator@sha256:65d3a308767b1773b6e3ead2ec1bcae499dde6ef085753d7e20e685f78841079
   - name: quay.io/rh-ee-ijanssen/machine-config-operator@sha256:fd3692eff21338e900a244dfe62152c959b84d73f2dd4503893de0f3fae61b0b
$ oc apply -f <pis>
  1. Wait for the PIS to fail to apply. The PinnedImageSetsDegraded condition in the MCN resource should be True.
$ oc describe machineconfignode/<node-name>
...
Status:
  Conditions:
...
    Last Transition Time:  2025-06-20T17:46:51Z
    Message:               One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details.
    Reason:                PrefetchFailed
    Status:                True
    Type:                  PinnedImageSetsDegraded
...
  1. Check that the targeted MCP is degraded and that the number of degraded machines equals the number of machines targeted by the PIS.
$ oc get mcp/worker
NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
worker   rendered-worker-fab41227fb4410113ba9f96caae6499a   False     True       True       3              0                   0                     3                      171m
  1. Check that the PinnedImageSetsDegraded and Degraded conditions in the targeted MCP are True. TODO: fix this; the sample output below still shows PinnedImageSetsDegraded as False.
$ oc describe mcp/worker
...
Status:
...
  Conditions:
    Last Transition Time:  2025-06-20T16:01:48Z
    Message:               
    Reason:                
    Status:                False
    Type:                  PinnedImageSetsDegraded
...
    Last Transition Time:  2025-06-20T17:46:24Z
    Message:               Node ip-10-0-106-239.us-west-1.compute.internal is reporting: "One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details.", Node ip-10-0-49-251.us-west-1.compute.internal is reporting: "One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details.", Node ip-10-0-108-156.us-west-1.compute.internal is reporting: "One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details."
    Reason:                3 nodes are reporting degraded status on sync
    Status:                True
    Type:                  NodeDegraded
    Last Transition Time:  2025-06-20T17:46:24Z
    Message:               Node ip-10-0-106-239.us-west-1.compute.internal is reporting: "One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details.", Node ip-10-0-49-251.us-west-1.compute.internal is reporting: "One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details.", Node ip-10-0-108-156.us-west-1.compute.internal is reporting: "One or more PinnedImageSet is experiencing an error. See PinnedImageSet list for more details."
    Reason:                
    Status:                True
    Type:                  Degraded
...
  1. Delete the invalid PIS.
$ oc delete pinnedimageset/<pis>
  1. TODO: add custom pool example
  2. TODO: add standard node degrade condition
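
The manual checks in the steps above could be scripted roughly as follows. This is a sketch under assumptions, not part of the PR: the helper names are hypothetical, the table parser assumes every column in the `oc get mcp` output is non-empty, and the describe parser assumes each condition prints `Status:` before `Type:`, as in the sample output above.

```shell
# Print the status (True/False/Unknown) of one condition on a MachineConfigNode.
# $1: node name, $2: condition type, e.g. PinnedImageSetsDegraded
mcn_condition_status() {
  oc get machineconfignode "$1" \
    -o jsonpath="{.status.conditions[?(@.type==\"$2\")].status}"
}

# Read `oc get mcp/<pool>` table output on stdin and print the value found
# under the column header given as $1, e.g. DEGRADEDMACHINECOUNT.
mcp_column() {
  awk -v col="$1" '
    NR == 1 { for (i = 1; i <= NF; i++) if ($i == col) idx = i; next }
    idx     { print $idx }
  '
}

# Read `oc describe mcp/<pool>` output on stdin and print one
# "<Type>=<Status>" line per condition.
condition_summary() {
  awk '
    $1 == "Status:" && NF == 2                 { status = $2 }
    $1 == "Type:"   && NF == 2 && status != "" { print $2 "=" status; status = "" }
  '
}

# Example usage against a live cluster:
#   mcn_condition_status <node-name> PinnedImageSetsDegraded
#   oc get mcp/worker | mcp_column DEGRADEDMACHINECOUNT
#   oc describe mcp/worker | condition_summary
```

Comparing `mcp_column DEGRADEDMACHINECOUNT` against the number of machines targeted by the PIS covers the machine-count check, and `condition_summary` condenses the long describe output into lines such as `NodeDegraded=True`.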

To verify OCPBUGS-44290:

- Description for the changelog

openshift-ci bot added the do-not-merge/work-in-progress label (Indicates that a PR should not merge because it is a work in progress.) on Jun 5, 2025
Contributor

openshift-ci bot commented Jun 5, 2025

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

openshift-ci-robot added the following labels on Jun 5, 2025:
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
  • jira/invalid-bug: Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.
@openshift-ci-robot
Contributor

@isabella-janssen: This pull request references Jira Issue OCPBUGS-44290, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

- What I did

- How to verify it

- Description for the changelog

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Contributor

openshift-ci bot commented Jun 5, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: isabella-janssen

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-ci bot added the approved label (Indicates a PR has been approved by an approver from all required OWNERS files.) on Jun 5, 2025
@isabella-janssen changed the title from "(Not ready for review) OCPBUGS-44290: Fix when machines are considered Degraded in MCP status based on MCN conditions" to "(Not ready for review) OCPBUGS-57177: OCPBUGS-44290: Fix when machines are considered Degraded in MCP status based on MCN conditions" on Jun 6, 2025
openshift-ci-robot added the jira/valid-bug label (Indicates that a referenced Jira bug is valid for the branch this PR is targeting.) and removed the jira/invalid-bug label (Indicates that a referenced Jira bug is invalid for the branch this PR is targeting.) on Jun 6, 2025
@openshift-ci-robot
Contributor

@isabella-janssen: This pull request references Jira Issue OCPBUGS-57177, which is valid. The bug has been moved to the POST state.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state ASSIGNED, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr

The bug has been updated to refer to the pull request using the external bug tracker.


@openshift-ci openshift-ci bot requested a review from sergiordlr June 6, 2025 14:00
@openshift-ci-robot
Contributor

@isabella-janssen: This pull request references Jira Issue OCPBUGS-57177, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

Requesting review from QA contact:
/cc @sergiordlr

In response to this:

- What I did

- How to verify it

To verify OCPBUGS-57177:

  1. Launch 4.20 cluster with tech preview enabled and this PR included in the build.
  2. Apply an invalid PIS.
  3. Wait for the PIS to fail to apply. The PinnedImageSetsDegraded condition in the MCN resource should be True.
  4. Check that the targeted MCP is degraded & the number of degraded machines is equal to the number of machines targeted by the PIS.
  5. <add ways to restore from degrade (delete PIS and apply valid PIS)>

To verify OCPBUGS-44290:

- Description for the changelog


@isabella-janssen force-pushed the ocpbugs-44290 branch 2 times, most recently from 471eb22 to 9442363, on June 16, 2025 19:17
Labels
  • approved: Indicates a PR has been approved by an approver from all required OWNERS files.
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/severity-moderate: Referenced Jira bug's severity is moderate for the branch this PR is targeting.
  • jira/valid-bug: Indicates that a referenced Jira bug is valid for the branch this PR is targeting.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.