OCPBUGS-56725: Add max length validation for apiserver namedCertificates #2342

Merged 1 commit on Jun 2, 2025

Member

@enxebre enxebre commented May 27, 2025

Adding the maxLength check is a fix in itself. In addition, it will help HCP validations stay within the CEL validation budget.

@openshift-ci-robot openshift-ci-robot added jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 27, 2025
Contributor

openshift-ci bot commented May 27, 2025

Hello @enxebre! Some important instructions when contributing to openshift/api:
API design plays an important part in the user experience of OpenShift and as such API PRs are subject to a high level of scrutiny to ensure they follow our best practices. If you haven't already done so, please review the OpenShift API Conventions and ensure that your proposed changes are compliant. Following these conventions will help expedite the api review process for your PR.

@openshift-ci-robot

@enxebre: This pull request references Jira Issue OCPBUGS-56725, which is invalid:

  • expected the bug to target the "4.20.0" version, but no target version was set

Comment /jira refresh to re-evaluate validity if changes to the Jira bug are made, or edit the title of this pull request to link to a different bug.

The bug has been updated to refer to the pull request using the external bug tracker.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci bot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label May 27, 2025
@openshift-ci openshift-ci bot requested review from deads2k and everettraven May 27, 2025 12:13
@enxebre
Member Author

enxebre commented May 27, 2025

/jira refresh

@openshift-ci-robot openshift-ci-robot added jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. and removed jira/invalid-bug Indicates that a referenced Jira bug is invalid for the branch this PR is targeting. labels May 27, 2025
@openshift-ci-robot

@enxebre: This pull request references Jira Issue OCPBUGS-56725, which is valid.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target version (4.20.0) matches configured target version for branch (4.20.0)
  • bug is in the state POST, which is one of the valid states (NEW, ASSIGNED, POST)

No GitHub users were found matching the public email listed for the QA contact in Jira (address redacted), skipping review request.


@@ -155,6 +155,7 @@ type APIServerServingCerts struct {
// the defaultServingCertificate will be used.
// +optional
// +listType=atomic
// +kubebuilder:validation:MaxItems=20
Contributor

Note that this is technically a breaking change because it is introducing a restriction that did not exist before.

We need to:

  • Have insights into how many instances of this configuration exist where there are more than 20 items on both standalone OpenShift and HyperShift to better understand how many clusters may be impacted by this change.
  • Understand what happens in the case that a cluster does have more than 20 items and they upgrade to a version where this new validation is added.
    • For standalone OpenShift, we should be able to take advantage of ratcheting validation - we need to make sure we have testing in place to make sure the ratcheting behavior works as expected.
    • IIRC, HyperShift cannot guarantee any ratcheting behavior until the minimum supported version of OpenShift is 4.18 (I think this is when the ratcheting behavior was added, but @JoelSpeed should be able to confirm)

Also, why was a maximum of 20 chosen?

Contributor

Adding a test for ratcheting to the integration suite in this repo is also important. PTAL at how to do this in the README in the top-level tests folder.

Member Author

20 is arbitrary; I changed it to 100 so this change has no impact in practice.
We should be able to get some data from Insights. I already started a thread on that, but let's not block this PR on it.
I just added a YAML file for the ratcheting test (possibly not quite right yet). I'll be happy to address any feedback there.

Contributor

We should be able to get some data from Insights. I already started a thread on that, but let's not block this PR on it.

Why should we not block the PR until we know whether or not this will impact at least the clusters we collect telemetry for?

Member Author

Ultimately this is just a means to provide business value. There's no value or real use case for more than 100 entries (if someone is doing that, it would actually be good to force a discussion and help them). Not having this limit is actually resulting in degraded business value. So in this scenario I see the 100 as just a cosmetic implementation detail, and the Insights query as an interesting exercise that should not prevent us from moving forward and providing business value.

In any case, I appreciate you bringing it up. Data from Insights shows that the max namedCertificates count is 3 and the max namedCertificates.Names count is 6. I captured those queries in the Jira issue for anyone curious.

Contributor

@everettraven everettraven May 28, 2025

Thanks for getting the insights. With those numbers in mind I think we can set a more reasonable limit here instead of just choosing something totally arbitrary that we may choose to decrease later. Decreasing the max limit is also a breaking change.

Maybe something like 32 and 64 is reasonable based on the insights data we have (and assuming that there is a reasonable case where these values make sense)? That is ~10x for each one, to give plenty of buffer room for customers that might need this scale, and keeps the pattern of doubling the namedCertificates.Names.
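
The "~10x" headroom argument above can be checked with a small Go sketch. The helper names `headroom` and `describe` are hypothetical and exist only for this illustration; the limits (32, 64) and observed maxima (3, 6) are the numbers quoted in this thread.

```go
package main

import "fmt"

// headroom reports how many times larger a proposed limit is than the
// largest value observed in the field (hypothetical helper).
func headroom(limit, observedMax int) float64 {
	return float64(limit) / float64(observedMax)
}

// describe formats one line of the comparison for a given field.
func describe(field string, limit, observedMax int) string {
	return fmt.Sprintf("%s: limit %d is %.1fx the observed max of %d",
		field, limit, headroom(limit, observedMax), observedMax)
}
```

Calling `describe("namedCertificates", 32, 3)` and `describe("namedCertificates.Names", 64, 6)` both report roughly 10.7x, whereas the earlier limit of 100 would have been about 33x the observed maximum for namedCertificates.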

Member Author

updated

@jparrill

Some comments:

  • From my PoV ratcheting looks like a good practice and makes sense to have them in the repo but (IMHO) it's out of the scope of this PR.
  • 20 elements for the namedCertificates looks like a good amount of certificates (tbh I didn't see any config with more than 4 certs configured)
  • Just a question: I've seen you're using the OpenAPI standard for the validation. Would it make sense to (at some point) move to CEL validations, which allow custom error messages (to be clearer with the API consumer)?

TL;DR: LGTM

@everettraven
Contributor

From my PoV ratcheting looks like a good practice and makes sense to have them in the repo but (IMHO) it's out of the scope of this PR.

Ratcheting tests are not out of the scope of this PR. Before merging this PR we need to ensure that the ratcheting behavior works as expected. Please include them.

@enxebre enxebre force-pushed the apiserver-certs-max branch from fef7702 to 19cfc7f Compare May 27, 2025 14:06
@openshift-ci openshift-ci bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels May 27, 2025
@enxebre enxebre force-pushed the apiserver-certs-max branch from 19cfc7f to 612fa13 Compare May 27, 2025 14:13
- name: Should not error with an invalid persisted namedCertificates in spec
  initialCRDPatches:
  - op: remove
    path: /spec/versions/0/schema/openAPIV3Schema/properties/spec/properties/fieldWithNewMaxLength/maximum
Contributor

This path should map to the field with the newly added max length. fieldWithNewMaxLength doesn't seem correct

initial: |
  apiVersion: config.openshift.io/v1
  kind: APIServer
  spec:
Contributor

Is there anything else here we can set in the spec? For ratcheting tests, we generally want to make sure that when the field was previously violating the new validation that we can still go and make valid modifications to other fields without having to update the now invalid field.
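
The ratcheting behavior described above can be sketched in Go as follows. This is a deliberately simplified model, not the API server's implementation: the `spec` type, `validateCreate`, and `validateUpdate` are hypothetical names, and the real mechanism operates on the CRD schema rather than on Go structs. It only illustrates the property the test should cover: an over-limit list that is left unchanged must not block a valid edit to another field.

```go
package main

import (
	"fmt"
	"reflect"
)

// maxNamedCertificates mirrors the limit discussed in this PR.
const maxNamedCertificates = 32

// spec is a hypothetical, heavily reduced stand-in for the APIServer spec.
type spec struct {
	NamedCertificates            []string
	AdditionalCORSAllowedOrigins []string
}

// validateCreate applies the limit unconditionally, as on object creation.
func validateCreate(s spec) error {
	if n := len(s.NamedCertificates); n > maxNamedCertificates {
		return fmt.Errorf("namedCertificates: Too many: %d: must have at most %d items",
			n, maxNamedCertificates)
	}
	return nil
}

// validateUpdate sketches ratcheting: an over-limit list that is unchanged
// from the persisted object is tolerated, while any change to the list is
// validated again against the limit.
func validateUpdate(old, updated spec) error {
	if reflect.DeepEqual(old.NamedCertificates, updated.NamedCertificates) {
		return nil
	}
	return validateCreate(updated)
}
```

Under this model, a persisted object with 33 entries can still have `AdditionalCORSAllowedOrigins` changed, but growing the list to 34 entries is rejected.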

@enxebre enxebre force-pushed the apiserver-certs-max branch 2 times, most recently from e625a1e to 5fa236f Compare May 28, 2025 09:47
Contributor

@JoelSpeed JoelSpeed left a comment

If I've read correctly, the initial and updated are currently the same. It would be better to check for an actual difference. Otherwise this seems fine, pending the conversation about existing usage.

Comment on lines +16 to +156
additionalCORSAllowedOrigins:
- "foo"
- "bar"
Contributor

Can we remove this from initial so that the updated is actually different from the initial?

Member Author

Right, let me see if I get one green run, then I'll add several test cases.

Member Author

Added a bunch of test cases now for creation, update and ratcheting, PTAL.

@enxebre enxebre force-pushed the apiserver-certs-max branch 3 times, most recently from 191ad2c to 151c85c Compare May 28, 2025 11:33
@openshift-ci openshift-ci bot added size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. and removed size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels May 28, 2025
@enxebre enxebre force-pushed the apiserver-certs-max branch from 151c85c to 1354b60 Compare May 28, 2025 13:16
@enxebre
Member Author

enxebre commented May 29, 2025

@JoelSpeed @everettraven please let me know if there's more feedback I should address

- "63.kas.same-names-entry.com"
- "64.kas.same-names-entry.com"
- "65.kas.same-names-entry.com"
onUpdate:
Contributor

Shouldn't have onUpdate more than once in the file

- "63.kas.same-names-entry.com"
- "64.kas.same-names-entry.com"
expectedError: "spec.servingCerts.namedCertificates: Too many: 33: must have at most 32 items"
onCreate:
Contributor

Should only have onCreate once in the file, it's a list

- "63.kas.same-names-entry.com"
- "64.kas.same-names-entry.com"
- "65.kas.same-names-entry.com"
expectedError: "names: Too many: 65: must have at most 64 items"
Contributor

Nit: missing final newline character.
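
For readers unfamiliar with where the expectedError strings in the excerpts above come from, here is a minimal Go sketch that reproduces the two limits (32 entries, 64 names per entry) and the "Too many" message shape. The `namedCert` type, the `tooMany` and `validateServingCerts` helpers, and the indexed field path for names are assumptions made for illustration; the actual messages are produced by the API server from the generated CRD schema.

```go
package main

import "fmt"

const (
	maxNamedCertificates = 32 // limit on spec.servingCerts.namedCertificates
	maxNames             = 64 // limit on each entry's names list
)

// namedCert is a hypothetical, reduced stand-in for one namedCertificates entry.
type namedCert struct {
	Names []string
}

// tooMany formats a message in the shape seen in the expectedError fields.
func tooMany(path string, got, max int) string {
	return fmt.Sprintf("%s: Too many: %d: must have at most %d items", path, got, max)
}

// validateServingCerts checks both limits and returns one message per violation.
func validateServingCerts(certs []namedCert) []string {
	var errs []string
	if len(certs) > maxNamedCertificates {
		errs = append(errs, tooMany("spec.servingCerts.namedCertificates",
			len(certs), maxNamedCertificates))
	}
	for i, c := range certs {
		if len(c.Names) > maxNames {
			// The indexed path below is an assumed format.
			path := fmt.Sprintf("spec.servingCerts.namedCertificates[%d].names", i)
			errs = append(errs, tooMany(path, len(c.Names), maxNames))
		}
	}
	return errs
}
```

With 33 entries the sketch yields the namedCertificates message quoted in the test file, and a single entry with 65 names yields a message ending in "names: Too many: 65: must have at most 64 items".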

@enxebre enxebre force-pushed the apiserver-certs-max branch from 1354b60 to c825d5e Compare May 29, 2025 13:29
The addition of maxLength check is a fix itself. In addition this will help hcp validations to contain cel validation budget
@enxebre enxebre force-pushed the apiserver-certs-max branch from c825d5e to 83157e5 Compare May 29, 2025 15:53
@JoelSpeed
Contributor

/lgtm

@openshift-ci openshift-ci bot added lgtm Indicates that a PR is ready to be merged. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels May 29, 2025
Contributor

@everettraven everettraven left a comment

/lgtm

Contributor

openshift-ci bot commented May 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre, everettraven, JoelSpeed

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD ff66e60 and 2 for PR HEAD 83157e5 in total

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD e041b5e and 1 for PR HEAD 83157e5 in total

@JoelSpeed
Contributor

/retest-required

Contributor

openshift-ci bot commented Jun 2, 2025

@enxebre: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name | Commit | Details | Required | Rerun command
ci/prow/e2e-gcp | 83157e5 | link | false | /test e2e-gcp

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 91245a2 and 2 for PR HEAD 83157e5 in total

@openshift-merge-bot openshift-merge-bot bot merged commit b29811a into openshift:master Jun 2, 2025
25 of 26 checks passed
@openshift-ci-robot

@enxebre: Jira Issue OCPBUGS-56725: Some pull requests linked via external trackers have merged:

The following pull requests linked via external trackers have not merged:

These pull requests must merge or be unlinked from the Jira bug in order for it to move to the next state. Once unlinked, request a bug refresh with /jira refresh.

Jira Issue OCPBUGS-56725 has not been moved to the MODIFIED state.


@openshift-bot

[ART PR BUILD NOTIFIER]

Distgit: ose-cluster-config-api
This PR has been included in build ose-cluster-config-api-container-v4.20.0-202506030014.p0.gb29811a.assembly.stream.el9.
All builds following this will include this PR.

Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-bug Indicates that a referenced Jira bug is valid for the branch this PR is targeting. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files.
6 participants