⚠️ Deprecate reconcile.Result.Requeue #3107


Merged (1 commit, Feb 24, 2025)

Conversation

@alvaroaleman (Member) commented Feb 8, 2025

There is no good reason to use this setting; either an error or `RequeueAfter` should be used instead. Deprecate it to avoid confusion. From the godoc:

// This setting is deprecated as it causes confusion and there is
// no good reason to use it. When waiting for an external event to
// happen, either the duration until it is supposed to happen or an
// appropriate poll interval should be used, rather than an
// interval emitted by a ratelimiter whose purpose it is to control
// retry on error.

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Feb 8, 2025
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Feb 8, 2025
There is no good reason to use this setting; either an error or
`RequeueAfter` should be used instead. Deprecate it to avoid confusion.
@fabriziopandini (Member) commented Feb 10, 2025

I just want to bring up some cases where Requeue helped us in Cluster API.

When a reconcile loop depends on external objects that you cannot actively watch, having an option in controller-runtime to requeue with exponential backoff (without raising an error) helped CAPI keep the system reactive without overloading it.

When we tried to use RequeueAfter for similar use cases instead, we ran into a few issues:

  • short RequeueAfter periods make the system more reactive, but if you keep requeueing frequently for long you can run into other problems, e.g.
    • rate limiting on external resources
    • scalability problems in your controllers if too many objects are requeueing frequently at the same time
  • long RequeueAfter periods make the system less reactive

Note: in our experience the current implementation of Requeue works well when things in the external system happen quickly, less so once the exponential backoff grows too high.
Note: whenever possible, we are replacing Requeue/RequeueAfter with watches, but in some cases this is not feasible.

@alvaroaleman (Member, Author) commented:

> In our experience the current implementation of Requeue works well when things in the external system happen quickly, less so once the exponential backoff grows too high.

Right, but wouldn't it be a lot better to set up your own ratelimiter for this and use the interval it emits in RequeueAfter, rather than reusing one that is not meant for that and whose ceiling you already found to be too high?

@fabriziopandini (Member) commented Feb 10, 2025

> Right, but wouldn't it be a lot better to set up your own ratelimiter for this and use the interval it emits in RequeueAfter, rather than reusing one that is not meant for that and whose ceiling you already found to be too high?

What exists today in controller-runtime worked for us in most cases, and this was enough for us in Cluster API to avoid creating our own custom rate limiter (having to implement one would be a step back).

Looking at this from another angle, if controller-runtime provides, as a building block, a rate limiter to use in controllers with RequeueAfter, then the impact of deprecating/removing Requeue will be less relevant.

@alvaroaleman (Member, Author) commented:

> Looking at this from another angle, if controller-runtime provides, as a building block, a rate limiter to use in controllers with RequeueAfter, then the impact of deprecating/removing Requeue will be less relevant.

Not controller-runtime itself, but there are ratelimiters in the upstream workqueue package: https://pkg.go.dev/k8s.io/client-go/util/workqueue

@fabriziopandini (Member) commented:
@alvaroaleman thanks for sharing your PoV, appreciated.
As long as we have options to address more advanced use cases, I'm OK.

@sbueringer (Member) commented:

/lgtm

/hold
/assign @vincepri

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 11, 2025
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 11, 2025
@k8s-ci-robot (Contributor) commented:

LGTM label has been added.

Git tree hash: 03aec75ade1ae9ba8d53fba906b118608bc4e2b6

@k8s-ci-robot (Contributor) commented:

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alvaroaleman,vincepri]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@vincepri (Member) commented:

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Feb 24, 2025
@k8s-ci-robot k8s-ci-robot merged commit e451c79 into kubernetes-sigs:main Feb 24, 2025
14 checks passed
@jonathan-innis (Member) commented:

> What exists today in controller-runtime worked for us in most cases, and this was enough for us in Cluster API to avoid creating our own custom rate limiter

FWIW, we also thought the use of Requeue was reasonable in kubernetes-sigs/karpenter and were a bit surprised to see this deprecated in v0.21
