-
Notifications
You must be signed in to change notification settings - Fork 11
add flag to paginate list calls #177
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm for this code alone,but i have below concerns:
- have we tested that this pagination is really happening in apiServer from audit logs? are we use real client instead of cached one?
- edit: just read through the entire codebase, from the code seems we are using the cached client, i'm wondering how this would work at all..
- i'm curious that there isn't pagination support in client-go or controller-runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
k8sClient shall be a real client to make this work, and seems we are using cached one, unless i made some mistake here.
https://github.com/aws/amazon-network-policy-controller-k8s/blob/main/pkg/config/runtime_config.go#L96
@M00nF1sh thanks, yes it's a cached client. I'm running the controller in a cluster with 130k pods to confirm from apiserver side. |
I confirmed this behavior from 130k-pod cluster
|
What type of PR is this?
Which issue does this PR fix:
When the pod scale is large, the current unpaginated list pod call will heavy load apiserver and lead to apiserver timeout.
Add a flag
--list-page-size
to paginate k8s list pod calls, default value is1000
.What does this PR do / Why do we need it:
Avoid apiserver timeout issue that may lead to controller crash when pod scale is large.
If an issue # is not available please add steps to reproduce and the controller logs:
Testing done on this change:
Automation added to e2e:
Will this PR introduce any new dependencies?:
Will this break upgrades or downgrades. Has updating a running cluster been tested?:
Does this PR introduce any user-facing change?:
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.