-
Notifications
You must be signed in to change notification settings - Fork 219
Caching just Updated Custom Resource Optimization alternatives #588
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
Shouldn't event only occur when the informer is updated, though? In which situation would there be events that are not associated to an informer update? |
It does eventually, the problem is the timing. So we update a resource after controller execution, but instantly execute the buffered events if there are. This is fater then the event from k8s api received in the informer. So the cache in the informer not necessary has the up to date CR, but we know there is one. So we can handle this situation (for example in a way mentioned above) |
One possible option, though a tricky one to implement would be to delay updating the resource on the server until all the events are processed. Another option would be to have an ephemeral cache of the resource that would only span the current interaction (though I guess defining that interaction might be tricky as well). |
Not sure how feasible it would be but guess if here is where the custom event filter comes into play as we can use the updated custom resource from the controller execution as one of the parameter of the filter (we need to update the events in the queue) and let the user decide what to do. |
I more and more leaning to a flag that we don't process the events until we receive the next event from the informer. This might be solvable relatively elegantly just by adding a new special event handler for this purpose from the CustomResourceEventSource informer to the DefaultEventHandler. Or just a flag into current event filtering. Note that we know what event we expecting, thus with what resource version, but would ignore that, thus not check it. |
What would happen in case of an update that does not result in a change to CR ?
|
An update is a CR or Status update by definition. If there is no such update we don't have this problem. So we will just check this if there was a CR update. |
Yes but the actual update may not happen if the resulting patch is empty right ? |
We don't support patches now (probably will), but yes. See line 312. |
In case we execute a controller and update the custom resource at the end, it can take time while we receive the custom resource event from the server and gets into the informers cache in
CustomResoureEventSource
. This causes problem in case when there are already events received during the execution. In next reconciliation , what happens instantly, we use an old custom resource, and the next update will fail on conflict.We handled this situation in V1 but simply caching manually the updated custom resource from the response of the update api call. Currently we cannot do this with informers (well not with a very hacky way)
Note that if there is retry in place this is not a correctness issue just an optimization, but this is possibly happening quite often, and it's quite ugly. Since we see lots of retries.
Alternative solutions to V2:
Don't execute the buffered events instantly, just schedule a new one time event in the close feature (~1-2 second?), usually we receive the update custom resource until that time into cache (not necessary to the Event Handler because of generation filtering). We have to be careful there this event producing should be cancelled in case the event arrives. (I prefer this)
(This would be also reusable to optimize the re scheduling of events)
Don't filter out the next event from custom resource event source. This will allow to receive the event about the update and execute the custom resource. Note that is hard to make correct (if not impossible now) in all cases, there are possible timing issues we have to investiage)
the combination of 1. and 2.
The text was updated successfully, but these errors were encountered: