Skip to content

Retry Support #252

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 12 commits into from
Dec 14, 2020
Merged

Retry Support #252

merged 12 commits into from
Dec 14, 2020

Conversation

csviri
Copy link
Collaborator

@csviri csviri commented Dec 8, 2020

This PR introduces the retry support back for new core with event sources.

@csviri csviri changed the title retry basic impl Retry Support Dec 8, 2020
@csviri csviri changed the title Retry Support Retry Support (WIP) Dec 8, 2020
# Conflicts:
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/Operator.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/DefaultEventHandler.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/EventBuffer.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/EventDispatcher.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/ExecutionConsumer.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/PostExecutionControl.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/event/DefaultEventSourceManager.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/retry/GenericRetry.java
#	operator-framework/src/main/java/io/javaoperatorsdk/operator/processing/retry/GenericRetryExecution.java
#	operator-framework/src/test/java/io/javaoperatorsdk/operator/EventDispatcherTest.java
#	operator-framework/src/test/java/io/javaoperatorsdk/operator/processing/DefaultEventHandlerTest.java
#	operator-framework/src/test/java/io/javaoperatorsdk/operator/processing/event/DefaultEventSourceManagerTest.java
#	operator-framework/src/test/java/io/javaoperatorsdk/operator/processing/retry/GenericRetryExecutionTest.java
@csviri csviri changed the title Retry Support (WIP) Retry Support Dec 11, 2020
@@ -2,19 +2,24 @@

public class RetryInfo {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about turning it into an interface extended by RetryExecution and implemented by GenericRetryExecution?
so the execution can be returned as info and no need to instantiate a new RetryInfo on every call.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hehe, I like to work with Pojos, but will check how that would look like, I change it in case its nice.

Copy link
Collaborator

@metacosm metacosm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will try to look at things more closely over the week-end.

@@ -22,7 +22,7 @@

@Override
public void run() {
PostExecutionControl postExecutionControl = eventDispatcher.handleEvent(executionScope);
PostExecutionControl postExecutionControl = eventDispatcher.handleExecution(executionScope);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just loud thinking, wouldn't the code be simpler if we had 2 ExecutionConsumers called
SimpleConsumer and RetryableConsumer and let the consumer to take care of the retrying rather than the handler?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@psycho-ir thing is this is closely relates with the events are comming in while the controller is executed, that layer does not have enough info basically now. (Also its just really a wraper around executor).

@csviri
Copy link
Collaborator Author

csviri commented Dec 11, 2020

Will make the changes an fix this test next week.

@adam-sandor
Copy link
Collaborator

How does the user configure the retries?

@csviri
Copy link
Collaborator Author

csviri commented Dec 14, 2020

How does the user configure the retries?

Just as before, when registering the controller.

@csviri
Copy link
Collaborator Author

csviri commented Dec 14, 2020

@adam-sandor @metacosm @psycho-ir and ALL. This is the MVP of the Retry. This works in a way that if there is an exception thrown from the controller, we will schedule a timer event to retry the execution. (Also we are putting back the events, which were in the scope of the current execution). The retry can be configured when registering the controller.

Note that there are all kind of situations which can happen, like an event is received while there is already a reatry planned or a new event is received when there is already the max attempt reached of a retry. In all these situations we execute the controller. So basically we are not blocking the execution because of an ongoing retry. This is something that we might discuss and change in the future. I think its not straightforward how this should go.

Maybe it would be more intuitive behavior that if there is already a scheduled event for retry, we don't execute the controller for new events just for the retry event. But this is up to discussion.

Pls let me know if you have any questions, we can also go through the impl. Otherwise would like to merge it.

@metacosm
Copy link
Collaborator

Note that there are all kind of situations which can happen, like an event is received while there is already a reatry planned or a new event is received when there is already the max attempt reached of a retry. In all these situations we execute the controller. So basically we are not blocking the execution because of an ongoing retry. This is something that we might discuss and change in the future. I think its not straightforward how this should go.

Can you give a more concrete example of what you mean here? What are the different options?

Maybe it would be more intuitive behavior that if there is already a scheduled event for retry, we don't execute the controller for new events just for the retry event. But this is up to discussion.

The thing that I'm not clear about is what these "new events" are? If they are events from the cluster (i.e. there was a change in the resource state on the cluster), whatever the controller is doing should probably be stopped to try to reconcile the new state instead.

@s-soroosh
Copy link
Contributor

Maybe it would be more intuitive behavior that if there is already a scheduled event for retry, we don't execute the controller for new events just for the retry event. But this is up to discussion.

I would say the current implementation is more reasonable, stop reconciling on new events may lead to stop the world for a while just because of a runtime exception happened.

The changes LGTM, What IMHO is crucial is documenting this feature and also the event sources via javadoc ASAP.

@s-soroosh s-soroosh merged commit 62ea9f6 into master Dec 14, 2020
@s-soroosh s-soroosh deleted the retry-experiment branch December 14, 2020 21:11
@Mallington
Copy link

Looks Great, this is exactly what I was looking for. Are you planning on creating a new release tag with these changes? I plan to uptake them once they are released.
Kind regards.

@s-soroosh
Copy link
Contributor

Looks Great, this is exactly what I was looking for. Are you planning on creating a new release tag with these changes? I plan to uptake them once they are released.
Kind regards.

You should already be able to play around with these changes by changing your dependency version to 1.4.1-SNAPSHOT,
We will do the actual release of 1.5.0 very soon (most probably tomorrow).

@jfgosselin
Copy link

@csviri
Copy link
Collaborator Author

csviri commented Dec 17, 2020

@jfgosselin this is a leftover, should not be there, will remove it, thx!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants