Infinite loop reconcile due to Rancher annotations on Ingress resources #11414

Open
HydraCro opened this issue May 6, 2025 · 4 comments

HydraCro commented May 6, 2025

Bug Description

We are running Strimzi (v0.45.0) Kafka (v3.9.0) alongside Rancher (v2.7.9).
There is an issue with the Ingress resources: our Nginx ingress controller keeps re-syncing because the Ingress is repeatedly reconciled by the Strimzi Operator. The Rancher agent and the Strimzi Operator keep competing over an annotation, so every 2 seconds the Ingress is updated and assigned a new resourceVersion.

Here is the operator log:
2025-05-06 14:49:06 DEBUG ResourceDiff:65 - Reconciliation #1652(watch) Kafka(kafka/xxx-cluster): Ingress xxx-cluster-kafka-externaltls-bootstrap differs: {"op":"remove","path":"/metadata/annotations/field.cattle.io~1publicEndpoints"}

2025-05-06T16:49:06.065012677+02:00 2025-05-06 14:49:06 DEBUG ResourceDiff:66 - Reconciliation #1652(watch) Kafka(kafka/xxx-cluster): Current Ingress xxx-cluster-kafka-externaltls-bootstrap path /metadata/annotations/field.cattle.io~1publicEndpoints has value

2025-05-06T16:49:06.065029748+02:00 2025-05-06 14:49:06 DEBUG ResourceDiff:67 - Reconciliation #1652(watch) Kafka(kafka/xxx-cluster): Desired Ingress xxx-cluster-kafka-externaltls-bootstrap path /metadata/annotations/field.cattle.io~1publicEndpoints has value

2025-05-06 14:49:06 DEBUG AbstractNamespacedResourceOperator:236 - Reconciliation #1652(watch) Kafka(kafka/xxx-cluster): Ingress xxx-cluster-kafka-externaltls-bootstrap in namespace kafka has been patched
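
For anyone reading the diff lines above: the `path` in the logged operation follows JSON Pointer escaping (RFC 6901), where `~1` stands for `/` and `~0` stands for `~`. A minimal Python sketch of the decoding, with a hypothetical helper name that is not part of Strimzi, shows that the annotation being removed is `field.cattle.io/publicEndpoints`:

```python
def decode_json_pointer(pointer: str) -> list[str]:
    """Split an RFC 6901 JSON Pointer into its unescaped reference tokens."""
    if pointer == "":
        return []
    if not pointer.startswith("/"):
        raise ValueError(f"not a JSON Pointer: {pointer!r}")
    # Per RFC 6901, '~1' must be unescaped before '~0'.
    return [
        token.replace("~1", "/").replace("~0", "~")
        for token in pointer[1:].split("/")
    ]


tokens = decode_json_pointer(
    "/metadata/annotations/field.cattle.io~1publicEndpoints"
)
print(tokens[-1])
```

The last token is the annotation key as it appears on the Ingress resource.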

I noticed that a similar bug was reported a long time ago and that it was fixed with LOADBALANCER_ANNOTATION_IGNORELIST (#4035), but we are still experiencing the problem.

Steps to reproduce

No response

Expected behavior

Resources updated by Rancher annotations should not trigger reconciliation.

Strimzi version

0.45.0

Kubernetes version

1.28.15

Installation method

Helm chart

Infrastructure

Bare-metal

Additional context

No response

scholzj (Member) commented May 6, 2025

Assuming the annotation has a stable value, you can set it in the template section to avoid the conflict. #4035 and the related #4076 PR were only about the Service resources. So you (or someone else if needed) would need to extend it to the Ingress resources as well in order to fix this permanently.
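
To illustrate the general idea of an annotation ignore list, here is a rough Python sketch. It is not Strimzi's implementation (the operator is written in Java, and its actual matching rules are not shown in this thread); the function names and the `field.cattle.io/*` pattern are assumptions chosen for illustration only. Annotations matching an ignore pattern are dropped from both sides before the current and desired resources are compared, so an annotation managed by another controller no longer produces a diff:

```python
import fnmatch


def filter_ignored(annotations: dict[str, str], ignore_patterns: list[str]) -> dict[str, str]:
    """Return a copy of annotations without keys matching any ignore pattern."""
    return {
        key: value
        for key, value in annotations.items()
        if not any(fnmatch.fnmatchcase(key, pattern) for pattern in ignore_patterns)
    }


def needs_patch(current: dict[str, str], desired: dict[str, str], ignore_patterns: list[str]) -> bool:
    """Compare annotations after removing the ignored keys from both sides."""
    return filter_ignored(current, ignore_patterns) != filter_ignored(desired, ignore_patterns)


# Hypothetical annotations: Rancher has added its own key to the live resource.
current = {
    "kubernetes.io/ingress.class": "nginx",
    "field.cattle.io/publicEndpoints": "[...]",
}
desired = {"kubernetes.io/ingress.class": "nginx"}

print(needs_patch(current, desired, []))
print(needs_patch(current, desired, ["field.cattle.io/*"]))
```

Without an ignore list the two sides differ and a patch is issued; with the hypothetical pattern in place they compare as equal. A real fix would also have to keep the ignored annotation on the resource when other fields are patched.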

scholzj added the enhancement label and removed the bug label May 6, 2025
scholzj changed the title from "[Bug]: Infinite loop reconcile due to Rancher annotations" to "Infinite loop reconcile due to Rancher annotations on Ingress resources" May 6, 2025

HydraCro (Author) commented May 8, 2025

Thank you for the advice!
What I ended up doing was adding these annotations to the Kafka YAML configuration:

      - name: externaltls
        port: 9094
        type: ingress
        tls: true
        authentication:
          type: scram-sha-512
        configuration:
          brokerCertChainAndKey:
            secretName: xxx
            certificate: tls.crt
            key: tls.key
          bootstrap:
            host: bootstrap-hostname.com
            annotations:
              field.cattle.io/publicEndpoints: >-
                [{"addresses":["111.11.11.1","111.11.11.2"],"port":443,"protocol":"HTTPS","serviceName":"kafka:xxx-cluster-kafka-externaltls-bootstrap","ingressName":"kafka:xxx-cluster-kafka-externaltls-bootstrap","hostname":"bootstrap-hostname.com","path":"/","allNodes":false}]
          brokers:
            - broker: 0
              host: 0-broker-kafka.cloud.com
              annotations:
                  field.cattle.io/publicEndpoints: >-
                    [{"addresses":["111.11.11.1","111.11.11.2"],"port":443,"protocol":"HTTPS","serviceName":"kafka:xxx-cluster-kafka-broker-externaltls-0","ingressName":"kafka:xxx-cluster-kafka-broker-externaltls-0","hostname":"0-broker-kafka.cloud.com","path":"/","allNodes":false}]
            - broker: 1
              host: 1-broker-kafka.cloud.com
              annotations:
                  field.cattle.io/publicEndpoints: >-
                    [{"addresses":["111.11.11.1","111.11.11.2"],"port":443,"protocol":"HTTPS","serviceName":"kafka:xxx-cluster-kafka-broker-externaltls-1","ingressName":"kafka:xxx-cluster-kafka-broker-externaltls-1","hostname":"1-broker-kafka.cloud.com","path":"/","allNodes":false}]
            - broker: 2
              host: 2-broker-kafka.cloud.com
              annotations:
                  field.cattle.io/publicEndpoints: >-
                    [{"addresses":["111.11.11.1","111.11.11.2"],"port":443,"protocol":"HTTPS","serviceName":"kafka:xxx-cluster-kafka-broker-externaltls-2","ingressName":"kafka:xxx-cluster-kafka-broker-externaltls-2","hostname":"2-broker-kafka.cloud.com","path":"/","allNodes":false}]
          class: nginx

Is this what you have suggested?

We already use the template section for the pod label kafka-metrics: 'true', but I am not sure how to target only the Ingress resources there; just specifying ingress doesn't seem to work.

Even so, things are better with my changes: I no longer see a new Ingress version being applied every 2 seconds, but I do still see updates in the log:

2025-05-08 07:31:00 DEBUG ResourceDiff:65 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Service xxx-cluster-kafka-broker-externaltls-2 differs: {"op":"replace","path":"/metadata/annotations/field.cattle.io~1publicEndpoints","value":"[{\"addresses\":[\"111.11.11.1\",\"111.11.11.2\"],\"port\":443,\"protocol\":\"HTTPS\",\"serviceName\":\"kafka:xxx-cluster-kafka-broker-externaltls-2\",\"ingressName\":\"kafka:xxx-cluster-kafka-broker-externaltls-2\",\"hostname\":\"2-broker-kafka.cloud.com\",\"path\":\"/\",\"allNodes\":false}]"}
2025-05-08 07:31:00 DEBUG ResourceDiff:66 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Current Service xxx-cluster-kafka-broker-externaltls-2 path /metadata/annotations/field.cattle.io~1publicEndpoints has value 
2025-05-08 07:31:00 DEBUG ResourceDiff:67 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Desired Service xxx-cluster-kafka-broker-externaltls-2 path /metadata/annotations/field.cattle.io~1publicEndpoints has value
2025-05-08 07:31:00 DEBUG AbstractNamespacedResourceOperator:236 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Service xxx-cluster-kafka-broker-externaltls-2 in namespace kafka has been patched
2025-05-08 07:31:00 DEBUG AbstractNamespacedResourceOperator:141 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Reconciling existing Ingress resources [xxx-cluster-kafka-broker-externaltls-0, xxx-cluster-kafka-broker-externaltls-1, xxx-cluster-kafka-broker-externaltls-2, xxx-cluster-kafka-externaltls-bootstrap] against the desired Ingress resources
2025-05-08 07:31:00 DEBUG AbstractNamespacedResourceOperator:153 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Ingress kafka/[] should be deleted
2025-05-08 07:31:00 DEBUG AbstractNamespacedResourceOperator:104 - Reconciliation #70910(watch) Kafka(kafka/xxx-cluster): Ingress kafka/xxx-cluster-kafka-externaltls-bootstrap already exists, updating it

So I am not sure what "Service xxx-cluster-kafka-broker-externaltls-2 in namespace kafka has been patched" means then?

This is of course not a fix, just a temporary workaround, because the field.cattle.io/publicEndpoints annotation can change depending on which nodes are assigned as our cluster entrypoints (or if we rename any Kafka resources). For now I set it to {"addresses":["111.11.11.1","111.11.11.2"],"port":443,... to pin it to the IP addresses we currently use.
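
Since the annotation value has to be repeated for the bootstrap and for every broker, it can be generated rather than written by hand. The Python sketch below is only an illustration: the helper name is hypothetical, and the field names, field order, and compact JSON layout are copied from the values shown in this thread, on the assumption that this is the form Rancher writes in this setup.

```python
import json


def public_endpoints_annotation(namespace: str, resource_name: str,
                                hostname: str, addresses: list[str],
                                port: int = 443) -> str:
    """Build a field.cattle.io/publicEndpoints value shaped like the ones in this thread."""
    endpoint = {
        "addresses": list(addresses),
        "port": port,
        "protocol": "HTTPS",
        "serviceName": f"{namespace}:{resource_name}",
        "ingressName": f"{namespace}:{resource_name}",
        "hostname": hostname,
        "path": "/",
        "allNodes": False,
    }
    # Compact separators reproduce the whitespace-free JSON seen in the annotations.
    return json.dumps([endpoint], separators=(",", ":"))


addresses = ["111.11.11.1", "111.11.11.2"]
for broker in range(3):
    print(public_endpoints_annotation(
        "kafka",
        f"xxx-cluster-kafka-broker-externaltls-{broker}",
        f"{broker}-broker-kafka.cloud.com",
        addresses,
    ))
```

If the entrypoint nodes change, only the `addresses` list needs updating before the values are pasted back into the listener configuration.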

It's been a while since I did any Java, but I can check how easy it would be to make a commit with the changes.

scholzj (Member) commented May 8, 2025

Is this what you have suggested?

As the annotation seems to be different on each Ingress resource, what you did is the right workaround, yes 👍.

So I am not sure what "Service xxx-cluster-kafka-broker-externaltls-2 in namespace kafka has been patched" means then?

To be honest, it is not easy for me to say what exactly it means as I do not have any Rancher clusters to test it with and see what exactly it adds to different resources etc. Sorry.

It's been a while since I did any Java, but I can check how easy it would be to make a commit with the changes.

Obviously, if you were able to contribute it, that would be great. If not, sooner or later someone else interested in contributing might pick this up. So it is up to you ... but even as an issue report without a fix, this is of course valuable and appreciated.

im-konge (Member) commented

Triaged on 29.5.2025: @HydraCro are you good with Jakub's comments and suggestions, or are you willing to contribute a fix for it?
