Bug 1798282: DROP: Avoid unnecessary calls to the cloud provider #24532

Miciah · 2020-02-13T04:38:44Z

Drop UPSTREAM: 63926, which causes deletion of "LoadBalancer"-type services to get stuck.

#19742 added the carry in order to reduce unnecessary cloud-provider API calls, but it is no longer needed and causes problems with recent changes to the clean-up logic for services with type "LoadBalancer".

At the time the PR was written, the service controller added every newly created service to its work queue, which was also used for update and delete events. When a worker for this queue subsequently processed the service, it would check if the service needed an external load-balancer, and if not, it would then check if a load balancer existed for the service (in case the service had been added to the queue by an update that changed its type from "LoadBalancer", in which case any previously provisioned load-balancer would need to be deleted). To perform this check, the worker would make a GetLoadBalancer API call using the cloud provider.

#19742 added a check to skip the GetLoadBalancer call if the service's status indicated that it had no associated load-balancer, which should always be the case for a newly created service if its type is not "LoadBalancer".
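As a rough illustration of that check, the Go sketch below models the worker's path for a service that does not need a load balancer. The `Service` struct, `countingCloud`, and `syncNonLoadBalancer` are hypothetical stand-ins invented for this example, not the types or functions in `controller.go`. Only the idea of skipping the GetLoadBalancer call when the status lists no load balancer comes from the description above.

```go
package main

import "fmt"

// Service is a hypothetical, simplified stand-in for a Kubernetes Service.
type Service struct {
	Name    string
	Type    string   // for example "ClusterIP" or "LoadBalancer"
	Ingress []string // stand-in for the load-balancer entries in the service's status
}

// countingCloud is a fake cloud provider that counts GetLoadBalancer calls.
type countingCloud struct {
	calls int
}

// GetLoadBalancer pretends to ask the cloud whether a load balancer exists.
// This fake always answers "no" and only records that it was called.
func (c *countingCloud) GetLoadBalancer(name string) bool {
	c.calls++
	return false
}

// syncNonLoadBalancer models the worker's path for a service that does not
// need a load balancer. With skipIfStatusEmpty set, it behaves like the
// carried check: no cloud call is made when the status lists no load balancer.
func syncNonLoadBalancer(cloud *countingCloud, svc Service, skipIfStatusEmpty bool) {
	if skipIfStatusEmpty && len(svc.Ingress) == 0 {
		return // the status shows nothing provisioned, so skip the lookup
	}
	if cloud.GetLoadBalancer(svc.Name) {
		fmt.Println("would delete stale load balancer for", svc.Name)
	}
}

func main() {
	svc := Service{Name: "web", Type: "ClusterIP"}
	without, with := &countingCloud{}, &countingCloud{}
	syncNonLoadBalancer(without, svc, false)
	syncNonLoadBalancer(with, svc, true)
	fmt.Println(without.calls, with.calls) // prints "1 0"
}
```

In this model the check saves one cloud-provider call per newly created service whose type is not "LoadBalancer".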

Later, upstream commit kubernetes/kubernetes@aa3f81d modified the service controller's logic to add a newly created service to the work queue only if its type were "LoadBalancer", thereby obviating the need for the check that #19742 had added. The upstream commit also modified the service controller's load-balancer clean-up logic, and the check that #19742 added breaks this new logic.
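The queueing change can be pictured with a minimal Go sketch. The `Service` struct and `shouldEnqueueOnAdd` are hypothetical names for this example and do not reproduce the upstream code. The sketch only shows the behavior described above, where a newly created service reaches the work queue only if its type is "LoadBalancer".

```go
package main

import "fmt"

// Service is a hypothetical, simplified stand-in for a Kubernetes Service.
type Service struct {
	Name string
	Type string // for example "ClusterIP" or "LoadBalancer"
}

// shouldEnqueueOnAdd models the described behavior after the upstream change:
// only services of type "LoadBalancer" are queued when they are created.
func shouldEnqueueOnAdd(svc Service) bool {
	return svc.Type == "LoadBalancer"
}

func main() {
	created := []Service{
		{Name: "web", Type: "ClusterIP"},
		{Name: "api", Type: "LoadBalancer"},
	}
	queue := []string{}
	for _, svc := range created {
		if shouldEnqueueOnAdd(svc) {
			queue = append(queue, svc.Name)
		}
	}
	fmt.Println(queue) // prints "[api]"
}
```

Because a service such as "web" never reaches a worker on creation in this model, a status-based shortcut inside the worker has nothing left to save for new services.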

vendor/k8s.io/kubernetes/pkg/controller/service/controller.go (syncLoadBalancerIfNeeded): Delete check for empty load balancer status.

openshift-ci-robot · 2020-02-13T04:38:50Z

@Miciah: This pull request references Bugzilla bug 1798282, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

In response to this:

Bug 1798282: DROP: Avoid unnecessary calls to the cloud provider

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Miciah · 2020-02-17T19:33:16Z

In the e2e-gcp-upgrade CI job, both the control-plane-upgrade test and the k8s-service-upgrade test failed. The control plane uses a load balancer that is not managed by the service controller, so this change cannot be responsible for that failure; I suspect the other failure is a flake.

The other jobs failed before the cluster was up. In e2e-cmd, the installer failed with the following:

level=error msg="Error: Error applying IAM policy to project \"openshift-gce-devel-ci\": Too many conflicts.  Latest error: Error setting IAM policy for project \"openshift-gce-devel-ci\": googleapi: Error 409: There were concurrent policy changes. Please retry the whole read-modify-write with exponential backoff., aborted"

In e2e-aws-fips, the job failed with the following:

could not resolve inputs: could not determine inputs for step [input:machine-os-content-base]: could not resolve base image: imagestreamtags.image.openshift.io "4.4:machine-os-content" is forbidden: User "system:anonymous" cannot get imagestreamtags.image.openshift.io in the namespace "ocp": no RBAC policy matched

/retest

ironcladlou · 2020-04-20T14:43:28Z

/approve
/lgtm

knobunc · 2020-05-20T17:58:38Z

/approve

openshift-bot · 2020-05-20T18:29:09Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-20T18:42:14Z

/retest

Please review the full test history for this PR and help us cut down flakes.

Drop UPSTREAM: 63926, which causes deletion of "LoadBalancer"-type services to get stuck.

Commit aaf46c4 added the carry in order to reduce unnecessary cloud-provider API calls, but it is no longer needed and causes problems with recent changes to the clean-up logic for services with type "LoadBalancer".

At the time the commit was written, the service controller added every newly created service to its work queue, which was also used for update and delete events. When a worker for this queue subsequently processed the service, it would check if the service needed an external load-balancer, and if not, it would then check if a load balancer existed for the service (in case the service had been added to the queue by an update that changed its type from "LoadBalancer", in which case any previously provisioned load-balancer would need to be deleted). To perform this check, the worker would make a GetLoadBalancer API call using the cloud provider.

Commit aaf46c4 added a check to skip the GetLoadBalancer call if the service's status indicated that it had no associated load-balancer, which should always be the case for a newly created service if its type is not "LoadBalancer".

Later, upstream commit aa3f81d modified the service controller's logic to add a newly created service to the work queue only if its type were "LoadBalancer", thereby obviating the need for the check that commit aaf46c4 had added. The upstream commit also modified the service controller's load-balancer clean-up logic, and the check that commit aaf46c4 added breaks this new logic.

This commit fixes bug 1798282.

https://bugzilla.redhat.com/show_bug.cgi?id=1798282

* vendor/k8s.io/kubernetes/pkg/controller/service/controller.go (syncLoadBalancerIfNeeded): Delete check for empty load balancer status.

openshift-bot · 2020-05-20T19:08:06Z

/retest

Please review the full test history for this PR and help us cut down flakes.

Miciah · 2020-05-20T20:39:38Z

/test e2e-gcp-upgrade

Miciah · 2020-05-21T14:59:27Z

/retest

danehans · 2020-05-21T15:38:59Z

/lgtm

openshift-ci-robot · 2020-05-21T15:39:18Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danehans, ironcladlou, knobunc, Miciah

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [knobunc]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

openshift-bot · 2020-05-21T21:08:36Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-21T21:47:36Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-21T22:39:31Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-21T23:31:31Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-21T23:44:57Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-bot · 2020-05-22T03:51:35Z

/retest

Please review the full test history for this PR and help us cut down flakes.

openshift-ci-robot · 2020-05-22T07:49:49Z

@Miciah: All pull requests linked via external trackers have merged: openshift/origin#24532. Bugzilla bug 1798282 has been moved to the MODIFIED state.

In response to this:

Bug 1798282: DROP: Avoid unnecessary calls to the cloud provider

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-ci-robot added the bugzilla/valid-bug label (indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting) on Feb 13, 2020

openshift-ci-robot added the size/XS label (denotes a PR that changes 0-9 lines, ignoring generated files) on Feb 13, 2020

openshift-ci-robot requested review from deads2k and tbielawa on February 13, 2020 04:42

openshift-ci-robot added the vendor-update label (touching vendor dir or related files) on Feb 13, 2020

openshift-ci-robot assigned ironcladlou on Apr 20, 2020

openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on Apr 20, 2020

openshift-ci-robot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) on May 20, 2020

Miciah force-pushed the BZ1798282-drop-avoid-unnecessary-calls-to-the-cloud-provider branch from 5cd7b08 to 6e34acc on May 20, 2020 19:09

openshift-ci-robot removed the lgtm label (indicates that a PR is ready to be merged) on May 20, 2020

openshift-ci-robot assigned danehans on May 21, 2020

openshift-ci-robot added the lgtm label (indicates that a PR is ready to be merged) on May 21, 2020

openshift-merge-robot merged commit f8622a2 into openshift:master on May 22, 2020
