
[openstack-cloud-controller-manager] Shared tenancy OpenStack project service type LoadBalancer collision #2241

Closed
@scrothers

Description


/kind bug

What happened:
Two Kubernetes clusters in the same project were created. The following steps were taken along with their timeline.

  • Two different Kubernetes clusters were created using RKE2 in the same OpenStack project.
  • Both Kubernetes clusters came online and were fully functional.
  • Kubernetes cluster A had the default:kubernetes service changed from ClusterIP to LoadBalancer.
  • A load balancer was created and the Kubernetes service was made public successfully.
  • Kubernetes cluster B had the default:kubernetes service changed from ClusterIP to LoadBalancer.
  • OCCM identified the first load balancer in Octavia as its own, because the name was kube_service_kubernetes_default_kubernetes, and rebuilt it to point at Kubernetes cluster B.
  • Kubernetes cluster A is now desynchronized: its Service still lists a LoadBalancer that no longer exists.
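The collision follows directly from the deterministic name OCCM derives for the Octavia load balancer. A minimal sketch of the naming scheme (illustrative only; the function name is hypothetical and the real implementation is Go code inside OCCM):

```python
# Sketch of the load balancer naming scheme described above (illustrative,
# not the actual OCCM implementation). The name depends only on the cluster
# name, the service namespace, and the service name.
def lb_name(cluster_name: str, namespace: str, service: str) -> str:
    return f"kube_service_{cluster_name}_{namespace}_{service}"

# Both clusters use the default cluster name "kubernetes", so the names
# collide and cluster B adopts (and rebuilds) cluster A's load balancer.
name_a = lb_name("kubernetes", "default", "kubernetes")
name_b = lb_name("kubernetes", "default", "kubernetes")
assert name_a == name_b == "kube_service_kubernetes_default_kubernetes"
```

Nothing in the name distinguishes the two clusters, so any two clusters in the same project with the default cluster name will fight over the same Octavia object.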

What you expected to happen:
Ideally, I would like the load balancer to be created with a random suffix in OpenStack (e.g. kube_service_kubernetes_default_kubernetes_vxwmx). Kubernetes already uses generated suffixes in other places, so it makes sense to use them here too. Further, if a name collision does occur and the suffix option is unavailable, OCCM should fire an error event instead of deleting and recreating a load balancer of the same name.
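The proposed suffix could be generated much like the suffixes Kubernetes appends via generateName. A hedged sketch (hypothetical helper, not part of OCCM; Kubernetes itself uses a restricted alphabet that drops vowels and look-alike characters, which is simplified here):

```python
import secrets
import string

# Lowercase alphanumerics; Kubernetes' own suffix alphabet is more
# restrictive, but this illustrates the idea.
SUFFIX_ALPHABET = string.ascii_lowercase + string.digits

def suffixed_lb_name(base: str, length: int = 5) -> str:
    """Append a short random suffix so two clusters in the same project
    cannot derive the same Octavia load balancer name."""
    suffix = "".join(secrets.choice(SUFFIX_ALPHABET) for _ in range(length))
    return f"{base}_{suffix}"

name = suffixed_lb_name("kube_service_kubernetes_default_kubernetes")
# e.g. "kube_service_kubernetes_default_kubernetes_vxwmx"
```

With a suffix in the name, OCCM would have to track the load balancer by its own ID or tag rather than by reconstructing the name, which is also what prevents it from adopting another cluster's load balancer.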

Also, if OCCM is going to delete an existing load balancer, there should be a more obvious log entry for it. Currently the load balancer just transitions from ACTIVE to PENDING_CREATE in the logs, which is ambiguous; the deletion deserves more detailed output.

How to reproduce it:

  • Create two Kubernetes clusters in the same OpenStack project. On each, change the default:kubernetes service from ClusterIP to LoadBalancer, leaving a five-minute gap after the first cluster is fully provisioned before switching the second.
  • Check the Octavia API: only one load balancer will exist, and it points at the second cluster.

Anything else we need to know?:

  • The load balancer seen by Kubernetes cluster A and the one seen by Kubernetes cluster B have two different Octavia UUIDs, so the load balancer was deleted and recreated, not simply modified.
  • Container image: docker.io/k8scloudprovider/openstack-cloud-controller-manager:latest

Environment:

  • openstack-cloud-controller-manager version: latest
  • OpenStack version: Zed
  • Others:
