20% of the openstack-cinder-csi-driver-controller-metrics is down #2126

Open
ProbstDJakob opened this issue Mar 26, 2025 · 5 comments
Labels
bug Bug

Comments

@ProbstDJakob

ProbstDJakob commented Mar 26, 2025

Describe the bug
The openstack-cinder-csi-driver-controller-metrics.openshift-cluster-csi-drivers.svc:9202/metrics endpoint, which is scraped by the openstack-cinder-csi-driver-controller-monitor service monitor, has been down since upgrading from 4.18.0-okd-scos.2 to 4.18.0-okd-scos.4. It seems that the csi-driver container of the openstack-cinder-csi-driver-controller deployment does not listen on port 8202, the port to which the kube-rbac-proxy-8202 container (which provides the metrics endpoint) forwards requests. The logs of the kube-rbac-proxy-8202 container contain the following entry:

I0326 15:20:44.249257 1 log.go:245] http: proxy error: dial tcp 127.0.0.1:8202: connect: connection refused
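
The "connection refused" in the log line above means nothing accepted the TCP connection on 127.0.0.1:8202. A minimal sketch for checking this directly is shown below. It assumes it is run from inside the pod's network namespace (127.0.0.1 is local to the pod) and that a Python interpreter is available there, which may not be the case in the shipped images. The helper name `is_listening` is made up for illustration.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError (the counterpart of
        # "connect: connection refused" in the proxy log) and timeouts.
        return False


if __name__ == "__main__":
    # 8202 is the upstream port taken from the kube-rbac-proxy-8202 log line.
    print(is_listening("127.0.0.1", 8202))
```

If this prints `False` inside the pod, the proxy has no upstream to forward to, which would be consistent with the TargetDown alert.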

Version
4.18.0-okd-scos.4 IPI on OpenStack

How reproducible
The issue is reproducible on our staging and production cluster.

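| Cluster    | Upgrade started   | Upgrade finished  | TargetDown alert firing since |
|------------|-------------------|-------------------|-------------------------------|
| staging    | 2025-03-24T09:41Z | 2025-03-24T11:09Z | 2025-03-24T11:02Z             |
| production | 2025-03-25T08:49Z | 2025-03-25T10:24Z | 2025-03-25T09:57Z             |
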
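
The timestamps above can be compared to see where in the upgrade window the alert began firing. The following sketch only does the arithmetic on the reported values; the function names and dictionary keys are made up for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%MZ"  # matches the timestamps as reported above

# (upgrade started, upgrade finished, TargetDown alert firing since)
CLUSTERS = {
    "staging": ("2025-03-24T09:41Z", "2025-03-24T11:09Z", "2025-03-24T11:02Z"),
    "production": ("2025-03-25T08:49Z", "2025-03-25T10:24Z", "2025-03-25T09:57Z"),
}


def minutes_between(earlier: str, later: str) -> int:
    """Whole minutes from `earlier` to `later`."""
    delta = datetime.strptime(later, FMT) - datetime.strptime(earlier, FMT)
    return int(delta.total_seconds() // 60)


def summarize(started: str, finished: str, alert: str) -> dict:
    """Position of the alert start relative to the upgrade window."""
    return {
        "minutes_after_start": minutes_between(started, alert),
        "minutes_before_finish": minutes_between(alert, finished),
    }


if __name__ == "__main__":
    for name, times in CLUSTERS.items():
        print(name, summarize(*times))
```

On both clusters the alert starts inside the upgrade window (7 minutes before the end on staging, 27 minutes before the end on production), which fits the report that the problem appeared with the upgrade.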
@GingerGeek
Member

Are you still experiencing this issue?

@ProbstDJakob
Author

ProbstDJakob commented Apr 19, 2025

Yes, the alert has been firing ever since, and the container still logs the error.
We have also not updated yet, because the update process does not work, as described in issue #2125.

@GingerGeek
Member

My understanding is that the signing keys were fixed in 4.18.0-okd-scos.8.

With regard to your reported issue, I have found a similar report here: https://issues.redhat.com/browse/OCPBUGS-54975

Does this seem to match? There is an open PR for it, openshift/csi-operator#379, but it has not been merged yet.

@ProbstDJakob
Author

Yes, it is the same issue. As for the upgrade issue, we will wait until there is an official upgrade path.

@GingerGeek added the kind/bug and bug labels and removed the kind/bug label on Apr 22, 2025
@github-project-automation (bot) moved this to To triage in Bug Triage on Apr 22, 2025
@GingerGeek moved this from To triage to In progress in Bug Triage on Apr 22, 2025
@jonasbartho

Can confirm the same behavior in 4.18.scos.8 and 4.18.scos.9

Projects
Status: In progress
Development

No branches or pull requests

3 participants