Skip to content

add ARO hosted_cluster_info metric and labels #6180

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sdminonne
Copy link
Contributor

@sdminonne sdminonne commented May 22, 2025

What this PR does / why we need it:
This PR adds a new metric hosted_cluster_info. It adds new labels:

  • Microsoft_subscriptionId
  • Microsoft_resourceGroupName
  • Microsoft_resourceType
  • Microsoft_resourceId
  • Location

Fixes #CNTRLPLANE-902

Checklist

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • [x ] This change includes unit tests.

Signed-off-by: Salvatore Dario Minonne <[email protected]>
@openshift-ci openshift-ci bot requested review from csrwng and hasueki May 22, 2025 08:53
@openshift-ci openshift-ci bot added the area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release label May 22, 2025
Copy link
Contributor

openshift-ci bot commented May 22, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: sdminonne
Once this PR has been reviewed and has the lgtm label, please assign sjenning for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cwbotbot
Copy link

cwbotbot commented May 22, 2025

Test Results

e2e-aws

e2e-aks

@sdminonne
Copy link
Contributor Author

/retest-required

Copy link
Contributor

openshift-ci bot commented May 22, 2025

@sdminonne: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/okd-scos-e2e-aws-ovn 86c70d3 link false /test okd-scos-e2e-aws-ovn

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@@ -92,6 +92,9 @@ const (

ClusterSizeOverrideMetricName = "hypershift_cluster_size_override_instances"
clusterSizeOverrideMetricHelp = "Number of HostedClusters with a cluster size override annotation"

HostedClusterInfoMetricName = "hosted_cluster_info"

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we might want 2 info metrics

  • one hosted_cluster_info metric for all generic labels (at least hosted cluster name, namespace, _id and maybe platform type).
  • one specific for Azure platform (hosted_cluster_azure_info) when platform type = "Azure".

One suggestion I made somewhere else was that we could put everything into "hosted_cluster_info" but it's recommended to have consistent labels per metric. And we might get similar requests in the future for AWS, GCP, ....

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @simonpasquier I'll discuss this with my team and let you know.

Comment on lines +194 to +198
append(hclusterLabels, "Microsoft_subscriptionId",
"Microsoft_resourceGroupName",
"Microsoft_resourceType",
"Microsoft_resourceId",
"Location"), nil)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(nit) the implicit convention is that label names are lower-case and words separated by underscores. If the metric becomes Azure specific, the "Microsoft_" prefix may be dropped.

Suggested change
append(hclusterLabels, "Microsoft_subscriptionId",
"Microsoft_resourceGroupName",
"Microsoft_resourceType",
"Microsoft_resourceId",
"Location"), nil)
append(hclusterLabels, "subscription_id",
"resource_ group_name",
"resource_type",
"resource_id",
"location"), nil)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note taken, if we'll split the metric we may want to get rid of the prefix

azInfo := hcluster.Spec.Platform.Azure
subID := azInfo.SubscriptionID
resGroup := azInfo.ResourceGroupName
resourceID := fmt.Sprintf("/subscriptions/%s/resourceGroups/%s/providers/Microsoft.RedHatOpenshift/hcpOpenShiftClusters/%s",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it make sense to have providers/Microsoft.RedHatOpenshift/hcpOpenShiftClusters? What about HyperShift deployments in Azure that aren't ARO?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I can't answer. I got this string from @geoberle and I don't have a real background. Need to collect more info. Ty!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants