@richabanker
Contributor

Which issue(s) this PR fixes:

@k8s-ci-robot k8s-ci-robot added the area/developer-guide Issues or PRs related to the developer guide label Dec 11, 2025
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: richabanker

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. sig/contributor-experience Categorizes an issue or PR as relevant to SIG Contributor Experience. labels Dec 11, 2025
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. sig/instrumentation Categorizes an issue or PR as relevant to SIG Instrumentation. labels Dec 11, 2025
@richabanker
Contributor Author

cc @dgrisonnet @dashpole

@k8s-triage-robot

Unknown CLA label state. Rechecking for CLA labels.

Send feedback to sig-contributor-experience at kubernetes/community.

/check-cla
/easycla

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 12, 2025
1. **Testing requirement**: The metric must have comprehensive tests that validate:
- The metric is registered and emitted correctly
- The metric has the expected labels and values under known conditions
- The metric is included in the [stable metrics list](https://github.com/kubernetes/kubernetes/blob/master/test/instrumentation/testdata/stable-metrics-list.yaml). See the [instrumentation test README](https://github.com/kubernetes/kubernetes/tree/master/test/instrumentation/README.md) for steps on how to generate this file correctly.
Contributor

Is this something we can enforce with a presubmit? E.g. a hack verify script?

Member

There is already a script for that in https://github.com/kubernetes/kubernetes/blob/master/hack/verify-generated-stable-metrics.sh, but I don't know whether we updated it to work for Beta.

Contributor Author

This script should only check the BETA and STABLE levels, since those are the ones checked by default (here) unless the --allstabilityclasses flag is passed, which we don't do when generating the stable metrics (ref).

So it looks like we are covered here for BETA and STABLE. The name "verify-generated-stable-metrics.sh" is a little misleading, though, since it doesn't mention beta. I'll see if I can improve that in a follow-up.
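
For illustration only, the default-versus-flag behavior described above can be sketched in Go. This is not the actual verification tooling: the `Metric` type, the `filterByStability` function, and the metric names are hypothetical, and the `allClasses` parameter merely stands in for the effect attributed to the --allstabilityclasses flag.

```go
package main

import (
	"fmt"
	"sort"
)

// Metric is a hypothetical, simplified record of a metric definition.
type Metric struct {
	Name           string
	StabilityLevel string // "ALPHA", "BETA", or "STABLE"
}

// filterByStability returns the sorted names of the metrics a verification
// pass would consider. By default only BETA and STABLE metrics are kept;
// allClasses mimics passing a flag that includes every stability class.
func filterByStability(metrics []Metric, allClasses bool) []string {
	var names []string
	for _, m := range metrics {
		if allClasses || m.StabilityLevel == "BETA" || m.StabilityLevel == "STABLE" {
			names = append(names, m.Name)
		}
	}
	sort.Strings(names)
	return names
}

func main() {
	// Illustrative metric names only; they are not taken from Kubernetes.
	metrics := []Metric{
		{Name: "example_requests_total", StabilityLevel: "STABLE"},
		{Name: "example_queue_depth", StabilityLevel: "BETA"},
		{Name: "example_debug_events_total", StabilityLevel: "ALPHA"},
	}
	fmt.Println("default:", filterByStability(metrics, false))
	fmt.Println("all classes:", filterByStability(metrics, true))
}
```

Under these assumptions, the ALPHA metric only shows up when all stability classes are requested, which matches the point that the generated list covers BETA and STABLE by default.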


2. **Stability validation**: The metric should have been at Beta stability for at least one release to ensure it has been sufficiently validated in production environments.

3. **API Review**: Graduating a metric to Stable requires an API review by SIG Instrumentation, as it represents a contractual API agreement. See the [API Review](/contributors/devel/sig-instrumentation/metric-stability.md#api-review) section in the metrics stability documentation.
Contributor

I think ideally we want to be part of beta graduation? Most of the time by the time something is going stable it is too late to make changes without breaking existing users.

Or maybe we should do API review when it is added and for beta graduation?

Member

I think it is fine to only do it during the Beta graduation process; otherwise we would need to look at every change to Alpha metrics, and that could take quite a toll.

Contributor Author

What Damien said makes sense. I'll change this to add the API review at the time of graduation to beta as well. We can keep reviewing for stable graduation too, and if that turns out to be unactionable on our end, we can remove ourselves from the stable graduation review process at that time?

Contributor

Marking a metric as stable is a commitment by the SIG to maintain stability guarantees. I think it's more important for the owning SIG leads to review it.

Contributor Author

OK, I added that for STABLE graduation, a review from the owning SIG is primarily required, along with an approval from SIG Instrumentation. Let me know if that sounds good, or if we think SIG Instrumentation doesn't need to be present in the review process at all for graduation to STABLE.

@dgrisonnet
Member

cc @rexagod

1. **Testing requirement**: The metric must have comprehensive tests that validate:
- The metric is registered and emitted correctly
- The metric has the expected labels and values under known conditions
- The metric is included in the [stable metrics list](https://github.com/kubernetes/kubernetes/blob/master/test/instrumentation/testdata/stable-metrics-list.yaml). See the [instrumentation test README](https://github.com/kubernetes/kubernetes/tree/master/test/instrumentation/README.md) for steps on how to generate this file correctly.
Member

This is the case even for BETA metrics, otherwise our approval will not be required to validate the graduation.

@richabanker richabanker force-pushed the metrics-graduation-guidelines branch from 13b5bb0 to e80ec21 Compare January 8, 2026 23:32
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 8, 2026
@richabanker richabanker force-pushed the metrics-graduation-guidelines branch from e80ec21 to 9de615d Compare January 8, 2026 23:32
@k8s-ci-robot k8s-ci-robot added size/S Denotes a PR that changes 10-29 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Jan 8, 2026
@richabanker
Contributor Author

Ping for another look

@dashpole @dgrisonnet @rexagod

@richabanker richabanker force-pushed the metrics-graduation-guidelines branch from d3fe14e to a33d767 Compare January 9, 2026 22:11
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 9, 2026
@richabanker richabanker force-pushed the metrics-graduation-guidelines branch from a33d767 to 0aa1588 Compare January 9, 2026 22:11

When graduating a metric from Alpha to Beta or from Beta to Stable, the following requirements must be met. For more information on stability levels and their guarantees, see [Metrics Stability](#metrics-stability).

#### Graduating to Beta
Member

Maybe a section on cardinality requirements as well, that states that the owning SIG must review and propose, to the best of their ability, the "least cardinal" version of the metric being introduced that also fulfills their use-case.

Contributor Author

I'm not sure we want to differentiate between ALPHA and BETA based on cardinality. Ideally that's a concern that should be thought through when a new metric is introduced (i.e. right from ALPHA). Plus, "least cardinal" is kind of subjective too, so I'm not sure how to frame this clearly to avoid confusion.

@richabanker richabanker force-pushed the metrics-graduation-guidelines branch 2 times, most recently from 7b754e3 to 091e2db Compare January 16, 2026 05:55
@dgrisonnet
Member

ping @dgrisonnet

@richabanker richabanker force-pushed the metrics-graduation-guidelines branch from 091e2db to 4d97b9e Compare January 30, 2026 22:28
@richabanker
Contributor Author

I think we should prioritize merging this, since there are a bunch of metric graduation PRs open and there will likely be more in the future.

(sorry for the continuous pings!)

cc @dashpole @rexagod @dgrisonnet


Wherever possible, ensure that metrics graduating to Beta follow the [Prometheus metric naming best practices](https://prometheus.io/docs/practices/naming/).
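
As a rough illustration of what such a naming check could look like, here is a small Go sketch. It encodes only a few of the linked recommendations (a `_total` suffix for accumulating counters, base units such as seconds). The `checkName` function, its rules, and the metric names are hypothetical; the snake_case pattern is an assumption of this sketch and is stricter than the characters Prometheus accepts.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// snakeCase matches lowercase snake_case names. This is an assumption of the
// sketch, not the full set of characters Prometheus allows in metric names.
var snakeCase = regexp.MustCompile(`^[a-z][a-z0-9_]*$`)

// checkName returns a list of naming problems for a metric name.
// isCounter says whether the metric is an accumulating counter.
func checkName(name string, isCounter bool) []string {
	var problems []string
	if !snakeCase.MatchString(name) {
		problems = append(problems, "name should be lowercase snake_case")
	}
	if isCounter && !strings.HasSuffix(name, "_total") {
		problems = append(problems, "counter names should end in _total")
	}
	if strings.HasSuffix(name, "_milliseconds") {
		problems = append(problems, "prefer base units such as seconds")
	}
	return problems
}

func main() {
	// Illustrative names only; none are taken from Kubernetes.
	fmt.Println(checkName("example_requests_total", true))
	fmt.Println(checkName("requestCount", true))
	fmt.Println(checkName("example_latency_milliseconds", false))
}
```

A check like this can only flag obvious deviations; whether a name is appropriate still needs a human reviewer.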

1. **Testing requirement**: The metric must have a corresponding test that validates:
Contributor Author

Btw I slightly revised the testing requirements here. Ideally we would like for tests to validate that the metric behaves as expected when the relevant code instrumented with that metric is executed (instead of just verifying that the metric is registered with the right name / stability level / labels etc.)

Does this sound reasonable?

cc @dashpole @dgrisonnet @rexagod
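
To make the distinction concrete, here is a minimal Go sketch of a behavior-oriented check: it runs a hypothetical instrumented code path and then asserts on the recorded values, rather than only checking that a metric exists. The `counterVec` type and the `handleRequest` function are stand-ins invented for this example; a real test would use the project's own metric types and test helpers.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// counterVec is a hypothetical stand-in for a counter with one label.
type counterVec struct {
	mu     sync.Mutex
	values map[string]float64
}

func newCounterVec() *counterVec {
	return &counterVec{values: map[string]float64{}}
}

// Inc increments the counter for a single label value.
func (c *counterVec) Inc(label string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.values[label]++
}

// Value returns the current count for a label value (0 if never incremented).
func (c *counterVec) Value(label string) float64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.values[label]
}

// handleRequest is the hypothetical instrumented code path: it records one
// sample per call, labeled by outcome.
func handleRequest(c *counterVec, fail bool) error {
	if fail {
		c.Inc("error")
		return errors.New("request failed")
	}
	c.Inc("success")
	return nil
}

func main() {
	c := newCounterVec()
	// Exercise the instrumented code under known conditions...
	_ = handleRequest(c, false)
	_ = handleRequest(c, false)
	_ = handleRequest(c, true)
	// ...then inspect the values the metric reports for each label.
	fmt.Println("success:", c.Value("success"), "error:", c.Value("error"))
}
```

The point of the sketch is the shape of the test: drive the code, then compare label values against what those known inputs should produce.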

@richabanker richabanker force-pushed the metrics-graduation-guidelines branch from 1147118 to f4e5045 Compare January 30, 2026 23:08

Labels

- approved: Indicates a PR has been approved by an approver from all required OWNERS files.
- area/developer-guide: Issues or PRs related to the developer guide
- cncf-cla: yes: Indicates the PR's author has signed the CNCF CLA.
- sig/contributor-experience: Categorizes an issue or PR as relevant to SIG Contributor Experience.
- sig/instrumentation: Categorizes an issue or PR as relevant to SIG Instrumentation.
- size/M: Denotes a PR that changes 30-99 lines, ignoring generated files.

7 participants