KEP-2837: Add instrumentation section and update GA criteria#6180
KEP-2837: Add instrumentation section and update GA criteria#6180ndixita wants to merge 1 commit into
Conversation
|
/assign @tallclair |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: ndixita The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| #### Admission & API Validation | ||
| These metrics track feature adoption, user intent, and validation friction within the control plane. | ||
|
|
||
| ##### `pod_level_resources_admission_total` |
There was a problem hiding this comment.
| ##### `pod_level_resources_admission_total` | |
| ##### `kubelet_pod_level_resources_admission_total` |
There was a problem hiding this comment.
+1. This should be the main adoption metric (ALPHA, removed in a future release). I recommend dropping the other adoption related metrics listed below.
Let's make a note here that this metrics is ALPHA, temporary, and removed in 2-3 release.
| These metrics track feature adoption, user intent, and validation friction within the control plane. | ||
|
|
||
| ##### `pod_level_resources_admission_total` | ||
| Total number of pods processed during Kubelet admission, categorized by resource configuration strategy. |
There was a problem hiding this comment.
How do you anticipate using this? I wonder whether kube-state-metrics could meet the use case?
There was a problem hiding this comment.
This is to monitor the adoption and if there's any friction in the new validation.
|
|
||
| This metric is recorded as a counter. | ||
|
|
||
| ##### `pod_level_resources_validation_errors_total` |
There was a problem hiding this comment.
Do we have any generic metrics tracking validation errors? I wonder whether we could count validation errors by field path (stripping out specific identifiers like index or key), or if the cardinality would be too high?
/cc @jpbetz @yongruilin
There was a problem hiding this comment.
Good question — we don't have a generic one today; the existing apiserver_validation_* metrics only track declarative-vs-handwritten parity.
I think the catch is cardinality: raw field paths are unbounded. We could strip subscripts (spec.containers[].resources.limits[]) or label by field.Error.Origin (the rule kind) instead.
But it feels more like declarative-validation-framework infra though, not really this KEP — maybe a separate issue under sig/api-machinery? Keeping pod_level_resources_validation_errors_total here for now.
There was a problem hiding this comment.
Cardinality is already high just for the individual fields across all versions of all APIs. If then we try to multiply that by any other labels it get's completely out of control (even resources are risky this way... consider the CRD cases). Better for feature owners to identify what is actually needed an add metrics for just those neesd.
|
|
||
| This metric is recorded as a counter. | ||
|
|
||
| ##### `pod_level_resources_defaulting_total` |
There was a problem hiding this comment.
AFAIK we don't have any metrics measuring defaulting or validation decisions. I don't know if there's a technical reason for this or if we just haven't had a use case for it, but I'm hesitant to introduce the metrics here.
There was a problem hiding this comment.
Yeah, I think we can skip this.
| Tracks operation failures during the container lifecycle that are specific to the shared pod-level resource pool. | ||
|
|
||
| ##### `kubelet_pod_level_oom_kills_total` | ||
| Total number of OOM kills triggered specifically because the shared pod-level memory pool was exhausted. This metric is crucial for identifying cases where a container was killed even if it was under its own individual limit, but the pod's aggregate limit was reached. |
There was a problem hiding this comment.
How will this be measured / detected?
There was a problem hiding this comment.
If you agree that this is worth tracking, we could measure by:
- fetching memory.events for all containers and pod cgroup using cadvisorStatsPRovider
- pod limit ooms = pod ooms - sum(container ooms)
IMO it is worth tracking as pod-level resources allow resource sharing among the containers. So this metric could help understand if pod-level limit is set correctly or not.
| #### Resource State (State Metrics) | ||
| These metrics expose the current state of Pod-level resource requests and limits, primarily for consumption by kube-state-metrics and observability dashboards. | ||
|
|
||
| ##### `kube_pod_level_resources_requests` |
There was a problem hiding this comment.
I think drop the level on these. Either kube_pod_resources_requests or kube_pod_spec_resources_requests would be more consistent with the other metrics.
| - `namespace` - Namespace of the pod. | ||
| - `uid` - Kubernetes UID of the pod. | ||
| - `resource` - The resource type (e.g., `cpu`, `memory`). | ||
| - `unit` - The unit of the resource (e.g., `core`, `bytes`). |
There was a problem hiding this comment.
The container level metrics also include a node label.
| - `pod` - Name of the pod. | ||
| - `namespace` - Namespace of the pod. | ||
| - `uid` - Kubernetes UID of the pod. |
There was a problem hiding this comment.
This labels strike me as having extremely high cardinality. Or is there a reason we don't need to worry about that here?
There was a problem hiding this comment.
+1. See my comment above about kube-state-metrics. I don't think we need these metrics at all.
| This section outlines the final list of metrics for the Pod-Level Resources feature, excluding Resource Manager extensions. These metrics are designed to provide deep observability into admission control, scheduling efficiency, and Kubelet-level execution. | ||
|
|
||
| #### Admission & API Validation | ||
| These metrics track feature adoption, user intent, and validation friction within the control plane. |
There was a problem hiding this comment.
@richabanker how are metrics for feature adoption handled? Do we keep them in alpha and then deprecate them after 2-3 release? This would be my preference. If so let's make notes here about that plan.
There was a problem hiding this comment.
K8s already exposes kubernetes_feature_enabled metric showing whether a feature is enabled / disabled in a component, can that be used to track "feature adoption" in a way if all we care to know about is whether this feature was on/off ?
There was a problem hiding this comment.
We want to track how many pods have pod-level resources set. Even after enabling the feature gate, it is required to explicitly set resources at pod-level in the spec to use this functionality. Does that make sense?
There was a problem hiding this comment.
Then we would never want to deprecate this metric even when the feature graduates to GA right?
There was a problem hiding this comment.
Can you track this with kube state metrics does have to be a metric? If we can use kube state metrics that's usually better for this sort of thing
There was a problem hiding this comment.
Oh taking a step back, actually adding a permanent metric just to parse a resource's spec to capture presence/absence of a field will be an anti-pattern. We would generally want metrics if they help with determining operational health (health / availability / latency) of features , basically something actionable for cluster admins.
And regarding KSM, agree that if we want to expose metrics about a resource's spec/status, KSM would be the way to go. But since pod.spec.resource is not natively tracked in existing KSM metrics today, you'd have to add a new metric in the KSM repo (similar to kube_pod_level_resources_requests being proposed in this PR) that can parse this new field to convert that to a metric. Currently it only exposes these metrics for pod resource https://github.com/kubernetes/kube-state-metrics/blob/main/docs/metrics/workload/pod-metrics.md
Another way to use KSM for this would be to add a label onto pod metadata with the value you want to track, then configure KSM with --metric-labels-allowlist=pods=[<label>] which will make the existing kube_pod_labels KSM metric to show how many pods have this label set. Also cc @dgrisonnet to confirm if thats the right way to go about feature adoption metrics depending on resource spec.
| This metric is recorded as a counter. | ||
|
|
||
| #### Resource State (State Metrics) | ||
| These metrics expose the current state of Pod-level resource requests and limits, primarily for consumption by kube-state-metrics and observability dashboards. |
There was a problem hiding this comment.
kube-state-metrics is informer based, not metric based.
Let's drop the metrics listed here for that purpose. If we need this information, we should open a PR against kube-state-metrics after this feature merges.
Specifically, I recommend dropping:
kube_pod_level_resource_requestskube_pod_level_resource_limits
| - `pod` - Name of the pod. | ||
| - `namespace` - Namespace of the pod. | ||
| - `uid` - Kubernetes UID of the pod. |
There was a problem hiding this comment.
+1. See my comment above about kube-state-metrics. I don't think we need these metrics at all.
| #### The Kubelet (Execution Phase) | ||
| Tracks operation failures during the container lifecycle that are specific to the shared pod-level resource pool. | ||
|
|
||
| ##### `kubelet_pod_level_oom_kills_total` |
There was a problem hiding this comment.
| ##### `kubelet_pod_level_oom_kills_total` | |
| ##### `kubelet_pod_oom_kills_total` |
? (I don't think we need the feature name here, the fact that it's a pod oom is enough)
|
|
||
| This metric is recorded as a counter. | ||
|
|
||
| #### The Kubelet (Execution Phase) |
There was a problem hiding this comment.
Recommend adding a "pod CPU throttling metric". Maybe pod_cpu_cfs_throttled_seconds_total ?
| #### Admission & API Validation | ||
| These metrics track feature adoption, user intent, and validation friction within the control plane. | ||
|
|
||
| ##### `pod_level_resources_admission_total` |
There was a problem hiding this comment.
+1. This should be the main adoption metric (ALPHA, removed in a future release). I recommend dropping the other adoption related metrics listed below.
Let's make a note here that this metrics is ALPHA, temporary, and removed in 2-3 release.
Signed-off-by: Dixita <ndixita@google.com>
Uh oh!
There was an error while loading. Please reload this page.