What happened?
When creating a NodeReadinessRule after a full install, the API server fails to call the webhook because no service port 443 is found for the service nrr-webhook-service.
After fixing the port misconfiguration in the ValidatingWebhookConfiguration, the API server still fails to call the webhook, this time because of a certificate mismatch.
Further investigation suggests that the webhook service shares the same endpoint (pod IP and port 8443) as the metrics service, so webhook calls are answered with the metrics certificate, while the webhook server itself listens on port 9443.
Steps to Reproduce
- Install the CRD using the instructions in the book (just change the version to v0.3.0):
VERSION=v0.3.0
kubectl apply -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/crds.yaml
kubectl wait --for condition=established --timeout=30s crd/nodereadinessrules.readiness.node.x-k8s.io
- Do the full install of the controller using the instructions in the book:
kubectl apply -f https://github.com/kubernetes-sigs/node-readiness-controller/releases/download/${VERSION}/install-full.yaml
- Save the example custom resource presented in the book to a file and try to apply it:
$ kubectl apply -f nrr-cr.yaml
Error from server (InternalError): error when creating "nrr-cr.yaml": Internal error occurred: failed calling webhook "vnodereadinessrule.kb.io": failed to call webhook: Post "https://nrr-webhook-service.nrr-system.svc:443/validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule?timeout=10s": no service port 443 found for service "nrr-webhook-service"
- Verify that the webhook service is using port 8443, not 443:
$ kubectl get svc -n nrr-system nrr-webhook-service
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nrr-webhook-service ClusterIP 10.96.225.215 <none> 8443/TCP 135m
- Edit nrr-validating-webhook-configuration to use port 8443 instead of 443 (as a quick fix).
- Try to apply the custom resource in the file again:
$ kubectl apply -f nrr-cr.yaml
Error from server (InternalError): error when creating "nrr-cr.yaml": Internal error occurred: failed calling webhook "vnodereadinessrule.kb.io": failed to call webhook: Post "https://nrr-webhook-service.nrr-system.svc:8443/validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule?timeout=10s": tls: failed to verify certificate: x509: certificate is valid for nrr-metrics-service.nrr-system.svc, nrr-metrics-service.nrr-system.svc.cluster.local, not nrr-webhook-service.nrr-system.svc
- Verify that both the metrics service and the webhook service are using the same endpoint:
$ kubectl get endpointslices.discovery.k8s.io -n nrr-system
NAME ADDRESSTYPE PORTS ENDPOINTS AGE
nrr-metrics-service-n2bk7 IPv4 8443 10.244.3.6 3h15m
nrr-webhook-service-gmk67 IPv4 8443 10.244.3.6 144m
- Verify that the webhook server in the controller pod listens on port 9443, not 8443, which is the port the webhook service should target:
$ kubectl describe pods -n nrr-system nrr-controller-manager-99964d6bc-kl298 | grep Port
Port: 9443/TCP (webhook-server)
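The endpoint comparison in the steps above can be automated with a small filter. This is a minimal sketch, not part of the project: shared_endpoints is a hypothetical helper name, and it assumes the default kubectl table layout (NAME, ADDRESSTYPE, PORTS, ENDPOINTS, AGE) with a single address per slice, as in the output shown in this report.

```shell
# shared_endpoints (hypothetical helper): read `kubectl get endpointslices`
# table output on stdin and print every address:port pair that backs more
# than one EndpointSlice, followed by the names of those slices.
# Assumes the default columns: NAME ADDRESSTYPE PORTS ENDPOINTS AGE.
shared_endpoints() {
  awk '
    NR == 1 { next }                  # skip the header row
    {
      key = $4 ":" $3                 # ENDPOINTS column, then PORTS column
      if (count[key]++)
        names[key] = names[key] "," $1
      else
        names[key] = $1
    }
    END {
      for (key in count)
        if (count[key] > 1)
          print key " " names[key]
    }
  '
}

# Usage against a live cluster (not executed here):
#   kubectl get endpointslices.discovery.k8s.io -n nrr-system | shared_endpoints
```

With the EndpointSlice output from this report, the filter would flag 10.244.3.6:8443 as backing both the metrics slice and the webhook slice. Slices that list several addresses in the ENDPOINTS column are compared as a whole string, so this sketch is only reliable for single-pod deployments like the one here.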
Expected Behavior
The controller should validate and create the resource without any issues.
Controller Version / Image Tag
v0.3.0
Kubernetes Version
Client Version: v1.35.0
Kustomize Version: v5.7.1
Server Version: v1.35.0
Controller Logs
2026-06-11T23:39:33Z INFO version: unknown
2026-06-11T23:39:33Z INFO controller-runtime.builder Registering a validating webhook {"GVK": "readiness.node.x-k8s.io/v1alpha1, Kind=NodeReadinessRule", "path": "/validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule"}
2026-06-11T23:39:33Z INFO controller-runtime.webhook Registering webhook {"path": "/validate-readiness-node-x-k8s-io-v1alpha1-nodereadinessrule"}
2026-06-11T23:39:33Z INFO setup webhook enabled
2026-06-11T23:39:33Z INFO setup starting manager
2026-06-11T23:39:33Z INFO controller-runtime.metrics Starting metrics server
2026-06-11T23:39:33Z INFO starting server {"name": "health probe", "addr": "[::]:8081"}
2026-06-11T23:39:33Z INFO controller-runtime.webhook Starting webhook server
2026-06-11T23:39:33Z INFO controller-runtime.certwatcher Updated current TLS certificate {"cert": "/tmp/k8s-metrics-server/metrics-certs/tls.crt", "key": "/tmp/k8s-metrics-server/metrics-certs/tls.key"}
2026-06-11T23:39:33Z INFO controller-runtime.metrics Serving metrics server {"bindAddress": ":8443", "secure": true}
2026-06-11T23:39:33Z INFO controller-runtime.certwatcher Updated current TLS certificate {"cert": "/tmp/k8s-webhook-server/serving-certs/tls.crt", "key": "/tmp/k8s-webhook-server/serving-certs/tls.key"}
2026-06-11T23:39:33Z INFO controller-runtime.webhook Serving webhook server {"host": "", "port": 9443}
2026-06-11T23:39:33Z INFO controller-runtime.certwatcher Starting certificate poll+watcher {"cert": "/tmp/k8s-metrics-server/metrics-certs/tls.crt", "key": "/tmp/k8s-metrics-server/metrics-certs/tls.key", "interval": "10s"}
2026-06-11T23:39:33Z INFO controller-runtime.certwatcher Starting certificate poll+watcher {"cert": "/tmp/k8s-webhook-server/serving-certs/tls.crt", "key": "/tmp/k8s-webhook-server/serving-certs/tls.key", "interval": "10s"}
I0611 23:39:33.857687 1 leaderelection.go:257] attempting to acquire leader lease nrr-system/ba65f13e.readiness.node.x-k8s.io...
I0611 23:39:50.640748 1 leaderelection.go:271] successfully acquired lease nrr-system/ba65f13e.readiness.node.x-k8s.io
2026-06-11T23:39:50Z DEBUG events nrr-controller-manager-99964d6bc-kl298_b0f177bd-876f-459e-8329-2a1b927def97 became leader {"type": "Normal", "object": {"kind":"Lease","namespace":"nrr-system","name":"ba65f13e.readiness.node.x-k8s.io","uid":"123ad03a-e750-4834-9c59-70cea85b5513","apiVersion":"coordination.k8s.io/v1","resourceVersion":"8706"}, "reason": "LeaderElection"}
2026-06-11T23:39:50Z INFO Starting EventSource {"controller": "nodereadiness-controller", "controllerGroup": "readiness.node.x-k8s.io", "controllerKind": "NodeReadinessRule", "source": "kind source: *v1alpha1.NodeReadinessRule"}
2026-06-11T23:39:50Z INFO Starting EventSource {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "source": "kind source: *v1.Node"}
2026-06-11T23:39:50Z INFO Starting Controller {"controller": "nodereadiness-controller", "controllerGroup": "readiness.node.x-k8s.io", "controllerKind": "NodeReadinessRule"}
2026-06-11T23:39:50Z INFO Starting workers {"controller": "nodereadiness-controller", "controllerGroup": "readiness.node.x-k8s.io", "controllerKind": "NodeReadinessRule", "worker count": 1}
2026-06-11T23:39:50Z INFO Starting Controller {"controller": "node", "controllerGroup": "", "controllerKind": "Node"}
2026-06-11T23:39:50Z INFO Starting workers {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "worker count": 1}
2026-06-11T23:39:50Z INFO Reconciling node {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-control-plane"}, "namespace": "", "name": "multinode-control-plane", "reconcileID": "1a388efd-bf7d-4533-a207-2ca98f7e4259", "node": "multinode-control-plane"}
2026-06-11T23:39:50Z INFO Processing node against rules {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-control-plane"}, "namespace": "", "name": "multinode-control-plane", "reconcileID": "1a388efd-bf7d-4533-a207-2ca98f7e4259", "node": "multinode-control-plane", "ruleCount": 0}
2026-06-11T23:39:50Z INFO Reconciling node {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-worker"}, "namespace": "", "name": "multinode-worker", "reconcileID": "276249dc-70fc-48a9-9279-24f5e4126cf5", "node": "multinode-worker"}
2026-06-11T23:39:50Z INFO Processing node against rules {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-worker"}, "namespace": "", "name": "multinode-worker", "reconcileID": "276249dc-70fc-48a9-9279-24f5e4126cf5", "node": "multinode-worker", "ruleCount": 0}
2026-06-11T23:39:50Z INFO Reconciling node {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-worker2"}, "namespace": "", "name": "multinode-worker2", "reconcileID": "7b320adf-8420-405a-bd51-73535d99baed", "node": "multinode-worker2"}
2026-06-11T23:39:50Z INFO Processing node against rules {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-worker2"}, "namespace": "", "name": "multinode-worker2", "reconcileID": "7b320adf-8420-405a-bd51-73535d99baed", "node": "multinode-worker2", "ruleCount": 0}
2026-06-11T23:39:50Z INFO Reconciling node {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-worker3"}, "namespace": "", "name": "multinode-worker3", "reconcileID": "56996778-5e98-40ee-9b0a-029f7175f99e", "node": "multinode-worker3"}
2026-06-11T23:39:50Z INFO Processing node against rules {"controller": "node", "controllerGroup": "", "controllerKind": "Node", "Node": {"name":"multinode-worker3"}, "namespace": "", "name": "multinode-worker3", "reconcileID": "56996778-5e98-40ee-9b0a-029f7175f99e", "node": "multinode-worker3", "ruleCount": 0}
2026/06/12 01:34:31 http: TLS handshake error from 172.18.0.3:44250: remote error: tls: bad certificate
2026/06/12 01:36:44 http: TLS handshake error from 172.18.0.3:57437: remote error: tls: bad certificate
2026/06/12 02:02:10 http: TLS handshake error from 172.18.0.3:38203: remote error: tls: bad certificate
Additional Environment Details
cert-manager-controller image: "quay.io/jetstack/cert-manager-controller:v1.20.2"