Understanding the Kubernetes Scale subresource
How it interacts with HPA and PDB & its importance for Custom Resources
Understanding the scale subresource in Kubernetes
Resources like Deployments and StatefulSets in Kubernetes have a scale subresource, which captures three things:
- spec.replicas: The desired number of replicas.
- status.replicas: The actual, current number of replicas.
- status.selector: A label selector that identifies the pods managed by the resource.
Here’s an example of what a typical query response looks like when you explore the scale subresource:
# Sample output from querying a Kubernetes deployment's scale settings.
➜ curl -s localhost:8001/apis/apps/v1/namespaces/kube-system/deployments/coredns/scale | jq .
{
  "kind": "Scale",
  "apiVersion": "autoscaling/v1",
  "metadata": {
    "name": "coredns",
    "namespace": "kube-system",
    "uid": "0f39b1dd-8cb4-4374-a95d-11d96c0b9d6a",
    "resourceVersion": "3260769",
    "creationTimestamp": "2023-08-10T15:55:24Z"
  },
  "spec": {
    "replicas": 2
  },
  "status": {
    "replicas": 2,
    "selector": "k8s-app=kube-dns"
  }
}
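To make the shape of that response concrete, here is a minimal Go sketch that decodes it. The `Scale` struct and the `parseScale` helper are local stand-ins I made up for illustration; they mirror only the fields shown above and are not the types from the Kubernetes client libraries.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Scale mirrors only the fields visible in the response above.
// It is a local stand-in, not the type from k8s.io/api.
type Scale struct {
	Kind     string `json:"kind"`
	Metadata struct {
		Name      string `json:"name"`
		Namespace string `json:"namespace"`
	} `json:"metadata"`
	Spec struct {
		Replicas int32 `json:"replicas"` // desired replica count
	} `json:"spec"`
	Status struct {
		Replicas int32  `json:"replicas"` // current replica count
		Selector string `json:"selector"` // label selector for the managed pods
	} `json:"status"`
}

// parseScale decodes a JSON-encoded Scale response body.
func parseScale(body []byte) (Scale, error) {
	var s Scale
	err := json.Unmarshal(body, &s)
	return s, err
}

func main() {
	// A trimmed copy of the response shown above.
	body := []byte(`{
	  "kind": "Scale",
	  "metadata": {"name": "coredns", "namespace": "kube-system"},
	  "spec": {"replicas": 2},
	  "status": {"replicas": 2, "selector": "k8s-app=kube-dns"}
	}`)

	s, err := parseScale(body)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	fmt.Printf("%s/%s: desired=%d current=%d selector=%q\n",
		s.Metadata.Namespace, s.Metadata.Name,
		s.Spec.Replicas, s.Status.Replicas, s.Status.Selector)
}
```

The same three fields come back regardless of which workload resource you query, which is the point of the subresource.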
Why is the scale subresource necessary?
When I first learned about the scale subresource, I wondered why it’s needed when the replica information is already available in the spec. As with many things in Kubernetes, it’s needed to support extensibility and flexibility.
Kubernetes doesn’t enforce a uniform schema for representing desired and current replica counts in workload resources. The scale subresource provides a unified interface, enabling different workload resources, both built-in and custom, to consistently expose their scaling settings. This uniform interface allows for seamless integration with the rest of the Kubernetes ecosystem.
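As a loose analogy in Go, the scale subresource plays the role of a small interface that differently shaped types can satisfy. Everything in this sketch is hypothetical: the `Scalable` interface, the two workload types, and their field names are made up for illustration and do not exist in Kubernetes.

```go
package main

import "fmt"

// Scalable is a hypothetical interface standing in for the role of the
// scale subresource: one way to read and write the desired replica count.
type Scalable interface {
	DesiredReplicas() int32
	SetDesiredReplicas(n int32)
}

// webApp (hypothetical) stores its desired count in a field named Replicas.
type webApp struct{ Replicas int32 }

func (w *webApp) DesiredReplicas() int32     { return w.Replicas }
func (w *webApp) SetDesiredReplicas(n int32) { w.Replicas = n }

// cacheCluster (hypothetical) stores the same information under a
// different name, Size.
type cacheCluster struct{ Size int32 }

func (c *cacheCluster) DesiredReplicas() int32     { return c.Size }
func (c *cacheCluster) SetDesiredReplicas(n int32) { c.Size = n }

// scaleTo works on any Scalable without knowing its schema and
// returns the previous desired count.
func scaleTo(s Scalable, n int32) int32 {
	previous := s.DesiredReplicas()
	s.SetDesiredReplicas(n)
	return previous
}

func main() {
	workloads := []Scalable{&webApp{Replicas: 2}, &cacheCluster{Size: 5}}
	for _, w := range workloads {
		old := scaleTo(w, 3)
		fmt.Printf("%T: %d -> %d\n", w, old, w.DesiredReplicas())
	}
}
```

A caller such as `scaleTo` never needs to know where each type keeps its count, just as an autoscaler does not need to know the schema of each workload resource.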
This is better understood by looking at how the scale subresource is used.
Uses of the scale subresource
The scale subresource is used for the following cases:
- By the Horizontal Pod Autoscaler (HPA): HPA uses the scale subresource to dynamically adjust the desired replica count based on utilization metrics like CPU usage, QPS, etc.
- By the Pod Disruption Budget (PDB) controller: When .spec.maxUnavailable or .spec.minAvailable in a PDB configuration is specified as a percentage, the PDB controller queries the scale subresource of the resource managing the pods to get the desired number of replicas.
- Scaling pods through the kubectl scale command: This command allows manually changing the replica count of resources that have the scale subresource enabled.
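All three consumers read or write the same endpoint. As a rough sketch of a write, the Go snippet below builds a JSON merge patch request against the scale endpoint from the earlier curl example. It assumes `kubectl proxy` is listening on localhost:8001; `scalePath`, `scalePatchBody`, and `newScalePatch` are hypothetical helper names, and this is not a description of how kubectl scale is implemented internally.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// scalePath builds the URL path of the scale subresource for a namespaced
// resource in a named API group, e.g. apps/v1 deployments.
func scalePath(group, version, namespace, resource, name string) string {
	return fmt.Sprintf("/apis/%s/%s/namespaces/%s/%s/%s/scale",
		group, version, namespace, resource, name)
}

// scalePatchBody returns a JSON merge patch that sets spec.replicas.
func scalePatchBody(replicas int32) string {
	return fmt.Sprintf(`{"spec":{"replicas":%d}}`, replicas)
}

// newScalePatch builds (but does not send) a PATCH request for the
// scale subresource at baseURL+path.
func newScalePatch(baseURL, path string, replicas int32) (*http.Request, error) {
	body := bytes.NewBufferString(scalePatchBody(replicas))
	req, err := http.NewRequest(http.MethodPatch, baseURL+path, body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/merge-patch+json")
	return req, nil
}

func main() {
	path := scalePath("apps", "v1", "kube-system", "deployments", "coredns")
	// Assumption: `kubectl proxy` is serving the API on localhost:8001.
	req, err := newScalePatch("http://localhost:8001", path, 3)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(req.Method, req.URL.String())
	// To apply the change against a real cluster, send the request:
	//   resp, err := http.DefaultClient.Do(req)
}
```

The request is only printed here, so the sketch runs without a cluster.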
Scale subresource and Custom Resources
If you’re working with custom resources that manage pods and want them to integrate seamlessly with HPA, PDB, or be compatible with the kubectl scale command, you need to enable the scale subresource in the Custom Resource Definition (CRD).
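In the CRD, enabling it comes down to a `subresources.scale` stanza that tells the API server where the replica counts and the selector live in your resource. The Go sketch below prints such a stanza as JSON. The struct types are local stand-ins that copy the field names `specReplicasPath`, `statusReplicasPath`, and `labelSelectorPath` from the CRD API; the three JSON paths are assumptions about a hypothetical custom resource, so adjust them to your own schema.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// scaleSubresource mirrors the fields of `subresources.scale` in a CRD
// version entry. Local stand-in, not the apiextensions type.
type scaleSubresource struct {
	SpecReplicasPath   string `json:"specReplicasPath"`
	StatusReplicasPath string `json:"statusReplicasPath"`
	LabelSelectorPath  string `json:"labelSelectorPath,omitempty"`
}

// subresources mirrors the part of the `subresources` block used here.
type subresources struct {
	Scale *scaleSubresource `json:"scale,omitempty"`
}

// exampleSubresources returns a stanza for a hypothetical custom resource
// that keeps its counts in .spec.replicas and .status.replicas.
func exampleSubresources() subresources {
	return subresources{
		Scale: &scaleSubresource{
			SpecReplicasPath:   ".spec.replicas",   // assumed field in the custom resource
			StatusReplicasPath: ".status.replicas", // assumed field in the custom resource
			LabelSelectorPath:  ".status.selector", // assumed field in the custom resource
		},
	}
}

func main() {
	doc := map[string]subresources{"subresources": exampleSubresources()}
	out, err := json.MarshalIndent(doc, "", "  ")
	if err != nil {
		fmt.Println("marshal error:", err)
		return
	}
	fmt.Println(string(out))
}
```

The selector path matters for consumers such as HPA and PDB that need to find the pods, so it is worth setting even though the field is optional.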
If you are using kubebuilder to write controllers, you can enable the scale subresource for your custom resource as described in the kubebuilder documentation. Also, it seems that the scale subresource can be enabled
PDB, Custom Resources, and the scale subresource
For pods managed by a custom resource, a PDB can be used without restrictions only if the custom resource supports the scale subresource. When a percentage value is specified for .spec.maxUnavailable or .spec.minAvailable, the PDB controller needs the total desired replica count to calculate how many replicas must remain available during a disruption. It gets that count from the scale subresource of the resource owning the pods.
// if maxUnavailable is set as a percentage (the resulting pod count is rounded up)
desiredAvailableReplicas := desiredTotalReplicas - ceil(desiredTotalReplicas * maxUnavailable / 100)
// if minAvailable is set as a percentage (the resulting pod count is rounded up)
desiredAvailableReplicas := ceil(desiredTotalReplicas * minAvailable / 100)
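Here is a runnable version of that arithmetic, as a sketch rather than the controller's actual code. It assumes percentages are rounded up to a whole number of pods, as the Kubernetes PDB documentation describes; the function names are my own.

```go
package main

import "fmt"

// ceilPercent returns ceil(total * percent / 100) using integer arithmetic.
func ceilPercent(total, percent int32) int32 {
	return (total*percent + 99) / 100
}

// desiredHealthyFromMaxUnavailable returns how many pods must stay available
// when maxUnavailable is given as a percentage of the desired replicas.
func desiredHealthyFromMaxUnavailable(total, maxUnavailablePercent int32) int32 {
	healthy := total - ceilPercent(total, maxUnavailablePercent)
	if healthy < 0 {
		healthy = 0 // never require a negative number of pods
	}
	return healthy
}

// desiredHealthyFromMinAvailable returns how many pods must stay available
// when minAvailable is given as a percentage of the desired replicas.
func desiredHealthyFromMinAvailable(total, minAvailablePercent int32) int32 {
	return ceilPercent(total, minAvailablePercent)
}

func main() {
	// 7 desired replicas, 50%: ceil(3.5) = 4 pods.
	fmt.Println(desiredHealthyFromMaxUnavailable(7, 50)) // prints 3 (7 - 4)
	fmt.Println(desiredHealthyFromMinAvailable(7, 50))   // prints 4
}
```

Both functions take the total desired replicas as input, which is exactly the number the PDB controller has to fetch from the scale subresource.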
Call paths for reference:
getExpectedPodCount → getExpectedScale → getScaleController
If the scale subresource isn’t enabled for your custom resource, you can still use PDB, but with limitations: you can only use .spec.minAvailable, and only with an integer value, not a percentage, as mentioned in the Kubernetes documentation.