Understanding the Kubernetes Scale subresource
How it interacts with HPA and PDB & its importance for Custom Resources
Understanding the scale subresource in Kubernetes
Resources like Deployments and StatefulSets in Kubernetes have a scale subresource, which captures three things:
- spec.replicas: The desired number of replicas.
- status.replicas: The actual, current number of replicas.
- status.selector: The label selector that identifies the pods managed by the resource.
Here’s an example of what a typical query response looks like when you explore the scale subresource:
# Sample output from querying a Kubernetes deployment's scale settings.
➜ curl -s localhost:8001/apis/apps/v1/namespaces/kube-system/deployments/coredns/scale | jq .
{
  "kind": "Scale",
  "apiVersion": "autoscaling/v1",
  "metadata": {
    "name": "coredns",
    "namespace": "kube-system",
    "uid": "0f39b1dd-8cb4-4374-a95d-11d96c0b9d6a",
    "resourceVersion": "3260769",
    "creationTimestamp": "2023-08-10T15:55:24Z"
  },
  "spec": {
    "replicas": 2
  },
  "status": {
    "replicas": 2,
    "selector": "k8s-app=kube-dns"
  }
}
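To make the three fields concrete, here is a minimal Go sketch that decodes a response like the one above using only the standard library. The Scale struct and the parseScale helper are hand-written stand-ins for illustration; they are not the types from the Kubernetes client libraries, and they cover only the fields shown in the sample.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Scale mirrors only the fields shown in the sample response. It is a
// hypothetical stand-in, not the type from k8s.io/api/autoscaling/v1.
type Scale struct {
	Kind       string `json:"kind"`
	APIVersion string `json:"apiVersion"`
	Spec       struct {
		Replicas int32 `json:"replicas"` // desired replica count
	} `json:"spec"`
	Status struct {
		Replicas int32  `json:"replicas"` // current replica count
		Selector string `json:"selector"` // label selector in string form
	} `json:"status"`
}

// parseScale decodes a Scale response body such as the one returned by curl.
func parseScale(body []byte) (Scale, error) {
	var s Scale
	err := json.Unmarshal(body, &s)
	return s, err
}

func main() {
	body := []byte(`{"kind":"Scale","apiVersion":"autoscaling/v1",
		"spec":{"replicas":2},
		"status":{"replicas":2,"selector":"k8s-app=kube-dns"}}`)
	s, err := parseScale(body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("desired=%d current=%d selector=%q\n",
		s.Spec.Replicas, s.Status.Replicas, s.Status.Selector)
}
```

The same struct would decode the scale response of any resource that exposes the subresource, which is the point of having a shared shape.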
Why’s the scale subresource necessary?
When I first learned about the scale subresource, I wondered why it’s needed when the replica information is already available in the spec. As with many things in Kubernetes, it’s needed to support extensibility and flexibility.
Kubernetes doesn’t enforce a uniform schema for representing desired and current replica counts in workload resources. The scale subresource provides a unified interface, enabling different workload resources, both built-in and custom, to consistently expose their scaling settings. This uniform interface allows for seamless integration with the rest of the Kubernetes ecosystem.
This is better understood by looking at how the scale subresource is used.
Uses of the scale subresource
The scale subresource is used for the following cases:
- By the Horizontal Pod Autoscaler (HPA): HPA uses the scale subresource to dynamically adjust the desired replica count based on metrics such as CPU usage or QPS.
- By the Pod Disruption Budget (PDB) controller: When .spec.maxUnavailable or .spec.minAvailable in a PDB is specified as a percentage, the PDB controller queries the scale subresource of the resource managing the pods to get the desired number of replicas.
- Scaling pods through the kubectl scale command: This command allows manually changing the replica count of resources that have the scale subresource enabled.
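All three consumers talk to the same endpoint shape. As a rough sketch of what a manual scale-up looks like at the REST level, the Go snippet below builds (but does not send) a JSON merge patch against the scale subresource. The helpers scalePath and newScaleRequest are hypothetical names, the base URL assumes a local kubectl proxy as in the curl example, and this is not a description of how kubectl scale is implemented internally.

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// scalePath builds the scale endpoint for a namespaced resource in a named
// API group, for example group "apps", version "v1", resource "deployments".
// Hypothetical helper for illustration.
func scalePath(group, version, namespace, resource, name string) string {
	return fmt.Sprintf("/apis/%s/%s/namespaces/%s/%s/%s/scale",
		group, version, namespace, resource, name)
}

// newScaleRequest prepares, without sending, a JSON merge patch that sets
// spec.replicas on the scale subresource.
func newScaleRequest(baseURL, path string, replicas int) (*http.Request, error) {
	body := fmt.Sprintf(`{"spec":{"replicas":%d}}`, replicas)
	req, err := http.NewRequest(http.MethodPatch, baseURL+path, bytes.NewBufferString(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/merge-patch+json")
	return req, nil
}

func main() {
	// Assumes `kubectl proxy` is listening on localhost:8001.
	p := scalePath("apps", "v1", "kube-system", "deployments", "coredns")
	req, err := newScaleRequest("http://localhost:8001", p, 3)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```

Because the request only touches spec.replicas on the scale endpoint, the caller does not need to know where the underlying resource stores its replica count.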
Scale subresource and Custom Resources
If you’re working with custom resources that manage pods and want them to integrate seamlessly with HPA and PDB, or to work with the kubectl scale command, you need to enable the scale subresource in the Custom Resource Definition (CRD).
If you are using kubebuilder to write controllers, you can enable the scale subresource for your custom resource as mentioned here. Also, it seems that the scale subresource can be enabled
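As a minimal sketch of the kubebuilder route, the scale subresource is typically declared with a marker comment on the root type, which the tooling turns into the CRD's scale subresource settings. MyWorkload and its fields are hypothetical names for illustration; a real kubebuilder type would also embed metav1.TypeMeta and metav1.ObjectMeta, which are left out here so the snippet compiles on its own.

```go
package main

import "fmt"

// MyWorkloadSpec is a hypothetical spec for illustration.
type MyWorkloadSpec struct {
	// Replicas is the desired number of pods (read via specpath).
	Replicas *int32 `json:"replicas,omitempty"`
}

// MyWorkloadStatus is a hypothetical status for illustration.
type MyWorkloadStatus struct {
	// Replicas is the current number of pods (read via statuspath).
	Replicas int32 `json:"replicas"`
	// Selector is the label selector in string form (read via selectorpath).
	Selector string `json:"selector,omitempty"`
}

// MyWorkload is a hypothetical custom resource. The markers below ask the
// generator to emit the status and scale subresources in the CRD.
//
// +kubebuilder:object:root=true
// +kubebuilder:subresource:status
// +kubebuilder:subresource:scale:specpath=.spec.replicas,statuspath=.status.replicas,selectorpath=.status.selector
type MyWorkload struct {
	Spec   MyWorkloadSpec   `json:"spec,omitempty"`
	Status MyWorkloadStatus `json:"status,omitempty"`
}

func main() {
	n := int32(3)
	w := MyWorkload{
		Spec:   MyWorkloadSpec{Replicas: &n},
		Status: MyWorkloadStatus{Replicas: 3, Selector: "app=my-workload"},
	}
	fmt.Println(*w.Spec.Replicas, w.Status.Replicas, w.Status.Selector)
}
```

The markers only declare where the values live. The controller is still responsible for keeping status.replicas and status.selector up to date, since consumers of the scale subresource read them from there.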
PDB, Custom Resources, and the scale subresource
For pods managed by a custom resource, PDB can be used without restrictions only if the custom resource supports the scale subresource. This is because when a percentage value is specified for .spec.maxUnavailable or .spec.minAvailable, the PDB controller needs to know the total desired replicas in order to calculate the number of replicas that should be available during a disruption. The PDB controller gets the total desired replicas via the scale subresource of the resource owning the pods.
// if maxUnavailable is set as a percentage
desiredAvailableReplicas := desiredTotalReplicas - (desiredTotalReplicas * maxUnavailable / 100)
// if minAvailable is set as a percentage
desiredAvailableReplicas := desiredTotalReplicas * minAvailable / 100
Call paths for reference:
getExpectedPodCount → getExpectedScale → getScaleController
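The calculation above can be sketched as runnable Go. One detail the pseudocode glosses over is rounding: the Kubernetes PDB documentation states that percentage values are rounded up to the nearest integer, so this sketch rounds up in both cases. The function names are hypothetical, and this is a simplified illustration rather than the controller's actual code.

```go
package main

import "fmt"

// scaledUp returns ceil(total * percent / 100) using integer arithmetic.
func scaledUp(total, percent int) int {
	return (total*percent + 99) / 100
}

// desiredHealthyFromMaxUnavailable returns how many pods should stay
// available when maxUnavailable is given as a percentage of total.
func desiredHealthyFromMaxUnavailable(total, maxUnavailablePercent int) int {
	healthy := total - scaledUp(total, maxUnavailablePercent)
	if healthy < 0 {
		healthy = 0 // never report a negative requirement
	}
	return healthy
}

// desiredHealthyFromMinAvailable returns how many pods should stay
// available when minAvailable is given as a percentage of total.
func desiredHealthyFromMinAvailable(total, minAvailablePercent int) int {
	return scaledUp(total, minAvailablePercent)
}

func main() {
	// total comes from spec.replicas of the owner's scale subresource.
	fmt.Println(desiredHealthyFromMaxUnavailable(7, 30)) // 7 - ceil(2.1) = 4
	fmt.Println(desiredHealthyFromMinAvailable(7, 50))   // ceil(3.5) = 4
}
```

In both branches the only input that cannot be derived from the PDB itself is the total desired replica count, which is why the controller needs the scale subresource.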
If the scale subresource isn’t enabled for your custom resource, you can still use PDB, with one limitation: you can only use .spec.minAvailable with an integer value, not a percentage, as mentioned in the Kubernetes documentation.