How To AutoScale Pods in Kubernetes Using HPA (Horizontal Pod AutoScaler) – Minikube Demo

To handle heavy or fluctuating traffic, workloads need to scale up and back down automatically. Kubernetes supports this with HPA and VPA.

VPA (Vertical Pod AutoScaler)
The Vertical Pod Autoscaler (VPA) allocates more (or less) CPU or memory to existing pods. It generally works as follows:

  • VPA continuously checks the metrics you configured during setup, at a default interval of 10 seconds
  • When the threshold is met, VPA attempts to change the allocated memory and/or CPU
  • VPA mainly updates the resource requests (CPU/memory) of the pods that belong to the targeted Deployment or ReplicationController
  • The newly allocated memory/CPU takes effect when the pods are restarted, so it applies to the recreated pods

HPA (Horizontal Pod AutoScaler)
HPA scales the number of Pod replicas up or down. It works as follows:

  • HPA continuously checks the metrics you configured during setup, by default every 15 seconds (30 seconds in older Kubernetes releases)
  • It increases the number of pods when the specified threshold is exceeded
  • HPA mainly updates the number of replicas in the Deployment or ReplicationController
  • The Deployment/ReplicationController then rolls out any additional pods needed
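The replica count HPA aims for can be sketched with the formula given in the Kubernetes documentation, desiredReplicas = ceil(currentReplicas * currentMetricValue / desiredMetricValue). The function below is only an illustration: the name `desired_replicas` and its argument order are made up for this post, utilization values are assumed to be whole percentages, and the real controller also applies a tolerance band and stabilization rules that are left out here.

```shell
#!/bin/sh
# desired_replicas CURRENT_REPLICAS CURRENT_UTIL TARGET_UTIL [MIN] [MAX]
# Hypothetical helper: computes ceil(current_replicas * current_util / target_util)
# with integer arithmetic, then clamps the result to [MIN, MAX].
desired_replicas() {
  current_replicas=$1
  current_util=$2
  target_util=$3
  min_replicas=${4:-1}
  max_replicas=${5:-10}

  # Adding (target_util - 1) before dividing rounds the quotient up.
  desired=$(( (current_replicas * current_util + target_util - 1) / target_util ))

  if [ "$desired" -lt "$min_replicas" ]; then
    desired=$min_replicas
  fi
  if [ "$desired" -gt "$max_replicas" ]; then
    desired=$max_replicas
  fi
  echo "$desired"
}

desired_replicas 1 50 10   # 1 pod at 50% against a 10% target: prints 5
desired_replicas 3 60 10   # would be 18, clamped to the maximum: prints 10
desired_replicas 5 0 10    # no load, clamped to the minimum: prints 1
```

The default minimum and maximum (1 and 10) simply mirror the values used in the demo below.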

HPA Example/Demo
In this post, we will specifically cover the HPA example. For this you need to have minikube up and running. You can set up minikube and kubectl by following this tutorial –

Enable metrics-server

minikube addons enable metrics-server
minikube addons list

Create a file named nginx.yaml to create a deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
        resources:
          # You must specify requests for CPU to autoscale
          # based on CPU utilization
          requests:
            cpu: 100m

Create another file, nginx-hpa.yaml, for the HPA

apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: nginx
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx
  minReplicas: 1
  maxReplicas: 10
  targetCPUUtilizationPercentage: 10
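To see what a 10% target means in practice, it may help to work through the arithmetic. CPU utilization for HPA is measured relative to the pod's CPU request, so with the 100m request from nginx.yaml a 10% target corresponds to roughly 10m of CPU per pod on average. The helper names below are hypothetical, and the 35m usage figure is an invented sample value, not a measurement.

```shell
#!/bin/sh
# target_millicores REQUEST_MILLICORES TARGET_PERCENT
# Hypothetical helper: average CPU usage per pod (in millicores) at which
# the utilization target is reached.
target_millicores() {
  echo $(( $1 * $2 / 100 ))
}

# cpu_utilization_percent USAGE_MILLICORES REQUEST_MILLICORES
# Hypothetical helper: usage expressed as a percentage of the CPU request.
cpu_utilization_percent() {
  echo $(( $1 * 100 / $2 ))
}

target_millicores 100 10         # 10% of a 100m request: prints 10
cpu_utilization_percent 35 100   # sample usage of 35m: prints 35
```

Such a low target is convenient for a demo because even light traffic pushes utilization past it.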

Now apply these YAML files to create the deployment and the HPA for nginx. Run the following commands:

kubectl apply -f nginx.yaml
kubectl apply -f nginx-hpa.yaml

The pods and the HPA should now be getting created. Run the following commands to check:

kubectl get pods
kubectl get hpa

Now expose the nginx deployment by creating a service

kubectl expose deployment nginx --type=LoadBalancer --name=nginx-service

This exposes the nginx deployment through a service, so that nginx can be reached at an IP address and port. To get the IP and port, run the command below:

minikube service nginx-service --url

You can open this URL in a browser and should see the nginx home page. This shows that we have successfully deployed an nginx pod.

Now that we have everything in place, we will generate some load on nginx and see the autoscaling in action. To generate load, run the following kubectl command (older kubectl releases also took --generator=run-pod/v1, a flag that has since been removed):

kubectl run -it --rm load-generator --image=busybox --restart=Never -- /bin/sh

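After running the kubectl command you will see a shell prompt, in which you run the wget command in a loop:

/ # while true; do wget -q -O- http://nginx-service; done

This runs wget in a loop, generating load on the pod. Here http://nginx-service is assumed to be the in-cluster DNS name of the service created above; the URL printed by minikube service nginx-service --url should work as well. Now check whether the HPA shows the load and whether the number of pods is increasing, using the following commands: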

kubectl get hpa
kubectl get pods

As you can see, the number of pods increases as the load increases. This is how the HPA actually works.
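Instead of rerunning the commands by hand, the scale-up can also be polled from a second terminal. The sketch below assumes the deployment is named nginx as in this post; the function name `wait_for_replicas` and its defaults (40 tries, 15 seconds apart) are made up for illustration. It reads the Deployment's .status.replicas field through kubectl's jsonpath output.

```shell
#!/bin/sh
# wait_for_replicas DEPLOYMENT WANTED [MAX_TRIES] [SLEEP_SECONDS]
# Hypothetical helper: polls a Deployment until it has at least WANTED
# replicas, or gives up after MAX_TRIES attempts.
wait_for_replicas() {
  deployment=$1
  wanted=$2
  tries=${3:-40}
  delay=${4:-15}
  i=0
  while [ "$i" -lt "$tries" ]; do
    # .status.replicas is the number of pods the Deployment currently has;
    # it can be empty when there are none, hence the :-0 fallback below.
    current=$(kubectl get deployment "$deployment" -o jsonpath='{.status.replicas}')
    if [ "${current:-0}" -ge "$wanted" ]; then
      echo "$deployment has $current replicas"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $deployment to reach $wanted replicas" >&2
  return 1
}
```

For example, `wait_for_replicas nginx 2` returns as soon as the HPA has scaled the deployment beyond its single initial pod.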