Scaling Applications

One of the key benefits of Kubernetes is the ability to scale your applications up or down based on demand. Scaling can be done manually or automatically.

Manual Scaling

You can change the number of replicas in a Deployment using the scale command:

kubectl scale deployment nginx-deployment --replicas=10

You can also edit the live Deployment object directly in your default editor:

kubectl edit deployment nginx-deployment

Change the spec.replicas field, save, and exit. Kubernetes then creates or terminates Pods until the number running matches the new value.

Horizontal Pod Autoscaler (HPA)

The HPA automatically adjusts the number of Pods based on observed CPU utilization, memory usage, or custom metrics. For CPU-based and memory-based scaling, the Metrics Server must be installed in the cluster.

First, ensure the containers in your Deployment define CPU requests, because the HPA calculates utilization as a percentage of the requested CPU:

resources:
  requests:
    cpu: 100m
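For context, here is a minimal sketch of where that resources block sits inside a complete Deployment manifest. The file name, the app: nginx label, the container name, and the nginx image are assumptions made for illustration; only the Deployment name and the CPU request come from the text above.

```shell
# Write a minimal Deployment manifest to a local file.
# The label, container name, and image are illustrative assumptions.
cat > nginx-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx
          resources:
            requests:
              cpu: 100m   # the HPA measures CPU utilization against this value
EOF
```

You could then apply the file with kubectl apply -f nginx-deployment.yaml.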

Then create an HPA:

kubectl autoscale deployment nginx-deployment --cpu-percent=50 --min=2 --max=10
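The same autoscaler can also be described declaratively. The following is a sketch of a roughly equivalent manifest using the autoscaling/v2 API; the file name is an assumption, and the HPA is named after the Deployment, which is what kubectl autoscale does by default.

```shell
# Write an HPA manifest that mirrors the kubectl autoscale command.
# The file name nginx-hpa.yaml is an illustrative assumption.
cat > nginx-hpa.yaml <<'EOF'
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: nginx-deployment
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 50   # percent of the requested CPU
EOF
```

Applying it with kubectl apply -f nginx-hpa.yaml keeps the autoscaler configuration in version control alongside the Deployment.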

The autoscale command targets an average CPU utilization of about 50% of each Pod's requested CPU, scaling between 2 and 10 replicas. View the HPA status:

kubectl get hpa
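To build intuition for the replica counts the HPA reports, the core calculation can be sketched as desired = ceil(current replicas x observed utilization / target utilization), clamped to the minimum and maximum. The function name desired_replicas is hypothetical, and the sketch deliberately leaves out the tolerance band and stabilization behavior that the real controller applies.

```shell
# desired_replicas CURRENT OBSERVED_PERCENT TARGET_PERCENT MIN MAX
# Hypothetical helper illustrating the basic HPA scaling formula.
desired_replicas() {
  current=$1
  observed=$2
  target=$3
  min=$4
  max=$5

  # Integer ceiling of (current * observed / target)
  desired=$(( (current * observed + target - 1) / target ))

  # Clamp the result to the configured replica bounds
  if [ "$desired" -lt "$min" ]; then desired=$min; fi
  if [ "$desired" -gt "$max" ]; then desired=$max; fi

  echo "$desired"
}

desired_replicas 4 90 50 2 10    # prints 8  (ceil of 7.2)
desired_replicas 8 100 50 2 10   # prints 10 (16 clamped to the maximum)
desired_replicas 4 10 50 2 10    # prints 2  (1 raised to the minimum)
```

With 4 replicas averaging 90% CPU against a 50% target, the sketch asks for 8 replicas, which is why a sustained load spike can roughly double the Pod count in one step.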

Vertical Scaling

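Vertical scaling means changing the CPU or memory allocated to each Pod rather than changing the number of Pods. Changing the requests or limits in a Deployment's Pod template usually triggers a rolling update that replaces the existing Pods. Tools like the Vertical Pod Autoscaler (VPA) can automate this, but manually editing resource requests and limits in the Deployment YAML is also common.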
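As an illustration of the VPA option, here is a sketch of a VerticalPodAutoscaler manifest targeting the same Deployment. It assumes the VPA add-on and its custom resource definitions are already installed in the cluster, since they are not part of a default Kubernetes installation. The object name nginx-vpa and the file name are assumptions, and updateMode "Off" is chosen so that the VPA only publishes recommendations without changing running Pods.

```shell
# Write a VPA manifest in recommendation-only mode.
# Assumes the VPA add-on is installed; names are illustrative.
cat > nginx-vpa.yaml <<'EOF'
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: nginx-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: nginx-deployment
  updatePolicy:
    updateMode: "Off"   # recommend only; do not evict or resize Pods
EOF
```

After applying the file with kubectl apply -f nginx-vpa.yaml, the recommended requests should appear in the output of kubectl describe vpa nginx-vpa, and you can copy them into the Deployment manually.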

Two Minute Drill

Manual scaling: kubectl scale deployment name --replicas=N.
Horizontal Pod Autoscaler (HPA) scales based on metrics like CPU.
CPU-based HPA requires the Metrics Server and CPU resource requests on the Pods.
Vertical scaling changes resource requests and limits; it usually requires Pods to be restarted.

Need more clarification?

Drop us an email at career@quipoinfotech.com
