Developer’s Insight: Navigating Vertical and Horizontal Pod Autoscaling


We’ve been encountering resource shortages in one of our environments, which has led to delays in processing. This issue arises from sudden spikes in load on our RabbitMQ queues, necessitating an increase in workers to handle the queue consumption. 

Our project is centered around a microservices-based architecture and leverages various services, including Celery, RabbitMQ, Redis, MinIO, MongoDB, and Postgres.

As the number of task elements queued in RabbitMQ grows, a single worker can’t handle the increased load, which also impacts the resource consumption of individual workers. To address these issues, we use Horizontal Pod Autoscaling (HPA) and Vertical Pod Autoscaling (VPA). 

Previously, this process required significant manual intervention. Developers had to continuously monitor resource usage and manually scale up workers whenever the load increased. This approach was time-consuming and inefficient, diverting valuable developer time and attention from other critical tasks. 

To address these challenges, we turned to Kubernetes’ autoscaling capabilities. By implementing both Vertical Pod Autoscaling (VPA) and Horizontal Pod Autoscaling (HPA), we aimed to create a more resilient and responsive system. VPA adjusts the resource limits of our pods based on actual usage, ensuring that each pod has the necessary CPU and memory to perform efficiently. Meanwhile, HPA dynamically scales the number of pod replicas based on metrics such as CPU utilization or custom metrics like RabbitMQ queue length. 

This dual approach allowed us to handle varying loads more effectively, reduce manual intervention, and improve overall system performance. 

Implementing Vertical Pod Autoscaling (VPA) 

Vertical Pod Autoscaling (VPA) involves monitoring the resource usage of a pod, specifically CPU and memory. To implement VPA, we utilize Kubernetes’ native toolset, which includes the installation of Custom Resource Definitions (CRDs) and several deployments. These components work together to monitor resource usage and automatically adjust the resource limits of your pods, ensuring they have the appropriate amount of CPU and memory based on actual demand. This helps maintain optimal performance and resource efficiency within your Kubernetes environment. 

Installation of CRDs: https://github.com/kubernetes/autoscaler/tree/master/vertical-pod-autoscaler 

The above link will help you install the prerequisites of Vertical Pod Autoscaling.

Once you have installed the CRDs, you will need three deployments to enable vertical pod monitoring and autoscaling.

The three deployments are:  

  • VPA Admission Controller 
  • VPA Recommender 
  • VPA Updater 

Recommender – monitors the current and past resource consumption and, based on it, provides recommended values for the containers’ CPU and memory requests.

Updater – checks which of the managed pods have correct resources set and, if not, evicts them so that they can be recreated by their controllers with the updated requests.

Admission Controller – sets the correct resource requests on new pods (either just created or recreated by their controller due to the Updater’s activity).

To install these deployments, you can install the project linked above, or you can use the YAML files below.

VPA Admission Controller YAML:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-admission-controller
  namespace: kube-system
  labels:
    app: vpa-admission-controller
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-admission-controller
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
    metadata:
      labels:
        app: vpa-admission-controller
    spec:
      containers:
        - name: admission-controller
          image: registry.k8s.io/autoscaling/vpa-admission-controller:1.1.2
          imagePullPolicy: Always
          ports:
            - containerPort: 8000
              protocol: TCP
            - containerPort: 8944
              name: prometheus
              protocol: TCP
          resources:
            limits:
              cpu: 200m
              memory: 500Mi
            requests:
              cpu: 50m
              memory: 200Mi
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          volumeMounts:
            - mountPath: /etc/tls-certs
              name: tls-certs
              readOnly: true
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: vpa-admission-controller
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      volumes:
        - name: tls-certs
          secret:
            secretName: vpa-tls-certs

VPA Recommender

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-recommender
  namespace: kube-system
  labels:
    app: vpa-recommender
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-recommender
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
    metadata:
      labels:
        app: vpa-recommender
    spec:
      containers:
        - name: recommender
          image: registry.k8s.io/autoscaling/vpa-recommender:1.1.2
          imagePullPolicy: Always
          ports:
            - containerPort: 8942
              name: prometheus
              protocol: TCP
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: vpa-recommender
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler

VPA Updater 

apiVersion: apps/v1
kind: Deployment
metadata:
  name: vpa-updater
  namespace: kube-system
  labels:
    app: vpa-updater
spec:
  replicas: 1
  selector:
    matchLabels:
      app: vpa-updater
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
  template:
    metadata:
      labels:
        app: vpa-updater
    spec:
      containers:
        - name: updater
          image: registry.k8s.io/autoscaling/vpa-updater:1.1.2
          imagePullPolicy: Always
          ports:
            - containerPort: 8943
              name: prometheus
              protocol: TCP
          resources:
            limits:
              cpu: 200m
              memory: 1000Mi
            requests:
              cpu: 50m
              memory: 500Mi
          env:
            - name: NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
          terminationMessagePath: /dev/termination-log
          terminationMessagePolicy: File
      securityContext:
        runAsNonRoot: true
        runAsUser: 65534
      serviceAccountName: vpa-updater
      terminationGracePeriodSeconds: 30
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
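
Note that these deployments reference the service accounts vpa-admission-controller, vpa-recommender, and vpa-updater. If you install the project from the link above, its installation manifests create the service accounts and the required cluster role bindings for you; if you apply the YAML files directly, make sure they exist first. A minimal sketch of the service accounts (the cluster role bindings should still come from the upstream RBAC manifest in the linked repository):

apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-admission-controller
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-recommender
  namespace: kube-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: vpa-updater
  namespace: kube-system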

Once you have added these deployments, you will also need a Secret named vpa-tls-certs, which is mounted as a volume by the admission controller deployment above.

vpa-tls-certs.yaml

apiVersion: v1
kind: Secret
metadata:
  name: vpa-tls-certs
  namespace: kube-system
data:
  caCert.pem: >-
    LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURNRENDQWhpZ0F3SUJBZ0lVWmQvQzNURmNrbXM5eXpRMEowK2E3VVRkK0ZFd0RRWUpLb1pJaHZjTkFRRUwKQlFBd0dURVhNQlVHQTFVRUF3d09kbkJoWDNkbFltaHZiMnRmWTJFd0lCY05NalF3TlRJME1USXdPVFEwV2hnUApNakk1T0RBek1Ea3hNakE1TkRSYU1Ca3hGekFWQmdOVkJBTU1Eblp3WVY5M1pXSm9iMjlyWDJOaE1JSUJJakFOCkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXBaQjRVaHdRTXBIY25NYURvbkdLVjlnOGJrSGUKSkNVSzc0T2drTVNiL01HNUZ1T1RLRDhMaituUnZ1S3F4Sm1yaXJkTzBoRGxyR1FzSXMrM2RNaUZYS2YrYldBbwplcXNIY1o4L2hSZ1Q3SU1yNzUvbG0weTV5RGIvcHhaMS92WkNhUldoUUNEVk04bG85d0dEeTdsbWt0b0R5Q0YwCk5CNVlINjdQR3g0Y2RSa3d1S2V5ZVYyclVETmNBaDhFTnBoNWsxeW4rckFreG5xa3N5YS9iSCtUWDR0ZmZwZEMKYlNqY0pGSXlPLzRmYjRtd256NWhvSlBEdTV3aFVtc3J6UTZjQ3NRT2dpZStzWlZHaW1GZnZBWG85L2xKSGpydwppWVlKV3BJWUl1OTZWSnl0Sk5vK2hzRFdGTThCZ05LRnVOUXdjSk4yT1VxVFhtSDM1OU11cXd1NElRSURBUUFCCm8yNHdiREFkQmdOVkhRNEVGZ1FVcVF2bHJqd08zUUtPeHhvY1pnamIreW1Ja2tvd0h3WURWUjBqQkJnd0ZvQVUKcVF2bHJqd08zUUtPeHhvY1pnamIreW1Ja2tvd0R3WURWUjBUQVFIL0JBVXdBd0VCL3pBWkJnTlZIUkVFRWpBUQpnZzUyY0dGZmQyVmlhRzl2YTE5allUQU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFtNm9mQ0ZPN3NiMjhLdnZ2CmZBUVVlNDJnR3NZWGpCSG1ydjlIY2dqUFRVZmZRd25NSHpJblNydWRmVUZBTXNQRkw1Mm1WemVISnQ5cEkzUnMKYVJwSkF4MmRvS3pJMS9RaUxsc1RQWEp2aFJSK1RFc2wva0dxU0taQ3V3OFNIRGdhR1R0cFA1aEZkOWdIUUVrUApUYlc2NWRjd251ZG1md3RLZW15dFZUbnkxRjB5M2lHc3hwTlA1TkM0TzNVVG1pc3p4bTVkbjlOay9ncnc1dHV5Cm1IZi8ya2FUcy9Na3BoNEpGS1preVpyUTM2T2YyeE9DdUhMV1o5dzFhL0c1WGVCUHM0K1piNEhMbFJreW4xWWIKZXE3YWRGbHlmVzFJcDNVZHNVckZUbnA5WG1LY3BrVElWSWtlMU1uRDArclZMZElHOTRTeW1Za1FnRkFycWdhZQo3MXNIdnc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
  caKey.pem: >-
    LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcFFJQkFBS0NBUUVBcFpCNFVod1FNcEhjbk1hRG9uR0tWOWc4YmtIZUpDVUs3NE9na01TYi9NRzVGdU9UCktEOExqK25SdnVLcXhKbXJpcmRPMGhEbHJHUXNJcyszZE1pRlhLZitiV0FvZXFzSGNaOC9oUmdUN0lNcjc1L2wKbTB5NXlEYi9weFoxL3ZaQ2FSV2hRQ0RWTThsbzl3R0R5N2xta3RvRHlDRjBOQjVZSDY3UEd4NGNkUmt3dUtleQplVjJyVUROY0FoOEVOcGg1azF5bityQWt4bnFrc3lhL2JIK1RYNHRmZnBkQ2JTamNKRkl5Ty80ZmI0bXduejVoCm9KUER1NXdoVW1zcnpRNmNDc1FPZ2llK3NaVkdpbUZmdkFYbzkvbEpIanJ3aVlZSldwSVlJdTk2Vkp5dEpObysKaHNEV0ZNOEJnTktGdU5Rd2NKTjJPVXFUWG1IMzU5TXVxd3U0SVFJREFRQUJBb0lCQUhRbnpuSjA0OXhHdjB0eQpic25aOUtBNndmTHMzVWZvZ1NxVzQyQ052NDV0NzBnaXM0eUo1cmU1NklDT2tCWk1aYkIxZUtRaVhMQTh1MFpWCkNyV3hOUGRUbVVudEs1a2NRcVd4ZnlRR1Izd29idnUzNTZPMENhZHhCcDJZUVlKMkRST2lyRFhNa3llNUI1NDcKL3RaQUpibWpvUUdZanVrOTNMK2xxZ0d4ZDIraThsV1I0SXliWHMrYzdjZmdXd0ZyRmJLVnJEVUxDNXlsck13awp0RHZkNjZXdUJ1aEY1Y2RCRE9LTjlaZ1JnUHFEVlRqYUZNUTkyWHdoNEE1Q1JGZzRCK21sUmFRQ2NaZHdEZWJXCktGNGJBUjh0RDNYdjB4UUxBb0pCVkROaWRoSEQ3cERmZVdOQ0pMZTZjc0ZEUjZyeHpFcWJtMWpKTEFYNFBLMHEKUFM0MFdpMENnWUVBMmZqZnpDN1hwTk9jVXJaWFN1VmVxd2dkdHludG83QmJZdFdieVpxdnRaNklEVEpCeWp6TQpjU1VyYStLUmExZjBMYXlwaFhwOVBvUnZLRXI2ZHJXditNMHJQc01NeHVaQWFYMDg0b3hPakovK0NDMmMwa1JEClEzNldhMkUvMEpFb1A0VllwalFaWUFMbjl2UThuUWJLOWJDcG1jUkFTL3pKNWJPMjJ0NjJ3VzhDZ1lFQXduTHgKMFkzWUFMRFVBYlBpcCtMSjd1VDZ1RmZnY3hjOVY4MnFsVC9LZzBOdHJHMXB1dk41RXc1ZWQrQ051RlEvcERWVwpWS0N5bnAyaHJ0OFBvcjRwTkFFUUxzT3FOTks0ellqYm1hK3dhcHpqRkpxV3IybGczbjdEVlFicGVlbTBqa3BLCm5LZjBiNFJ6NFd2MndVVFdyRVM0eXZIRUpmMlhEZDZlUVRBeE4yOENnWUVBbXZQR1BNT3IvdnVEckhUOWR6dTUKWWNKaVJYeGorREo5dExQL2pJRVBtZi81M2MrMVgweDdWWS9EMzJ6d1RhdjM1S1JTMnBXcUJWQm1LUEdzUGNtSgpNRWpDRGxyZ2NXRHJ0MUlWZjBPWTczVXBSSzBRUjVYSmIyakZDODdWYTdKVk4xclhHMGY4SmZuSzV3N1hMQlhSCklIbVhCNzJ3cTVRbi9zZ1VIR0dvNzdjQ2dZRUFnSFd2aStGSmNpdGY1RUFTM1JiV2tSeDFCcFFIbEFFbVpYdFoKMW4vdUtnbkJ5c2Y5c2FSbnVFOGwyY3hmMUFiVWhJYzRJWENJa0lGUzcxUXQ0RFlBd25weFZuT3RYbmhYM25FcgpvcnlPcitBMXBNYjhCYVo0ZUlVR1JvWHFlTUFNcUhRc0ZwSmV5YzJYUUxVeXJ6dnJGcVBQOFVNSGNwRzEyVlBZCitQZjlpOEVDZ1lFQXpIRTFya1dSSG9XYnVjbGRXZWx1OGo1Q2o2NXJNVzVqU3poQklNR1IrZGErQ2lLR21GUGEKM1FSZnhkcHAwWGk2UlM1TGMrRndjQTFXQ1J4TmdYT0UrcGFoL3FFSG42NHJickV1MnlZVTFFTWwyT2haS3Z0QwpFS1p0YjFLcHFUaXhaZjFwMjdCRzY2cnptd0RpSUpTeExQSEh3VE55eHI5VUlYbksyZ1M4eCt3PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo=
  serverCert.pem: >-
    LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURNRENDQWhpZ0F3SUJBZ0lVTWlKRDEwRVdZUGVUMk5wbFJLdlcvbE05R1Uwd0RRWUpLb1pJaHZjTkFRRUwKQlFBd0dURVhNQlVHQTFVRUF3d09kbkJoWDNkbFltaHZiMnRmWTJFd0lCY05NalF3TlRJME1USXdPVFEwV2hnUApNakk1T0RBek1Ea3hNakE1TkRSYU1DWXhKREFpQmdOVkJBTU1HM1p3WVMxM1pXSm9iMjlyTG10MVltVXRjM2x6CmRHVnRMbk4yWXpDQ0FTSXdEUVlKS29aSWh2Y05BUUVCQlFBRGdnRVBBRENDQVFvQ2dnRUJBSnFIbUNEVkplbjAKNXpSNlhKaGVJYWFBcnRGVUswa2hRYmNRTDYzelhkRzhndXpkRzF0azlRNW13UGp1OGtNKytwelZDeGhHc2hpSQpmUVJSVklGN1g2R1FydXNsS3A4a011OWhPaGFiTHNtSEpiNll2MmpydjdQSkljMFlrTmwzNHNyeWNIZFNPNjJICmh0eU9qVHprdWExT21YbGF2QzR0YzlkTU5uTTVVbE02eVhSc2FId08rOWJPZ0RuNmR0ZWJPNkhFaXoxdFVrZVMKL2VVNS9OZ3laeDhIUUUvNFpPd2tqUFhkYjFOL0lPaHBxUktqTHY0MVRFNTBMOUtzYStVRzJ4Mnk1aUZ1bXJoZQpjKzdNUmllbmxlcnJRQmcyaU5qOVNVQTdoNGhxOWRwWWIrVHREdENzKzdEczltSFZ5S3hWUktrREdkZDVoK2U5CjFzTCtzclJVNkRrQ0F3RUFBYU5oTUY4d0NRWURWUjBUQkFJd0FEQUxCZ05WSFE4RUJBTUNCZUF3SFFZRFZSMGwKQkJZd0ZBWUlLd1lCQlFVSEF3SUdDQ3NHQVFVRkJ3TUJNQ1lHQTFVZEVRUWZNQjJDRzNad1lTMTNaV0pvYjI5cgpMbXQxWW1VdGMzbHpkR1Z0TG5OMll6QU5CZ2txaGtpRzl3MEJBUXNGQUFPQ0FRRUFOUW9LSTNBRWUvdEdtUUhlCmxFZi92K0sxRnF1QUZrYXFDS0dYM08vcXJCVXBvdXFoY2RmMXNKSzRnRXFlejJWbEVhNmJucnAwYVBoYTNGMzgKS3VOWmNUNXJJNnl2bVZ4cWtCbTczTHUxUjNFQ2NCYk5xaDRjczd1SjZsYUV0OE0reGpuUEZ3WnV5a0NQM0t4SgpMQkJTSWRpMy9rYTMxTVdQV0JsYzdCSklvUFNtOGQ0RDh1NDVMYTJyaGtYMWVSNUluZ0lMMCtVdXFzY3pXUkdHClV4a2t2bEUwYUpBN0hzVEc1azVweHJWZ1VhdUM0VjA5enNPTkYzdHFmYWtYSmU0UUlLWmtTdEJ4bXVPRU9uWXMKUElzVHR1bEFKM3ExbHdITDh4V0RnakZZVmNHRFZBOFFHd2Y1dGJhek9RZXRyb0F2Qy83Q0hscFIxMjhxWEcrMAppbitHSXc9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg==
  serverKey.pem: >-
    LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb3dJQkFBS0NBUUVBbW9lWUlOVWw2ZlRuTkhwY21GNGhwb0N1MFZRclNTRkJ0eEF2cmZOZDBieUM3TjBiClcyVDFEbWJBK083eVF6NzZuTlVMR0VheUdJaDlCRkZVZ1h0Zm9aQ3U2eVVxbnlReTcyRTZGcHN1eVljbHZwaS8KYU91L3M4a2h6UmlRMlhmaXl2SndkMUk3clllRzNJNk5QT1M1clU2WmVWcThMaTF6MTB3MmN6bFNVenJKZEd4bwpmQTc3MXM2QU9mcDIxNXM3b2NTTFBXMVNSNUw5NVRuODJESm5Id2RBVC9oazdDU005ZDF2VTM4ZzZHbXBFcU11Ci9qVk1UblF2MHF4cjVRYmJIYkxtSVc2YXVGNXo3c3hHSjZlVjZ1dEFHRGFJMlAxSlFEdUhpR3IxMmxodjVPME8KMEt6N3NPejJZZFhJckZWRXFRTVoxM21INTczV3d2Nnl0RlRvT1FJREFRQUJBb0lCQUV3WXM4aVBUa0ZjMGJKZApBMVloSEs4U3RZUHR6L2NPUW44MG9GWW4veTUyaDM2L0QyYTlXNVFBODh4aVZyall2YThYbG9RWTVFRUNqWlhmCnV4NmNNNmFyU2dnUDRHSklBREV6anRodENPaEMva3BTakFmTitEUS91cjUxOTNhdisrWDI1MzFLNklwMnIrblMKNVNoMGRoOHJFcGJaSU13WW8vQUUzdGQxQmQ1bjA4STk5RzlCWktBNmdjL2ZJNEJPVk5lamVWcVEyUUNHOEovVQpRZnd5bG94Z3F0bE1vWklVNW0weS8vQWN6bTg5dGtQcjZnS3psMmh5VXN2eGNjdXBCSzMwWXhvRUUzZ0FudWxjCk44SkdSYitUZmQwYUxEOXhxdHc2aDRHcE9paDdvNkFoNURmcm5PUVlVSytwK1piK0NhMW9RNHJSYVRFekR5YUgKOExHVlVqVUNnWUVBeDdYUDVLaklCOURESkIvYWp6emVZejVBZGVhakdZSmlVOHNPSWxnSGFaK05hWE1qYXlQbQpzRCtLWXZzeEhGNW85VWlQalJKckN3ZDBPaTdvSXFuYVRwRHJmcE5VdWppWURkeXp6VkM2Zmw3NGxJZDBmdzloCkNta0JWVk81VjBUbjF2dnNDam1hM05FNmY0dUpPNm8vTlp5S2MxTVV0OCtZQ1d0NGtuLzBNZmNDZ1lFQXhoWEMKR2RMdDlCVUhDVTBwWDNsb1NSVmpVS0lDdGhjVGN0RFVkdXdyZ2dIOEhERUJpRm9wTjl5ZVNRYS9JTjF6SFRxUQpTQmt5cHNkelBQVUlFZUx2c0lQckFSMGlxWUR3dWU1d1VsVUdyblhjeUFtdGxmNlZSVUpCVUVLSnNJZHF4aXlCCkJPU2VXWXNyS2d0ZWRjSTk1ODNRZGpOaG5iWVdFOVR3OU9QVEswOENnWUJRUS81K1JHZkEzR0xSemd1bHJpMGoKYmcyeVZUUVFPSnNVV25RZjBZbUpKaHRMMm43TnZPd213aUw0alVTN3hpWWhEenpDMGpnN2dvOXdJeEloZkdyRgpVUEdWT2RtL3pxY3VTeG5vMXgxZFZRWkxpL1dDYThmd3l5dENCQmhsdnNmL1c5a09jd0NPNTNpL2NuR0JqWGRDCk1OeGtaV3ZhUkpFeW1BTXB2a3VER1FLQmdRQ2NmSWZ3MnIvOWViY3JVL2dCWXVwT0FrV0paOVA0Z2xacytDbEIKSWVabE9LZ3dwVTV1cDd1MFUrZ0FEUUpsTmsxQXBBbGp5L1JGNlg5U2dza3pTRExQSWdnL3d3S2xJaVlLM1NHRApDWVRUd216KzR4WnRUc2dpQk91UU9tQ0lReExKS0ZOc3lDZUkyZmJwcWoyZmppcFZ2RFNaakpIcmcvUUJDdEtvCnhHc0k1UUtCZ0J3UXB4T1RpNlc4OFZGZGYyRDBZTzJGYUNqV3E4Kzk1SzdUK3Q1YU9CTlhsMzJwd3JzNVI3aGMKUnFEVklZTG9MNWZHVEgyZDcvdmNXS1dwMjFCTDNvZjZldlNXNEFQNnJ0NlRUTFdBYkFjbmRTMlNlOWkrMTVlUgpTQlZibEU3VFA5ajg5UEdDc1F5STNZeWJrMWN6NERrTjJ0MHplU25WVG5aeHR3ZWdWNVVxCi0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg==

Now the installation part is complete. Next, you will need a VPA manifest for your deployment.

Example-vpa.yaml 

apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: pod-name-vpa
  namespace: default
spec:
  resourcePolicy:
    containerPolicies:
      - containerName: '*'
        controlledResources:
          - cpu
          - memory
        minAllowed:
          cpu: 50m
          memory: 100Mi
        maxAllowed:
          cpu: 2000m
          memory: 4096Mi
  targetRef:
    apiVersion: "apps/v1"
    kind: Deployment
    name: pod-name
  updatePolicy:
    updateMode: "Off"

In the above example, we have set the minimum and maximum allowed resource usage for the pod.

The VPA will now monitor the pod within these bounds. Note the key called updatePolicy, which supports several update modes.

Modes: 

  • “Auto”: VPA assigns resource requests on pod creation as well as updates them on existing pods using the preferred update mechanism. Currently, this is equivalent to “Recreate” (see below). Once restart free (“in-place”) update of pod requests is available, it may be used as the preferred update mechanism by the “Auto” mode. 
  • “Recreate”: VPA assigns resource requests on pod creation as well as updates them on existing pods by evicting them when the requested resources differ significantly from the new recommendation (respecting the Pod Disruption Budget, if defined). This mode should be used rarely, only if you need to ensure that the pods are restarted whenever the resource request changes. Otherwise, prefer the “Auto” mode which may take advantage of restart-free updates once they are available. 
  • “Initial”: VPA only assigns resource requests on pod creation and never changes them later. 
  • “Off”: VPA does not automatically change the resource requirements of the pods. The recommendations are calculated and can be inspected in the VPA object. 

The preferred way to start is the “Off” mode, which keeps track of resource usage and calculates recommendations without automatically changing the resources.

The recommendations are updated by the VPA and can be inspected directly on the VPA object. As you can see in the example VPA YAML, we set the minimum allowed resources, and the recommendations are calculated with that floor; note that memory values are reported in bytes.
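
For reference, this is roughly what the recommendation section of the VPA object looks like when you inspect it, for example with kubectl describe vpa pod-name-vpa or kubectl get vpa pod-name-vpa -o yaml. The container name and the exact numbers below are illustrative, not taken from our cluster:

status:
  conditions:
    - status: "True"
      type: RecommendationProvided
  recommendation:
    containerRecommendations:
      - containerName: worker        # illustrative container name
        lowerBound:                  # minimum the recommender considers safe
          cpu: 50m
          memory: "104857600"
        target:                      # recommended request
          cpu: 85m
          memory: "262144000"
        uncappedTarget:              # recommendation before minAllowed/maxAllowed are applied
          cpu: 25m
          memory: "262144000"
        upperBound:                  # maximum the recommender considers useful
          cpu: 2000m
          memory: "4294967296"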

Once you are satisfied with the recommendations, you can allocate those resources to your deployments, as sketched below.
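
For example, if the target recommendation for a worker settles around 85m CPU and 250Mi of memory, you could fold that back into the worker deployment’s container spec. The numbers here are illustrative; we typically keep limits a bit above the recommended requests to absorb spikes:

resources:
  requests:
    cpu: 85m          # taken from the VPA target recommendation
    memory: 250Mi
  limits:
    cpu: 500m         # headroom above the recommendation
    memory: 1Gi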

Implementing Horizontal Pod Autoscaling (HPA) 

Horizontal Pod Autoscaling (HPA) involves increasing the number of instances of your component. For example, you have a worker pod that takes one minute to complete a task. Under heavy load, where the worker has ten tasks to process simultaneously, you wouldn’t want to wait ten minutes for all tasks to complete sequentially. This is where HPA becomes crucial. By automatically increasing the number of worker pod instances, HPA ensures that multiple tasks can be processed concurrently, significantly reducing the overall processing time and improving performance under heavy loads. 

In our case, we have a microservice-architecture-based system that involves multiple tasks. Each task is a microservice that runs when the task is invoked and then goes idle. Task execution is coordinated by the RabbitMQ broker.

So, to autoscale the pods based on the length of the RabbitMQ queues, we use a tool called KEDA.

Installation: https://keda.sh/docs/2.14/deploy/#install 

In our case, we manage the project with a Helm chart, and we have integrated KEDA’s scaling resources into that chart.

Here is the autoscaling template YAML file (it renders one KEDA ScaledObject per component):

{{- if (($.Values.autoscale).enabled) }}
{{- range $groupName, $group := .Values.deploymentGroups }}
{{- range $componentName, $component := $group.components }}

{{- $config := deepCopy $group.config }}
{{- $config := deepCopy $component | mergeOverwrite $config }}

{{- if and $config.enabled ($config.autoscale).enabled }}
---
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: {{$groupName}}-{{$componentName | toString | replace "_" "-"}}-scaler
  namespace: {{$.Release.Namespace}}
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "autoscaling.keda.sh/paused": "{{ $.Values.autoscale.paused }}"
spec:
  scaleTargetRef:
    name: {{$groupName}}-{{$componentName | toString | replace "_" "-"}}
    envSourceContainerName: {{$groupName}}-{{$componentName | toString | replace "_" "-"}}
  pollingInterval: {{ $.Values.autoscale.pollingInterval }}
  cooldownPeriod: {{ $.Values.autoscale.cooldownPeriod }}
  minReplicaCount: {{ $config.autoscale.minReplicaCount | default $.Values.autoscale.defaultMinReplicaCount }}
  maxReplicaCount: {{ $config.autoscale.maxReplicaCount | default $.Values.autoscale.defaultMaxReplicaCount }}
  {{- if (($config.autoscale).scaleToZero | default $.Values.autoscale.defaultScaleToZero) }}
  idleReplicaCount: 0
  {{- end }}
  fallback:
    failureThreshold: 5
    replicas: 1
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      name: {{$groupName}}-{{$componentName | toString | replace "_" "-"}}-hpa
      behavior:
        scaleDown:
          stabilizationWindowSeconds: {{ $.Values.autoscale.stabilizationWindowSeconds }}
  triggers:
    - type: rabbitmq
      metadata:
        mode: QueueLength
        value: "{{$config.autoscale.queueLength | default $.Values.autoscale.queueLength}}"
        queueName: {{$config.autoscale.queueName | default $componentName}}
      authenticationRef:
        name: pointing-fish-rabbitmq-auth-trigger
{{- end -}}
{{- end -}}
{{- end -}}
{{- end -}}
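
The trigger’s authenticationRef points to a TriggerAuthentication named pointing-fish-rabbitmq-auth-trigger, which is created elsewhere in our chart. A minimal sketch of such a TriggerAuthentication, assuming the RabbitMQ connection string is stored in a Kubernetes secret (the secret name and key below are illustrative):

apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: pointing-fish-rabbitmq-auth-trigger
spec:
  secretTargetRef:
    - parameter: host                # the rabbitmq scaler reads its connection string from "host"
      name: rabbitmq-connection      # illustrative secret name
      key: host                      # e.g. amqp://user:password@rabbitmq.default.svc:5672/vhost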

The template above reads its keys from values.yaml, under a key called autoscale:

autoscale:
  enabled: true
  paused: "false"
  stabilizationWindowSeconds: 30
  cooldownPeriod: 120
  pollingInterval: 15
  queueLength: 20
  defaultMaxReplicaCount: 10
  defaultMinReplicaCount: 1
  defaultScaleToZero: false

Reference for understanding these keys: https://keda.sh/docs/2.14/reference/scaledobject-spec/ 

I will walk you through some important keys mentioned above. 

  • paused: “true” 

This flag pauses horizontal pod autoscaling for every deployment. Keep it at its default of “false” unless you need to temporarily stop scaling.

  • queueLength: 4 

You can set this limit per task. KEDA aims for roughly one replica per queueLength messages waiting in the queue. For instance, with queueLength set to 4, a single replica handles up to 4 queued messages; if the queue grows beyond 4, a second replica is added; beyond 8, a third; and so on as the queue length increases, up to the maximum replica count. This keeps a task that takes one minute from making you wait ten minutes when ten tasks arrive at once.

  • defaultMaxReplicaCount: 10 

This key sets the maximum number of replicas a deployment can be scaled up to.

  • defaultMinReplicaCount: 1 

This is the number of replicas a deployment is scaled back down to when the extra capacity is no longer needed.

  • defaultScaleToZero: false 

If this flag is set to true, the deployment is scaled down to the idleReplicaCount (0 in our template) when its queue is idle. In some scenarios, a pod may not be in use, and we prefer not to keep it running, since it would still consume resources we do not want to waste.

  • queueName: app 

This is the most crucial key. It tells KEDA which RabbitMQ queue to monitor; based on that queue’s length, the workload is scaled up and down as needed, as shown in the rendered example below.
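
Putting it all together, for a hypothetical deployment group named app with a component named worker and the values above, the template renders roughly the following ScaledObject:

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: app-worker-scaler
  namespace: default
  annotations:
    "helm.sh/hook": post-install,post-upgrade
    "autoscaling.keda.sh/paused": "false"
spec:
  scaleTargetRef:
    name: app-worker
    envSourceContainerName: app-worker
  pollingInterval: 15
  cooldownPeriod: 120
  minReplicaCount: 1
  maxReplicaCount: 10
  fallback:
    failureThreshold: 5
    replicas: 1
  advanced:
    restoreToOriginalReplicaCount: true
    horizontalPodAutoscalerConfig:
      name: app-worker-hpa
      behavior:
        scaleDown:
          stabilizationWindowSeconds: 30
  triggers:
    - type: rabbitmq
      metadata:
        mode: QueueLength
        value: "20"
        queueName: worker
      authenticationRef:
        name: pointing-fish-rabbitmq-auth-trigger

With queueLength set to 20, KEDA aims for roughly one replica per 20 queued messages in the worker queue, keeping the deployment between 1 and 10 replicas.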
