note to self about the k8s hpa

it scales based on cpu requests, not limits.

that means that if you have a deployment like this:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: some-deployment
spec:
.
.
  template:
    spec:
      containers:
      - name: some-deployment
        image: some-image:latest
        imagePullPolicy: Always
        resources:
          requests:
            memory: 100Mi
            cpu: 0.1
          limits:
            memory: 200Mi
            cpu: 1
.
.
---
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: some-deployment
spec:
  maxReplicas: 10
  minReplicas: 2
  scaleTargetRef:
    apiVersion: extensions/v1beta1
    kind: Deployment
    name: some-deployment
  targetCPUUtilizationPercentage: 80

the hpa will spin up a new pod when the deployment's pods are using, on average, 80% of the 0.1 vcpu request. Not 80% of the 1 vcpu limit. If you expect high utilisation, you'll find yourself maxing out your maxReplicas super quickly.
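
to make the arithmetic concrete, here's a small python sketch of the scaling rule the kubernetes docs give for the hpa, desiredReplicas = ceil(currentReplicas * currentUtilisation / targetUtilisation), plugged into the numbers above. the per-pod usage figure is made up for illustration; the real hpa gets its numbers from the metrics pipeline, not from anything in this snippet.

import math

# Figures taken from the manifests above.
cpu_request_millicores = 100     # requests.cpu: 0.1 -> 100m per pod
cpu_limit_millicores = 1000      # limits.cpu: 1 -> 1000m per pod
target_utilisation = 0.80        # targetCPUUtilizationPercentage: 80
current_replicas = 2             # minReplicas

# Hypothetical measurement: each pod is averaging 90m of CPU.
current_usage_millicores = 90

# The HPA compares usage to the *request*, so 90m counts as 90% utilisation...
utilisation_vs_request = current_usage_millicores / cpu_request_millicores
# ...even though it's only 9% of the limit.
utilisation_vs_limit = current_usage_millicores / cpu_limit_millicores

# Documented scaling rule:
# desiredReplicas = ceil(currentReplicas * currentUtilisation / targetUtilisation)
desired_replicas = math.ceil(
    current_replicas * utilisation_vs_request / target_utilisation
)

print(f"utilisation vs request: {utilisation_vs_request:.0%}")  # 90%
print(f"utilisation vs limit:   {utilisation_vs_limit:.0%}")    # 9%
print(f"desired replicas:       {desired_replicas}")            # 3, so it scales up

measuring against requests rather than limits does sort of make sense, since the request is what the scheduler actually reserves for the pod, but it means your requests need to reflect real usage if you want the 80% target to mean anything.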