Following up on our previous post, we’ll now make Kubernetes work for us and automatically scale our application.
One of the really cool Kubernetes features is the ability to automatically spawn new pods for your deployment based on the CPU usage of the existing pods. This is pretty simple to set up, but very useful.
There is a newer API version, still in alpha, that you can also use to autoscale based on memory usage and application-specific metrics, but for now we’ll start with basic CPU autoscaling.
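Just as a sketch of where that is heading: assuming your cluster serves the newer autoscaling/v2beta2 API and runs a metrics server, a memory-based autoscaler for a deployment like ours could look like the fragment below (the name rails-sample-memory and the 90% target are illustrative, not part of what we build in this post):

```yaml
# Hypothetical memory-based HPA; requires the autoscaling/v2beta2 API
# and a metrics server in the cluster. Adjust names to your deployment.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: rails-sample-memory
spec:
  minReplicas: 1
  maxReplicas: 5
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rails-sample
  metrics:
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization       # scale on usage relative to the request
        averageUtilization: 90  # average across all pods of the target
```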
To do this, we’ll need a small change to our previous deployment.yaml to specify how much CPU this application requires. This also means that to autoscale based on CPU, you’ll need to test and know your application’s CPU usage profile, but that is a subject for another post.
Take a look at the new “resources” section in the deployment.yaml below.
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    run: rails-sample
  name: rails-sample
spec:
  replicas: 1
  selector:
    matchLabels:
      run: rails-sample
  template:
    metadata:
      labels:
        run: rails-sample
    spec:
      containers:
      - args:
        - rails
        - s
        - -b
        - 0.0.0.0
        - -p
        - "3000"
        env:
        - name: DATABASE_HOST
          value: 192.168.0.15
        - name: DATABASE_USERNAME
          value: root
        - name: DATABASE_PASSWORD
          value: password
        image: urubatan/urubatan_rails_docker_sample:1.0.0
        name: rails-sample
        ports:
        - containerPort: 3000
          protocol: TCP
        resources:
          requests:
            cpu: 0.5
          limits:
            cpu: 0.9
This deployment specifies that each pod requires half of one CPU (500 millicores) to run, and should be limited to 90% of one CPU.
We could also add memory requests and limits to this section, and probably should, but I’m keeping the example simple.
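For reference, a resources section with memory added might look like the fragment below. The memory values here are placeholders I picked for illustration, not measured values for the sample app; you’d want to profile your own application first.

```yaml
resources:
  requests:
    cpu: 0.5
    memory: 256Mi   # placeholder; used for scheduling decisions
  limits:
    cpu: 0.9
    memory: 512Mi   # the pod is killed (OOM) if it exceeds this
```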
What is required for the autoscaler to work is the “requests” section: the scaling calculation is based on the requested CPU, not the limit.
As in the previous examples, let’s start with the command line, then we’ll save the result to a YAML file to be able to reproduce the environment later.
To autoscale our deployment when it reaches 90% of the requested CPU, we’ll use the following command:
$ kubectl autoscale deployment.apps/rails-sample --min=1 --max=5 --cpu-percent=90
This command will create a “Horizontal Pod Autoscaler” (HPA) that scales the deployment between 1 and 5 pods, adding pods when the running pods’ average CPU usage reaches 90% of the requested CPU.
We can see the resulting YAML with the following command:
$ kubectl get hpa rails-sample -o=yaml
We can save the output of this command to a file, maybe autoscale.yaml, to be able to recreate our HPA later. The contents of the file will be similar to the one below (I’ll clean it up a little):
apiVersion: autoscaling/v1
kind: HorizontalPodAutoscaler
metadata:
  name: rails-sample
spec:
  maxReplicas: 5
  minReplicas: 1
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: rails-sample
  targetCPUUtilizationPercentage: 90
I’ve cleaned up all the runtime state (status, timestamps, and so on) from the file, leaving only the configuration.
This is a pretty simple file that sets only the bare minimum for the autoscaler: what will be autoscaled, the minimum and maximum number of replicas, and the scaling criteria.
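With that file saved, the HPA can be recreated at any time. A minimal sketch, assuming your kubeconfig points at the target cluster:

```shell
# Recreate the HPA from the saved file, then check its status.
$ kubectl apply -f autoscale.yaml
$ kubectl get hpa rails-sample
```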
You can test this by generating some load on the application and using “kubectl get all” to watch the number of pods and the HPA statistics.
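Any HTTP benchmarking tool will do for the load. As a quick sketch, a plain curl loop is enough; the URL below is a placeholder you’d replace with wherever your service is actually exposed (NodePort, ingress, or kubectl port-forward):

```shell
# Hypothetical load generator; replace the URL with your service endpoint.
while true; do
  curl -s http://your-service-endpoint:3000/ > /dev/null
done
```

In a second terminal, `kubectl get hpa rails-sample --watch` will show the measured CPU percentage climbing and the replica count growing toward the maximum, then shrinking back after the load stops.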
As before, please leave any questions in the comments; criticism and suggestions are also welcome.
And if you want to see how to configure a more complex environment for your kubernetes applications and some good practices, come back here next Monday.