Load balancing in Kubernetes covers a variety of ways to redirect incoming traffic to specific servers in the cluster, distributing it evenly and making scaling tasks easier.
The main benefit of balancing is avoiding application downtime, both planned downtime caused by deploying a new software version and unplanned downtime caused by hardware issues.
In this article, we'll look at how load balancing helps stabilize the Kube cluster and increase application availability. Since load balancing is built on Kubernetes services, we will show how they work and then give specific examples of load balancing.
But first of all, let's talk about how Kube implements pod tracking, which makes the balancing itself much easier.
Pods in Kubernetes are temporary objects that get a new IP each time they are started. Once their task is completed, they are destroyed, and they are re-created on each new deployment. Without Kubernetes services, we would have to keep track of the IPs of all active pods ourselves, which would be a very complicated task, especially as our application scales. The Kube service solves this problem thanks to the selector. Let's take a look at this code (replace the values with your actual ones):
apiVersion: v1
kind: Service
metadata:
  name: hostmanapp
spec:
  selector:
    app: hostmanapp   # must match the labels on the target pods
  ports:
    - protocol: TCP
      name: hostmanapp
      port: 5428
      targetPort: 5428
The selector ensures that a service is correctly matched with its associated pods. When a pod with a matching label appears, Kube adds the pod's IP to the service's Endpoints object. Endpoints keep track of the IP addresses of all matching pods and update automatically. Each service that has a selector gets its own Endpoints object.
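To make the matching concrete, here is a minimal sketch of a Deployment whose pods carry the label that the hostmanapp service selects on. The image name, replica count, and container port are assumptions chosen for illustration, not values from a real setup:

```yaml
# Hypothetical Deployment; image and replica count are placeholders.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: hostmanapp
spec:
  replicas: 3
  selector:
    matchLabels:
      app: hostmanapp
  template:
    metadata:
      labels:
        app: hostmanapp   # the Service selector matches on this label
    spec:
      containers:
        - name: hostmanapp
          image: hostmanapp:latest   # placeholder image
          ports:
            - containerPort: 5428   # corresponds to targetPort in the Service
```

Once both objects exist, running kubectl get endpoints hostmanapp should list the IPs of the pods the service has picked up.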
This article won't go into too much detail about Endpoints. Just remember that the Endpoints update the list of IP addresses so that the Kube service can redirect its traffic.
Defining a service using a selector is the most common method, but we can do this without it. For example, when migrating an application to Kube, we can evaluate how it behaves behind a service before migrating the server itself. Let's try this with an existing application hosted on the old server:
apiVersion: v1
kind: Service
metadata:
  name: hostmanapp-without-ep
spec:
  ports:
    - protocol: TCP
      port: 5428
      targetPort: 5428
And then:
apiVersion: v1
kind: Endpoints
metadata:
  name: hostmanapp-without-ep   # must be the same as the Service name
subsets:
  - addresses:
      - ip: x.x.x.x   # specify the IP
    ports:
      - port: 5428
Traffic sent to the hostmanapp-without-ep service will now be forwarded to the hostmanapp server.
If no type is specified, Kube creates a ClusterIP service. In total, there are four types of services (ClusterIP, NodePort, LoadBalancer, and ExternalName), each designed for its own tasks, and together they help to provide quite flexible load balancing. Let's take a look at all of them and give code examples for customization.
ClusterIP is designed for intra-cluster communication between applications. It is configured like this (the application values are random; you should replace them with your own):
apiVersion: v1
kind: Service   # mandatory line to define any service
metadata:
  name: hostmanapp
spec:
  type: ClusterIP
  selector:
    app: hostmanapp
  ports:
    - protocol: TCP
      port: 5428
      targetPort: 5428
NodePort is an external service that exposes pods on a static port of every node. The port is defined separately in the nodePort field below (all values are also random; replace them with your own):
apiVersion: v1
kind: Service
metadata:
  name: hostmanapp
spec:
  type: NodePort
  selector:
    app: hostmanapp
  ports:
    - protocol: TCP
      port: 5428
      targetPort: 5428
      nodePort: 32157   # static port opened on every node
LoadBalancer relies on the cloud provider's infrastructure to provision an external load balancer that routes outside traffic, for example from visitors of a website, to the service. Here is the code for launching it:
apiVersion: v1
kind: Service
metadata:
  name: hostmanapp
spec:
  type: LoadBalancer
  selector:
    app: hostmanapp
  ports:
    - protocol: TCP
      port: 5428
      targetPort: 5428
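ExternalName maps a service to an external DNS name, so that applications inside the cluster can reach a resource outside it through a regular service name. The way to do it is simple: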
apiVersion: v1
kind: Service
metadata:
  name: hostmanapp
spec:
  type: ExternalName
  externalName: hostmanapp.mydomain.com
Note that any service will have a DNS name created using this pattern: service-name.namespace-name.svc.cluster.local. This record will point to the cluster IP. If the service has no cluster IP (a headless service), the record resolves to the IPs of the specific pods instead.
As we have seen from the descriptions of all four Kube service types, you can organize load balancing in different ways. Let's start by describing how it is done inside the cluster.
The ClusterIP service is intended for intra-cluster balancing (you can find the code for configuring this and other Kube services above). It is suitable, for example, for organizing the interaction of separate groups of pods located within one Kube cluster. You can provide access to the service in two ways: through DNS or with environment variables.
Above, we have already described the DNS method. Let's add that it is the most common and recommended way of interaction between microservices. But note that DNS works in Kube only with a DNS-server add-on: for example, CoreDNS.
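As an illustration of the DNS method, the sketch below starts a throwaway pod that calls the hostmanapp service by its cluster DNS name. It assumes the service lives in the default namespace and speaks HTTP on port 5428; the pod name and the busybox image are placeholders chosen for this example:

```yaml
# Hypothetical client pod; assumes the service answers HTTP on port 5428.
apiVersion: v1
kind: Pod
metadata:
  name: hostmanapp-client
spec:
  restartPolicy: Never
  containers:
    - name: client
      image: busybox:1.36
      # Resolve the service through cluster DNS and print the response.
      command: ["wget", "-qO-", "http://hostmanapp.default.svc.cluster.local:5428"]
```

From a pod in the same namespace, the short name hostmanapp would normally resolve as well; the full name is what you need when calling across namespaces.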
As for environment variables, Kube sets them in every pod that is started after the service has been created. Their names are built from the service name, converted to upper case with dashes replaced by underscores. You may need the PORT and SERVICE_HOST variables, which follow these patterns:
SERVICE_NAME_PORT
SERVICE_NAME_SERVICE_HOST
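To see these variables in practice, here is a minimal sketch of a pod that prints them for the hostmanapp service. Kube upper-cases the service name, so the variables would be HOSTMANAPP_SERVICE_HOST and HOSTMANAPP_SERVICE_PORT. The pod name and image are assumptions, and the pod has to be created after the service and in the same namespace:

```yaml
# Hypothetical pod that prints the injected service variables.
apiVersion: v1
kind: Pod
metadata:
  name: hostmanapp-env-check
spec:
  restartPolicy: Never
  containers:
    - name: env-check
      image: busybox:1.36
      # The shell expands the variables that Kube injected at pod start.
      command: ["sh", "-c", "echo $HOSTMANAPP_SERVICE_HOST:$HOSTMANAPP_SERVICE_PORT"]
```

The output can then be read with kubectl logs hostmanapp-env-check. Because the variables are fixed at pod start, pods created before the service will not have them, which is one reason DNS is usually preferred.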
External balancing can be performed using NodePort (hereafter NP) and LoadBalancer (hereafter LB). NP is suitable for balancing a limited number of services and has the advantage of providing connectivity without a dedicated external balancing tool.
The first limitation of NP is that it is best suited to a private network: to reach it from the Internet, the nodes themselves must have publicly reachable IPs. Another disadvantage is that it only works over static ports in a limited range (30000-32767 by default), and the service opens the same port on every host. This becomes problematic when the application scales to multiple microservices.
The LB provides a public IP or DNS name to which external users can connect. Traffic flows from the LB to the matched service on the assigned port, and the service redirects it to the worker pods. The LB itself, however, is not mapped directly to the pods.
Let's see how to create an external LB. In the example below, we will start a pod and connect to its GUI from an external network. Note that the LB does not filter incoming or outgoing traffic. It is just a proxy to connect to the external network, redirecting traffic to the appropriate modules/services. Create a cluster with the create cluster command and enter the following parameters (the name and region values are random; you must also enter your SSH public key in the ssh-public-key field).
--name myHostmanCluster
--region my-hostman-region
--with-oidc
--ssh-access
--ssh-public-key <xxxxxxxxxx>
--managed
Now edit the pod's YAML file:
apiVersion: v1
kind: Pod
metadata:
  name: hostmanpod
  labels:
    app: hostmanpod
spec:
  containers:
    - name: hostmanpod
      image: hostmanpod:latest
Then create a pod:
kubectl apply -f hostmanpod.yaml
Let's make sure it works:
kubectl get pods --selector='app=hostmanpod'
Now activate the LB (again, the values in the code samples are random):
apiVersion: v1
kind: Service
metadata:
  name: hostmanpod-ext-serv
spec:
  type: LoadBalancer
  selector:
    app: hostmanpod
  ports:
    - name: hostmanpod-admin
      protocol: TCP
      port: 14953
      targetPort: 14953
Next, start the service and confirm:
kubectl apply -f hostmanpod-svc.yaml
service/hostmanpod-ext-serv created
kubectl get svc
Copy the obtained DNS name into the browser and append the port specified in the code above. Let's assume that we got this DNS name:
http://b9f305e6d743a85cb32f48f6a210cb51.my-hostman-region.com
Then we should paste the following into the browser:
http://b9f305e6d743a85cb32f48f6a210cb51.my-hostman-region.com:14953
Now you can share this address with anyone who wants to connect to your administrator account. As you can see, creating and configuring an external load balancer for an application is quite easy.
LoadBalancer indeed has some limitations, but Ingress helps to bypass them.
We have seen that LoadBalancer provisions a separate load balancer instance for each service. This is fine as long as we have a few services, but as their number increases, it becomes difficult to manage them. Also, LB does not support URL routing, SSL termination, and more. This is where Ingress comes to the rescue, working on top of NP and LB services. Ingress processes incoming traffic to determine which services and pods to forward it to. The main function of Ingress is load balancing, but it can also perform URL routing, SSL termination, and several other functions. There are quite a few Ingress configurations, and they are easy to find online. Here is one for illustrative purposes:
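A minimal sketch of such a configuration might look like the following. It assumes an NGINX Ingress Controller is installed under the class name nginx; the host app.example.com, the TLS secret hostmanapp-tls, and the /admin path are hypothetical values chosen for illustration, while the service names and ports reuse the ones defined earlier:

```yaml
# Illustrative Ingress; host, secret, and paths are placeholders.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hostmanapp-ingress
spec:
  ingressClassName: nginx   # assumes an NGINX Ingress Controller
  tls:
    - hosts:
        - app.example.com
      secretName: hostmanapp-tls   # hypothetical Secret holding the certificate
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /          # everything else goes to the main application
            pathType: Prefix
            backend:
              service:
                name: hostmanapp
                port:
                  number: 5428
          - path: /admin     # admin GUI is routed to a different service
            pathType: Prefix
            backend:
              service:
                name: hostmanpod-ext-serv
                port:
                  number: 14953
```

With a single entry point like this, the backing services no longer need a load balancer each; they can stay as plain ClusterIP services behind the controller.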
The example above defines the rules by which traffic from end users will flow. We should also add that Ingress is not a Kube service type like LB or NP, but a separate object holding a set of routing rules that direct traffic to services. In addition, a cluster using Ingress needs an Ingress Controller. There are several of these controllers, and you can check out some popular solutions: AWS ALB, NGINX, Istio, Traefik.
Different controllers have different features and capabilities, so you will have to evaluate them based on your requirements. But whichever controller you use, Ingress will greatly simplify the configuration and management of routing rules and help you serve SSL/TLS-encrypted traffic. And, of course, like traditional tools, Ingress controllers support a variety of balancing algorithms.
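How an algorithm is selected depends on the controller. As a sketch, assuming the community NGINX Ingress Controller (ingress-nginx), an annotation can switch a backend from the default round-robin to consistent hashing on the request URI. The Ingress name and host below are hypothetical, and other controllers use their own settings for this:

```yaml
# Illustrative only; the annotation is specific to ingress-nginx.
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hostmanapp-hashed
  annotations:
    # Requests with the same URI are sent to the same pod.
    nginx.ingress.kubernetes.io/upstream-hash-by: "$request_uri"
spec:
  ingressClassName: nginx
  rules:
    - host: app.example.com   # placeholder host
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: hostmanapp
                port:
                  number: 5428
```

Hashing of this kind can help when pods keep a local cache per URI, whereas round-robin spreads requests more evenly across pods.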
We learned about the differences between Kubernetes services, how to access them, how to organize intra-cluster and external balancing, and got acquainted with additional tools. To effectively use Kube services, you need to understand which of them is optimal for your tasks and configure it accordingly. It will save a lot of debugging time and ensure trouble-free operation of your applications and services.