Kubernetes Cluster: Installation, Configuration, and Management
Hostman Team
Technical writer
Kubernetes
22.08.2024
Reading time: 7 min

Kubernetes, or K8s, is an open-source container orchestration platform originally developed by Google and now maintained by the Cloud Native Computing Foundation. The core idea is that you install Kubernetes on a server, or more commonly a cluster of servers, and deploy workloads onto it. Kubernetes handles container creation, scaling, namespaces, access rights, and more. You interact with the cluster primarily through YAML configuration files.

This tutorial will guide you through creating and deploying a Kubernetes cluster locally.

Creating Virtual Machines

We will set up the Kubernetes cluster on two virtual machines: one acting as the master node and the other as a worker node. While deploying a cluster with only two nodes is not practical for real-world use, it is sufficient for educational purposes. If you wish to create a Kubernetes cluster with more nodes, simply repeat the process for each additional node.

  1. We will use Oracle's VirtualBox to create the virtual machines; you can download it from the official VirtualBox website. After installation, proceed to create the virtual machines.
  2. For the operating system, we will use Ubuntu Server, which can be downloaded from the official Ubuntu website. After downloading, open VirtualBox.
  3. Click "Create" in VirtualBox to create a new virtual machine. The default settings are sufficient, but allocate 3 GB of RAM and 2 CPUs for the master node (which manages the Kubernetes cluster) and 2 GB of RAM for the worker node. Kubernetes requires a minimum of 2 CPUs for the master node. Create two virtual machines this way.
  4. After creating the virtual machines, create a boot image with the Ubuntu Server distribution. Go to "Storage" and click "Choose/Create a Disk Image."
  5. Click "Add" and select the Ubuntu Server distribution. Then, start both machines and install the operating system by selecting "Try or Install Ubuntu." During installation, create users for each system and choose the default settings.
  6. After installation, shut down both virtual machines and go to their settings. In the "Network" section, change the connection type to "Bridged Adapter" for each system so that the virtual machines can communicate with each other over the network.

System Preparation

Network Configuration

Set the node names for the cluster. On the master node, execute the following command:

sudo hostnamectl set-hostname master.local

On the worker node, execute:

sudo hostnamectl set-hostname worker.local

If there are multiple worker nodes, assign each a unique name: worker1.local, worker2.local, and so on.
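You can confirm the change on each node:

hostnamectl

The "Static hostname" field should show master.local or worker.local, respectively.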

To ensure that nodes are accessible by name, modify the hosts file on each node. Add the following lines:

192.168.43.80     master.local master
192.168.43.77     worker.local worker

Here, 192.168.43.80 and 192.168.43.77 are the IP addresses of each node. To find the IP address, use the ip addr command:

ip addr

Locate the IP address next to inet. Open the hosts file and make the necessary edits:

sudo nano /etc/hosts
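Alternatively, you can append the entries non-interactively (adjust the IP addresses to match your nodes):

echo "192.168.43.80 master.local master" | sudo tee -a /etc/hosts
echo "192.168.43.77 worker.local worker" | sudo tee -a /etc/hosts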

To verify that the VMs can communicate with each other, ping the nodes:

ping 192.168.43.80

If successful, you will receive a response similar to this:

PING 192.168.43.80 (192.168.43.80) 56(84) bytes of data.
64 bytes from 192.168.43.80: icmp_seq=1 ttl=64 time=0.054 ms

Once the cluster is running, you can enforce pod-to-pod isolation with Kubernetes network policies, defining ingress and egress rules and using podSelector and namespaceSelector fields to restrict traffic between pods.
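As a minimal illustration, the following policy denies all ingress traffic to pods in the default namespace. Note that Flannel, the CNI plugin installed later in this tutorial, does not enforce NetworkPolicy; a policy-aware CNI such as Calico or Cilium is required for such rules to take effect:

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: default
spec:
  podSelector: {}
  policyTypes:
  - Ingress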

Updating Packages and Installing Additional Utilities

Next, install the necessary utilities and packages on each node. These steps apply to every node unless specified otherwise. Start by updating the package list and upgrading the installed packages:

sudo apt-get update && sudo apt-get upgrade -y

Then install the following packages:

sudo apt-get install curl apt-transport-https git iptables-persistent -y

Swap File

Kubernetes will not start with an active swap file, so it needs to be disabled:

sudo swapoff -a

To prevent it from reactivating after a reboot, modify the fstab file:

sudo nano /etc/fstab

Comment out the line with #:

# /swap.img      none    swap    sw      0       0
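Verify that swap is disabled:

free -h

The Swap line should show 0B for total, used, and free.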

Kernel Configuration

Load additional kernel modules:

sudo nano /etc/modules-load.d/k8s.conf

Add the following two lines to k8s.conf:

br_netfilter
overlay

Now, load the modules into the kernel:

sudo modprobe br_netfilter
sudo modprobe overlay

Verify the modules are loaded successfully:

sudo lsmod | egrep "br_netfilter|overlay"

You should see output similar to this:

overlay               147456  0
br_netfilter           28672  0
bridge                299008  1 br_netfilter

Create a configuration file so that traffic passing through the bridge is processed by netfilter:

sudo nano /etc/sysctl.d/k8s.conf

Add the following lines (kubeadm also requires IPv4 forwarding to be enabled):

net.bridge.bridge-nf-call-ip6tables = 1
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1

Apply the settings:

sudo sysctl --system
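Confirm that the new values are active:

sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward

Both parameters should be reported as 1.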

Docker Installation

Run the following command to install Docker (on Ubuntu, the engine is provided by the docker.io package):

sudo apt-get install docker.io -y

For more details on installing Docker on Ubuntu, refer to the official Docker documentation.

After installation, enable Docker to start on boot and restart the service:

sudo systemctl enable docker
sudo systemctl restart docker
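Verify that Docker is working:

sudo docker run --rm hello-world

Note that since version 1.24, Kubernetes no longer talks to Docker directly (the dockershim was removed); kubeadm instead uses a CRI runtime such as containerd, which is installed as a dependency of docker.io. If kubeadm init later complains about the container runtime, check that the CRI plugin is not listed under disabled_plugins in /etc/containerd/config.toml, then restart containerd.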

Kubernetes Installation

The legacy apt.kubernetes.io repository has been shut down, so use the community-hosted pkgs.k8s.io repository instead (replace v1.30 with the minor version you want to install). Add the GPG key:

sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg

Next, create a repository configuration file:

sudo nano /etc/apt/sources.list.d/kubernetes.list

Add the following entry:

deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /

Update the apt-get package list:

sudo apt-get update

Install the following packages:

sudo apt-get install kubelet kubeadm kubectl -y
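To prevent these packages from being upgraded unintentionally (cluster upgrades should be performed deliberately with kubeadm), pin their versions:

sudo apt-mark hold kubelet kubeadm kubectl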

Installation is now complete. Verify the Kubernetes client version:

kubectl version --client

The output should be similar to this (the exact version depends on the release you installed):

Client Version: v1.30.2

Cluster Configuration

Master Node

Run the following command for the initial setup and preparation of the master node:

sudo kubeadm init --pod-network-cidr=10.244.0.0/16

The --pod-network-cidr flag specifies the internal pod network subnet; 10.244.0.0/16 is the default expected by the Flannel CNI plugin installed below.

The process will take a few minutes. Upon completion, you will see the following message:

Then you can join any number of worker nodes by running the following on each as root:
kubeadm join 192.168.43.80:6443 --token f7sihu.wmgzwxkvbr8500al \
--discovery-token-ca-cert-hash sha256:6746f66b2197ef496192c9e240b31275747734cf74057e04409c33b1ad280321

Save this command to connect the worker nodes to the master node.
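If you lose this command, you can generate a new token and print a fresh join command on the master node at any time:

kubeadm token create --print-join-command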

Point kubectl at the cluster by creating the KUBECONFIG environment variable (run this as root, since /etc/kubernetes/admin.conf is readable only by root):

export KUBECONFIG=/etc/kubernetes/admin.conf
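Alternatively, to run kubectl as a regular user, copy the admin config into your home directory, as kubeadm itself suggests after init:

mkdir -p $HOME/.kube
sudo cp -i /etc/kubernetes/admin.conf $HOME/.kube/config
sudo chown $(id -u):$(id -g) $HOME/.kube/config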

Install a Container Network Interface (CNI) plugin; this tutorial uses Flannel. The project has moved from the coreos organization to flannel-io, so apply the current manifest:

kubectl apply -f https://github.com/flannel-io/flannel/releases/latest/download/kube-flannel.yml
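Check that the Flannel and CoreDNS pods reach the Running state before joining workers:

kubectl get pods --all-namespaces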

Worker Node

On the worker node, run the kubeadm join command obtained during the master node setup. After this, on the master node, enter:

kubectl get nodes

The output should look similar to this:

NAME           STATUS   ROLES           AGE   VERSION
master.local   Ready    control-plane   10m   v1.30.2
worker.local   Ready    <none>          79s   v1.30.2

The cluster is now deployed and ready for operation.
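As a quick smoke test, you can deploy a sample workload (nginx here is just an example image) and watch it get scheduled onto the worker node:

kubectl create deployment nginx --image=nginx
kubectl get pods -o wide

Delete it once you have confirmed the pod reaches Running status:

kubectl delete deployment nginx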

Conclusion

Setting up a Kubernetes cluster involves several steps, from creating and configuring virtual machines to installing and configuring the necessary software components. This tutorial provided a step-by-step guide to deploying a basic Kubernetes cluster on a local environment. While this setup is suitable for educational purposes, real-world deployments typically involve more nodes and more complex configurations. Kubernetes provides powerful tools for managing containerized applications, making it a valuable skill for modern IT professionals. By following this guide, you've taken the first steps in mastering Kubernetes and its ecosystem.


