Integrating S3 into Kubernetes allows applications to interact with object storage as if it were a traditional file system. In Hostman, this is achieved with the CSI S3 driver (CSI stands for Container Storage Interface), which connects S3 buckets to Kubernetes pods. Below is a step-by-step guide to setting up and using CSI S3 in a Kubernetes cluster.
When creating a new Kubernetes cluster, you can enable CSI S3 support during the Add-ons selection stage:
If the cluster already exists, follow these steps to install CSI S3:
Once installed, the bucket will be available for use in the cluster without requiring additional configuration.
To ensure CSI S3 is successfully installed and connected, run the following commands:
To check the StorageClass:
kubectl get storageclass csi-s3 -o yaml
To check the Secret containing the connection details:
kubectl get secret csi-s3-secret -n csi-s3 -o yaml
The connection details will appear in the data section in Base64 format.
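The values can be decoded with base64 -d. For example (the key name accessKeyID is an assumption; the actual keys depend on the installation):

# Decode one field of the Secret; the key name accessKeyID is an assumed example
kubectl get secret csi-s3-secret -n csi-s3 -o jsonpath='{.data.accessKeyID}' | base64 -d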
Since S3 is an object storage system, Kubernetes cannot interact with it directly: Kubernetes works with file systems and block devices. A PersistentVolumeClaim (PVC) bridges the gap via the CSI driver, and a FUSE layer such as geesefs mounts the S3 bucket so that files are accessible as in a typical file system.
Note that CSI S3 is not a fully POSIX-compliant file system. It is best suited for storing static files such as images, CSS, JS, and configuration files, or other data that does not require intensive disk operations. Issues may arise when hosting databases or other applications that modify file permissions or ownership, and partial file overwrites and hard links are not supported.
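For example, commands like these, run inside a pod against the mounted path, will typically fail or have no lasting effect (the paths are illustrative):

# Ownership changes are typically rejected or silently ignored on the FUSE mount
chown www-data /usr/share/nginx/html/index.html
# Creating a hard link is not supported by the S3-backed file system
ln /usr/share/nginx/html/a.css /usr/share/nginx/html/b.css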
A PVC acts as a "request" for storage. Kubernetes responds by creating a PersistentVolume (PV), which binds to the PVC and provides access to the storage.
Below is an example PVC manifest for connecting to S3:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-s3-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-s3
Key parameters in the PVC manifest:
- apiVersion and kind: define the resource as a PersistentVolumeClaim (version v1).
- metadata: specifies the PVC name (csi-s3-pvc) and the namespace (default).
- accessModes: specifies the storage access mode; ReadWriteMany allows multiple pods to read and write the same storage.
- resources.requests.storage: requests 5 GiB of storage space for the application.
- storageClassName: indicates the StorageClass name (csi-s3).
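Once the manifest is saved, apply it and check that the claim is bound (the filename csi-s3-pvc.yaml is only an example):

# Apply the PVC manifest (filename is illustrative)
kubectl apply -f csi-s3-pvc.yaml
# The PVC should report STATUS "Bound", backed by a dynamically provisioned PV
kubectl get pvc csi-s3-pvc -n default
kubectl get pv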
To enable an application to use a PersistentVolumeClaim (PVC), you need to reference it in the Deployment manifest. Below is an example showing how an Nginx container uses a connected PVC:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: s3-app-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: s3-app
  template:
    metadata:
      labels:
        app: s3-app
    spec:
      containers:
        - name: s3-app-container
          image: nginx
          volumeMounts:
            - name: s3-storage
              mountPath: /usr/share/nginx/html
      volumes:
        - name: s3-storage
          persistentVolumeClaim:
            claimName: csi-s3-pvc
Key elements of the Deployment manifest:
- apiVersion and kind: specify that the resource is a Deployment of version apps/v1.
- metadata: defines the Deployment name (s3-app-deployment) and the namespace (default).
- replicas: specifies the number of replicas (1 in this example), enabling application scaling.
- template: contains the pod specification for the Deployment.
- containers: describes the containers included in the pod. In this example, the pod runs a single container based on the Nginx image.
- volumeMounts: defines the volume mount point inside the container. Here, the PVC is mounted at /usr/share/nginx/html, allowing the container to access data stored in S3 through the volume.
- volumes: specifies the volumes used by the pod. The volume s3-storage is linked to the PVC named csi-s3-pvc, ensuring that data from S3 is accessible through this volume.

Using a PVC gives the application access to data in S3 and also simplifies access control and scaling by ensuring that every replica sees the same shared data.
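To finish, apply the Deployment and verify that the S3-backed mount is writable (a sketch; the filename s3-app-deployment.yaml is assumed):

# Apply the Deployment manifest (filename is illustrative)
kubectl apply -f s3-app-deployment.yaml
kubectl get pods -l app=s3-app
# Write a test file through the pod; it ends up in the S3 bucket via the FUSE mount
kubectl exec deploy/s3-app-deployment -- sh -c 'echo "hello from S3" > /usr/share/nginx/html/index.html'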