CSI S3 Integration

Updated on 26 November 2024

Integrating S3 into Kubernetes allows applications to interact with object storage as if it were a traditional file system. In Hostman, this is achieved with CSI S3, a Container Storage Interface (CSI) driver that connects S3 buckets to Kubernetes pods. Below is a step-by-step guide to setting up and using CSI S3 in a Kubernetes cluster.

Installing CSI S3 on a New Cluster

When creating a new Kubernetes cluster, you can enable CSI S3 support during the Add-ons selection stage:

  1. Navigate to the Add-ons section while setting up the cluster.
  2. Enable the CSI S3 option.
  3. Choose an existing S3 bucket, create a new one, or connect an external S3 bucket.

Installing CSI S3 on an Existing Cluster

If the cluster already exists, follow these steps to install CSI S3:

  1. Go to the Add-ons tab on the cluster management page.
  2. Click the three dots next to the CSI S3 add-on and select Install.
  3. Select an existing S3 bucket, create a new one, or connect an external S3 bucket.
  4. Click Install and wait for the process to complete.

Once installed, the bucket will be available for use in the cluster without requiring additional configuration.

Verifying Installation

To ensure CSI S3 is successfully installed and connected, run the following commands:

  • To check the StorageClass:

kubectl get storageclass csi-s3 -o yaml

  • To view S3 connection details:

kubectl get secret csi-s3-secret -n csi-s3 -o yaml

The connection details will appear in the data section in Base64 format.
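
Because the values are Base64-encoded, they have to be decoded before they are readable. The following is a minimal sketch rather than an official procedure: the helper name decode_secret_value and the sample value are illustrative, and <key> in the commented command stands for whichever key appears in your Secret's data section (the key names are not listed here, so check the YAML output first). It assumes a base64 utility that accepts -d, as GNU coreutils does.

```shell
# Decode one Base64 value copied from the Secret's data section.
# decode_secret_value is a hypothetical helper name used for illustration.
decode_secret_value() {
  printf '%s' "$1" | base64 -d
}

# Sample value for illustration only; it is not a real credential.
decode_secret_value 'ZXhhbXBsZS1rZXk='
echo

# Against a live cluster, a single field could be read and decoded in one step.
# Replace <key> with a key name shown in the Secret's data section:
#   kubectl get secret csi-s3-secret -n csi-s3 -o jsonpath='{.data.<key>}' | base64 -d
```

Decoded values may include access credentials, so avoid pasting them into logs or shared terminals.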

Using CSI S3 in Kubernetes

S3 is object storage, which pods cannot mount directly: Kubernetes volumes are backed by file systems or block devices. The CSI driver bridges this gap. An application requests storage through a PersistentVolumeClaim (PVC), and a FUSE layer, such as geesefs, mounts the S3 bucket so that its objects are accessible as files in a regular file system.

Note that a bucket mounted through CSI S3 is not a fully POSIX-compliant file system. It is best suited for storing static files like images, CSS, JS, configuration files, or other data that does not require intensive disk operations. Issues may arise when hosting databases or other applications that modify file permissions or ownership.

Partial file overwrites and hard links are not supported.
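
One way to see the hard-link limitation in practice is to probe a directory before deploying an application that relies on hard links. The sketch below is an illustration, not part of the CSI S3 tooling: supports_hardlinks is a hypothetical helper that creates a temporary file in the given directory and attempts to hard-link it. The example run uses a local temporary directory, where the probe is expected to succeed on most file systems.

```shell
# Probe whether a directory's file system accepts hard links.
# supports_hardlinks is a hypothetical helper; pass it a writable directory.
# Returns 0 if hard links work, 1 if they are rejected, 2 if the probe
# file could not be created.
supports_hardlinks() {
  dir="$1"
  src="$dir/.hardlink-probe-src.$$"
  dst="$dir/.hardlink-probe-dst.$$"
  touch "$src" 2>/dev/null || return 2
  if ln "$src" "$dst" 2>/dev/null; then
    rm -f "$src" "$dst"
    return 0
  fi
  rm -f "$src"
  return 1
}

# Example run against a local temporary directory.
probe_dir="$(mktemp -d)"
if supports_hardlinks "$probe_dir"; then
  echo "hard links supported in $probe_dir"
else
  echo "hard links not supported in $probe_dir"
fi
rmdir "$probe_dir"
```

Run inside a pod against the path where the bucket is mounted, the same probe would be expected to report that hard links are not supported, in line with the limitation described above.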

PersistentVolumeClaim (PVC)

A PVC acts as a request for storage. Kubernetes responds to the request by provisioning a PersistentVolume (PV), which is bound to the PVC and provides access to the storage.

Below is an example PVC manifest for connecting to S3:

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-s3-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-s3

Key parameters in the PVC manifest:

  • apiVersion and kind: Define the resource as a PersistentVolumeClaim (version v1).
  • metadata: Specifies the PVC name (csi-s3-pvc) and the namespace (default).
  • accessModes: Specifies the storage access mode. ReadWriteMany allows multiple pods to read from and write to the same storage.
  • resources.requests.storage: Requests 5 GiB of storage space for the application.
  • storageClassName: Indicates the StorageClass name (csi-s3).
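
To create the claim, the manifest can be saved to a file and applied with kubectl. The sketch below assumes a file name (csi-s3-pvc.yaml in the temporary directory) that is chosen here purely for illustration, and it only calls kubectl when the command is available on the machine.

```shell
# Write the PVC manifest shown above to a file.
# The file name and location are assumptions made for this example.
manifest="${TMPDIR:-/tmp}/csi-s3-pvc.yaml"
cat > "$manifest" <<'EOF'
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: csi-s3-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 5Gi
  storageClassName: csi-s3
EOF

# Apply the manifest and inspect the claim, only if kubectl is installed.
if command -v kubectl >/dev/null 2>&1; then
  kubectl apply -f "$manifest"
  # The STATUS column reads Bound once a PersistentVolume has been bound to the claim.
  kubectl get pvc csi-s3-pvc -n default
else
  echo "kubectl not found; manifest written to $manifest"
fi
```

If the claim stays in Pending, kubectl describe pvc csi-s3-pvc -n default lists the related events, which is usually the quickest place to look for the cause.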

Connecting PVC to a Deployment

To enable an application to use a PersistentVolumeClaim (PVC), you need to configure it in the Deployment manifest. Below is an example showing how an Nginx container uses a connected PVC:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: s3-app-deployment
  namespace: default
spec:
  replicas: 1
  selector:
    matchLabels:
      app: s3-app
  template:
    metadata:
      labels:
        app: s3-app
    spec:
      containers:
        - name: s3-app-container
          image: nginx
          volumeMounts:
            - name: s3-storage
              mountPath: /usr/share/nginx/html
      volumes:
        - name: s3-storage
          persistentVolumeClaim:
            claimName: csi-s3-pvc

Key elements of the Deployment manifest:

  • apiVersion and kind: Specify that the resource is a Deployment in the apps/v1 API version.
  • metadata: Defines the Deployment name (s3-app-deployment) and the namespace (default).
  • replicas: Specifies the number of pod replicas (1 in this example). Increase this value to scale the application.
  • template: Contains the pod specification for the Deployment.
  • containers: Describes the containers included in the pod. In this example, the pod includes a container running the Nginx image.
  • volumeMounts: Defines the volume mount point inside the container. Here, the PVC is mounted to /usr/share/nginx/html, allowing the container to access data stored in S3 through the volume.
  • volumes: Specifies the volumes used by the pod. The volume s3-storage is linked to the PVC named csi-s3-pvc, ensuring that data from S3 is accessible through this volume.

Using a PVC gives the application access to data in S3 and simplifies access control and scaling, because every replica has access to the same shared data.
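
After the Deployment has been applied, the mount can be checked from inside the running container. The following is a rough sketch under stated assumptions: check_s3_mount is a hypothetical helper, the resource names come from the manifests above, and the test file content is a made-up sample. The helper only runs when kubectl is available.

```shell
# Check that the S3-backed volume is mounted and writable in the Nginx container.
# check_s3_mount is a hypothetical helper; names match the manifests above.
check_s3_mount() {
  # Wait until the Deployment reports a successful rollout.
  kubectl rollout status deployment/s3-app-deployment -n default --timeout=120s || return 1
  # Write a small sample file through the mount, then list the directory.
  kubectl exec deployment/s3-app-deployment -n default -- \
    sh -c 'echo "hello from S3" > /usr/share/nginx/html/index.html && ls -l /usr/share/nginx/html'
}

if command -v kubectl >/dev/null 2>&1; then
  check_s3_mount
else
  echo "kubectl not found; skipping mount check"
fi
```

A file written this way should also appear as an object in the connected bucket, which is a simple way to confirm that the volume is backed by S3 rather than by local disk.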
