Structure and Types of File Systems in Linux

Hostman Team
Technical writer

The Linux file system is organized as a tree that begins at the root directory and branches into directories and subdirectories; every file and mounted file system is attached to this single hierarchy. Its layout follows the Filesystem Hierarchy Standard (FHS), a specification maintained by the Linux Foundation.

Features of File Systems

A file system defines how files are named, stored, retrieved, and updated on a disk or storage partition. Its on-disk structure must follow a predefined format that the operating system understands.

The organization of a file system involves formatting, partitioning, and the method of storing organized data structures on a hard (or floppy) disk.

A file system stores two kinds of information: metadata (file name, creation date, size) and the user data itself.

Your computer uses this file system to determine the location of files in your storage.

For example, Windows' main file systems are NTFS, FAT, and FAT32. NTFS supports three types of file links: hard links, junction points, and symbolic links (NTFS Links); its structure is one of the most efficient and complex to date. FAT works differently: each cluster on the medium has an entry in the File Allocation Table, and these entries chain together the clusters that make up a file, starting from the first cluster. The original FAT supported only short filenames (eight characters plus a three-character extension); FAT16 and then FAT32 lifted some of these limitations.

Types of File Systems in Linux

File system types offered during the installation of a Linux-based OS include:

  • Ext
  • Ext2
  • Ext3
  • Ext4
  • JFS
  • XFS
  • Btrfs
  • Swap

These file system types have different functionalities and sets of predefined commands.

Ext — the extended file system. Introduced in 1992, it was one of the first file systems written specifically for Linux.

Its design was partly based on the traditional UNIX file system. The initial goal was to go beyond the file system used before it (MINIX) and overcome its limitations. Today it is hardly used.

Ext2 — the "second extended file system", released in 1993 as the successor to ext.

It raised storage limits and improved overall performance, allowing individual files of up to 2 TB. Like ext, however, it is not journaled and is largely obsolete, so it is best avoided for new installations.

Ext3 — the third extended file system, introduced in 2001. Its main advance over ext2 is journaling.

A journaling file system records pending changes (updates) to files and data in a separate journal before applying them to the main file system.

Thanks to the journal, the file system can be restored to a consistent state quickly after a crash or unexpected reboot.
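If you want to confirm that an existing ext partition actually has a journal, you can inspect its superblock with dumpe2fs from the e2fsprogs package. The device name below is only a placeholder for your own partition:

# Show superblock features; "has_journal" means the partition is journaled (ext3/ext4)
# Replace /dev/sda1 with the partition you want to inspect
sudo dumpe2fs -h /dev/sda1 | grep -i features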

Ext4 — the fourth extended file system. Created in 2006, it overcame many limitations of the third version. It is widely used today and is the default file system in most Linux distributions.

Although it may not be the most advanced, it is reliable and stable enough, and it is used across a wide range of Linux systems.

Therefore, if you don’t want to overthink the pros and cons of the many file systems you can choose from, experts recommend sticking with this one.
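To see which file systems your machine is already using, and to format a spare partition as ext4, the standard tools are df, lsblk, and mkfs.ext4. The device name below is a placeholder, and formatting destroys any data on the target partition:

# List mounted file systems and their types
df -T

# Show block devices together with the file systems they carry
lsblk -f

# Format an empty partition as ext4 (example device; this erases its contents)
sudo mkfs.ext4 /dev/sdb1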

Alternative File Systems

JFS — created by IBM in 1990. The name stands for Journaled File System. It recovers data easily after a power failure and is quite reliable. Moreover, it consumes less processor power than many other file systems.

XFS — a high-performance file system created by Silicon Graphics. Originally intended for their IRIX OS, it was later ported to Linux; XFS support on Windows is also available through third-party tools.

Developed in the early 1990s, XFS is a 64-bit, high-performance journaling file system. It works well with large files but is less efficient with many small ones.

Btrfs — an alternative file system developed with Oracle's involvement and merged into the Linux kernel in 2009. It is often positioned as a competitor to ext4; ext4 is generally regarded as faster and more stable, but Btrfs offers unique features such as snapshots, checksumming, and built-in volume management, and its overall performance is excellent.

Types of Linux Files

Linux file types include:

  • regular file
  • named pipe
  • device file
  • soft link (symbolic link)
  • directories
  • socket
  • door (found on Solaris-derived systems rather than mainstream Linux)

File types and their purpose:

  • Regular files (-) – storing character and binary data
  • Directories (d) – organizing access to files
  • Symbolic links (l) – providing access to files located on any media
  • Block devices (b) – interface for interacting with computer hardware
  • Character devices (c) – interface for interacting with computer hardware
  • Pipes (p) – organizing inter-process communication
  • Sockets (s) – organizing inter-process communication
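The type letter from this list is exactly what ls -l prints as the first character of each entry, and stat reports the same information in words. A quick way to check (the paths are common examples and may differ on your system):

# First character of the mode column: - regular, d directory, l link, b block, c character
ls -l /etc/passwd /etc /dev/sda /dev/tty

# stat names the file type explicitly
stat -c '%n: %F' /etc/passwd /etc /dev/sda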

A directory is a file that lists other files and provides pointers to them. It acts like a folder in a filing cabinet, grouping related files. But while a physical folder contains only documents, a directory may also contain other directories (subdirectories).

A symbolic (soft) link stores the name and location of another file. Most operations performed on the link, such as opening or reading it, are transparently redirected to the file it references.

A hard link, by contrast, points directly to the file's data, just like the original name does. Apart from the name, there is no difference between the original file and a hard link to the same data: both are regular files. A hard link can be distinguished from any other regular file only by its link count, shown in the second column of the ls -l listing. If that number is greater than 1, additional hard links to the same data exist.
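A short experiment makes the difference visible; the file names here are arbitrary examples:

# Create a file, then a hard link and a symbolic link to it
echo "hello" > original.txt
ln original.txt hardlink.txt        # hard link: same inode, link count goes to 2
ln -s original.txt softlink.txt     # symbolic link: its own inode pointing to the name

# -i prints inode numbers; the second column is the link count
ls -li original.txt hardlink.txt softlink.txt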

All physical devices used by Linux are represented by device files, which come in two kinds: character special files and block special files. Character special files represent devices that exchange data with Linux character by character; terminals and printers are examples of such devices.

Block special files represent devices such as hard disks, floppy disks, and CD-ROMs, which exchange data with the OS in blocks.

Device files are extremely powerful because they allow users to access hardware devices such as drives, modems, and printers as if they were data files. They can be easily moved and copied, and data can be transferred between devices often without using special commands or syntax.
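Both kinds are easy to spot in /dev, where the first character of the listing is b for block devices and c for character devices (the exact device names depend on your hardware):

# A disk (block device) and a terminal (character device); names vary by system
ls -l /dev/sda /dev/tty
# Typical output starts with 'b' and 'c' respectively, for example:
# brw-rw---- 1 root disk 8, 0 ... /dev/sda
# crw-rw-rw- 1 root tty  5, 0 ... /dev/tty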

Linux OS Directories

The Linux directory structure is tree-shaped (branching). It’s important to highlight a characteristic specific to Unix-like systems: these OSes aim for simplicity and treat every object as a sequence of bytes. In Unix, these sequences are represented as files. 

Unlike Windows, where each drive letter forms its own root, the Linux file system has a single root. The root directory, denoted by a forward slash (/), is where all other directories and OS files reside.

The entire Linux folder structure is represented in a single directory called the root directory.

Main Directories in the Root Directory

  • /home
    This is the home directory. Since Linux is a multi-user system, each user gets a separate directory here, accessible only to that user and the superuser.
  • /bin and /sbin
    bin stands for binary. This is where the OS stores essential user programs. Binary files are executables produced by compiling source code.
    sbin stands for system binary. This directory is reserved for system administration binaries needed for booting, recovery, and repair.
  • /opt
    Stands for "optional". This is where manually installed applications and programs are stored.
  • /usr
    usr stands for Unix System Resources. This directory contains user-level applications, unlike /bin or /sbin, which house system-level applications.
    Subdirectories under /usr include:
    • /usr/bin – most binary programs
    • /usr/include – header files needed for source code compilation
    • /usr/sbin – non-essential system administration binaries
    • /usr/lib – libraries
    • /usr/src – kernel source code and header files
    • /usr/share – architecture-independent files (documents, icons, fonts)
      Originally intended for all user-related content, /usr has evolved into a location for software and data used by users.
  • /lib, /lib32, /lib64
    These are directories of library files — programs used by other applications.
  • /boot
    Contains the static files needed to boot the machine: the bootloader, the kernel's executable image, and related configuration files.
  • /sys
    This is where the user interacts with the kernel. It is considered a structured path to the kernel. The directory is mounted with a virtual file system called sysfs, serving as the kernel interface for accessing data about connected devices.
  • /tmp
    Temporary files needed by applications during a session are stored here.
  • /dev
    Contains special device files that allow software to interact with peripherals. Device files are categorized into character and block devices.
    A block device performs data input/output in blocks (e.g., an SSD), while a character device handles input/output as a stream of characters (e.g., a keyboard).
  • /proc
    proc stands for process. This directory contains pseudo-files that provide information about running processes and system resources.
  • /run
    This directory is mounted with a virtual tmpfs file system and holds runtime files related to active processes. These files exist in RAM and disappear when the session ends.
  • /root
    The home directory for the superuser (administrator).
  • /srv
    The service catalog. If you use a web server, you can store data for a specific webpage here.
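All of these directories can be inspected on a running system, and the layout itself is documented in the hier manual page:

# Top-level directories of the root file system
ls -l /

# Built-in description of the file system hierarchy
man hier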

File System and Data Storage Paths on Physical Disk

Linux directories map file names to the locations of their data on the physical disk and reserve a predefined amount of space for metadata.

Every file is described by an inode (index node), which stores the addresses of the file's disk blocks together with its attributes.

Each file and directory in Linux therefore has an inode, and the inode holds a list of pointers to the disk blocks that contain the data.

A directory is itself a file with its own inode; its contents map the names it holds to their inode numbers.

One more note about inodes: inode numbers are unique within a file system, but several names may point to the same inode. This is exactly how hard links work, and the inode keeps a count of them.
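Inodes are easy to observe with standard tools (/etc/hostname is just a convenient example file):

# Show the inode number, link count, and other metadata of a file
stat /etc/hostname

# Print inode numbers alongside names
ls -i /etc | head

# Report inode usage per mounted file system
df -i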

Linux Architecture

The architecture of Linux consists of the hardware layer, the kernel, system libraries, and system utilities.

At the top is user space, where user applications run. Below this is the kernel space, where the OS kernel resides.

There is also a specific library collection called the GNU C Library (glibc). It provides the system call wrappers that bridge the kernel and user applications. Both user applications and the kernel operate in their own protected address spaces: each user process has its own virtual address space, while the kernel has a single shared address space.

The kernel structure includes three main levels:

  1. System Call Interface (SCI) – the top level that handles system calls (e.g., file writing).
  2. Core kernel code – an architecture-independent object shared across supported architectures.
  3. Architecture-specific code – forms the Board Support Package, designed specifically for the processor and platform of the given architecture.

Linux architecture is examined from various perspectives. A key goal of architectural decomposition is to enhance understanding.

Kernel Tasks

The kernel performs several functions:

  • Process management – determines which processes use the CPU, when, and for how long.
  • Memory management – monitors memory usage, allocation location, and duration.
  • Device drivers – serve as interpreters between hardware and processes.
  • System calls – handle service requests from active processes.

The kernel is invisible to the user and operates in its own realm (kernel space). What users see (browsers, files) exists in user space. These applications interact with the kernel through the System Call Interface.
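This interaction is easy to observe with strace, which prints the system calls a program issues while it runs:

# Run ls and summarize the system calls it made (open, read, write, close, ...)
strace -c ls /tmp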

Linux Operating Layers

  • Linux Kernel – OS software residing in memory that instructs the CPU.
  • Hardware – the physical machine consisting of RAM, CPU, and I/O devices like storage, network, and graphics. The CPU performs computations, reads memory, and writes to RAM.
  • User Processes – running programs managed by the kernel, collectively forming user space. These processes interact with each other via inter-process communication (IPC).

OS code executes on CPUs in two modes: kernel mode and user mode. Kernel mode has unrestricted hardware access, while user mode restricts access to memory, SCI, and CPU. This division also applies to memory (kernel space vs. user space) and enables complex operations like privilege separation and virtual machine creation.

Linux Distributions

A Linux distribution is a collection of software (typically open source) built on top of the Linux kernel. A distribution may include server software, administration tools, documentation, and various desktop applications.

It aims to offer a consistent interface, safe and simple software management, and often a specific operational purpose.

Linux is freely distributed and accessible through multiple means. It is used by individuals and organizations and is often combined with free or proprietary software.

A distribution typically includes all software needed for installation and use.

Popular Linux distributions include:

  • Red Hat
  • Ubuntu
  • Debian
  • CentOS
  • Arch Linux
  • Linux Mint

These distributions can be used by beginners and system administrators. For example, Ubuntu is suitable for novices due to its user-friendly interface. Arch Linux is more suited to professionals, offering fewer pre-installed packages.
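On an installed system, you can check which distribution and kernel you are running with two standard commands (os-release is present on most modern distributions):

# Distribution name and version
cat /etc/os-release

# Kernel release
uname -r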
