What is Code Review and When Is It Needed?
Hostman Team
Technical writer

You can write code. You can edit existing code. You can even rewrite it from scratch. There’s a lot you can do with code. But what’s the point if the code lives in its own echo chamber? If the same person writes, reviews, and edits it, critical errors can drift unnoticed from one version to the next without external evaluation. Code locked within the confines of a single text editor is highly likely to stagnate, accumulating inefficient constructs and poor architectural decisions, even if written by an experienced developer.

This is why every developer should understand what code review is, how it’s done, and what tools it requires. It’s important to present your code to others properly, gather feedback, and apply changes wisely. Only then can code stay fresh and efficient, and the applications built on it remain secure and high-performing.

Code review is the process of examining code by one or more developers to identify errors, improve quality, and increase readability.

Types of Code Review

1. Formal Review

A formal review is a strict code-checking process with clearly defined stages. It’s used in critical projects where errors can have serious consequences — for example, in finance or healthcare applications. The analysis covers not just the code but also the architecture, performance, and security. Reviewers often include not just developers but also testers and analysts.

For example, a company developing a banking app might follow these steps:

  • Development: A developer completes a new authentication module and submits a pull request via GitHub.
  • Analysis: A review group (2 senior developers + 1 security specialist) is notified and checks the code for logic, readability, and security (e.g., resistance to SQL injection and XSS attacks).
  • Discussion: Reviewers meet the developer over Zoom and give feedback.
  • Documentation: All notes are posted in GitHub comments and tracked in Jira. For instance, a database query behind a REST endpoint may be flagged as vulnerable, with a recommendation to use parameterized queries (see the sketch after this list).
  • Fixes: The developer updates the code and the pull request; the cycle repeats until approval.
  • Approval: Once reviewers are satisfied, the code is merged into the main branch.
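
To illustrate the kind of fix such a review might produce, here is a minimal Python sketch of the parameterized-query recommendation above. It assumes a SQLite database; the table and column names are hypothetical.

  import sqlite3

  conn = sqlite3.connect("bank.db")
  cur = conn.cursor()

  def find_user_unsafe(login):
      # Flagged by reviewers: string interpolation lets attacker-controlled
      # input rewrite the SQL statement (SQL injection).
      return cur.execute(f"SELECT id, name FROM users WHERE login = '{login}'")

  def find_user_safe(login):
      # Recommended fix: a parameterized query. The driver passes the value
      # separately from the statement, so it is never interpreted as SQL.
      return cur.execute("SELECT id, name FROM users WHERE login = ?", (login,))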

2. Informal Review

Informal code review is less strict and more flexible, usually involving:

  • Quick code discussions in chat or meetings
  • Showing code to a colleague in person
  • Asking an expert a technical question

This kind of review happens often in day-to-day work and is characterized by spontaneity, lack of documentation, informal reviewer choice, and shallow checks.

In simpler terms, it’s more like seeking advice than a formal third-party audit. It's a form of knowledge sharing.

Types include:

  • Over-the-Shoulder Review: One developer shows their code to another in real time (via screen share, chat message, or simply turning the monitor).
  • Ad-hoc Review: A developer sends code to a colleague asking them to check it when convenient, e.g., “I wrote this handler, but there’s an error. Can you take a look?”
  • Unstructured Team Review: Code is discussed at a team meeting, casually and collaboratively, often with knowledge sharing.

Feedback is given as recommendations, not mandates. Developers can ignore or reject suggestions.

Although informal reviews are less reliable than formal ones, they’re quicker and easier, and often complement formal reviews.

Examples of integration:

  • Preliminary Checks: Before a pull request, a dev shows code to a colleague to discuss and fix issues.
  • Informal Discussion During Formal Review: Reviewers may chat to resolve issues more efficiently.
  • Quick Fixes: Developers make changes right after oral feedback instead of long comment exchanges.

3. Pair Programming

Pair programming is when two developers work together on one machine: one writes code, and the other reviews it in real-time.

It’s literally simultaneous coding and reviewing, which helps catch bugs early.

Roles:

  • Driver: Writes code, focused on syntax and implementation.
  • Navigator: Reviews logic, looks for bugs, suggests improvements, and thinks ahead.

Roles can be switched regularly to keep both engaged.

Variants:

  • Strong Style: Navigator makes decisions, and the driver just types. It works well if one of the developers is more experienced.
  • Loose Pairing: Both share decision-making, swapping roles as needed.

Though less common in everyday practice, pair programming has clear advantages:

  • Instant Feedback: Bugs are fixed immediately.
  • In-depth Review: The second dev is deeply involved in writing the code.
  • On-the-job Learning: Juniors learn directly from experienced peers.

It’s more of a collaborative development method than a strict review.

4. Automated Review

Automated code review uses tools that analyze code for errors, style, and vulnerabilities without human intervention.

These tools are triggered automatically (e.g., after compilation, commit, or pull request).

These tools analyze the code, run tests (e.g., unit tests), and generate reports. Some can even auto-merge code if it passes all checks.

Automated code review is part of DevOps and is common in CI/CD pipelines before deploying to production.

Types:

  • Static Analysis: Checks the code without executing it, catching syntax errors, suspicious patterns, and style violations.
  • Dynamic Analysis: Runs the code to detect memory leaks, threading issues, and other runtime errors (the sketch after this list illustrates the difference).
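
As a rough illustration of the difference, here is a Python fragment with one defect a static analyzer such as Pylint or Flake8 flags without running anything, and one that only surfaces at runtime. It is a minimal sketch, not tied to any particular tool's report format.

  def is_admin(user):
      # A static analyzer flags this without executing it: comparison
      # with None should use "is", and the variable below is dead code.
      unused = 42
      return user == None

  def average(numbers):
      # Static analysis sees nothing suspicious here, but calling
      # average([]) raises ZeroDivisionError, a defect that only
      # dynamic analysis or tests actually running the code expose.
      return sum(numbers) / len(numbers)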

However, for now, such tools can't catch business-logic errors or architectural issues. As AI evolves, they will likely become better at "understanding" code.

When is Code Review Needed?

Ideally, you should conduct code reviews in both small and large-scale projects.

The only exceptions might be personal side-projects (pet projects), although even these can benefit from outside input.

Automated testing has become standard, from JavaScript websites to C++ libraries.

Still, code review can be skipped for:

  • Trivial changes (e.g., formatting, UI text updates)
  • Peripheral code (e.g., throwaway scripts, config files)
  • Auto-generated code — unless manually modified

In short, code most needs review when it plays a critical or central role in the app and was written by a human.

Main Stages of Conducting Code Review

Regardless of whether a review is formal, informal, or automated, there are several common stages.

Preparation for Review

Whether the written code is a new component for a production application or a modification of an existing method in a personal project, the developer is usually motivated to have it reviewed, either by fellow developers or by using automated testing tools.

Accordingly, the developer has goals for the review and a rough plan for how it should be conducted, at least in broad terms.

It’s important to understand who will participate in the review and whether they have the necessary competencies and authority. In the case of automated testing, it’s crucial to choose the right tools.

Otherwise, the goals of the review may not be achieved, and critical bugs might remain in the code.

Time constraints also matter: when all reviewers and testing tools will be ready to analyze the code, and how long it will take. It’s best to coordinate this in advance.

Before starting the actual review, it can also be helpful to self-review—go over the code yourself and try to spot any flaws. There might be problems that can be fixed immediately.

Once the developer is ready for the review, they notify the reviewers via chat, pull request, or just verbally.

Code Analysis and Error Detection

Reviewers study the code over a period of time. During this process, they prepare feedback in various formats: suggested fixes in an IDE, chat comments, verbal feedback, or testing reports.

The format of the feedback depends on the tools used by the development team, which vary from project to project.

Discussion of Edits and Recommendations

Reviewers and the developer conduct a detailed discussion of the reviewed codebase.

The goal is to improve the code while maintaining a productive dialogue. For instance, the developer might justify certain controversial decisions and avoid making some changes. Reviewers might also suggest non-obvious improvements that the developer hadn't considered.

Documentation and Task Preparation

All identified issues should be clearly documented and marked. Based on this, a list of tasks for corrections is prepared. Kanban boards or task managers are often used for this, e.g., Jira, Trello, and GitHub Issues.

Again, the documentation format depends on the tools used by the team.

Even a solo developer working on a personal project might write tasks down in a physical notebook—or, of course, in a digital one. Though keeping tasks in your head is also possible, it’s not recommended.

Nowadays, explicit tracking is better than implicit assumptions. Relying on memory and intuition can lead to mistakes.

Applying Fixes and Final Approval

Once the list of corrections is compiled, the developer can begin making changes. They often also leave responses to comments.

Bringing code to an acceptable state may take several review rounds. The process is repeated until both reviewers and the developer are satisfied.

It’s crucial to ensure the code is fully functional and meets the team’s quality standards.

After that, the final version of the code is merged into the main branch—assuming a version control system is being used.

Tools for Code Review

In most cases, code review is done using software tools. Broadly speaking, they fall into several categories:

  • Version control systems: Most cloud platforms built around version control (typically Git) offer integrated review tools for viewing, editing, and commenting on code.
  • Collaboration tools: Development teams often use not just messengers but also task managers or Kanban boards. These help with discussing code, assigning tasks, and sharing knowledge.
  • Automated analyzers: Each programming language has tools for static code analysis to catch syntax issues, enforce style rules, and identify potential vulnerabilities.
  • Automated tests: Once statically checked, the code is run through automated tests, usually via language-specific unit testing libraries (a minimal example follows this list).
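
As an example of the last category, here is a minimal pytest sketch of the kind of unit test such a pipeline might run automatically; the function under test is hypothetical.

  # test_discount.py; run with: pytest test_discount.py
  import pytest

  def apply_discount(price, percent):
      if not 0 <= percent <= 100:
          raise ValueError("percent must be between 0 and 100")
      return price * (1 - percent / 100)

  def test_apply_discount():
      assert apply_discount(100.0, 20) == 80.0

  def test_rejects_invalid_percent():
      # The review pipeline also checks that error handling works as intended.
      with pytest.raises(ValueError):
          apply_discount(100.0, 150)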

This article only covers the most basic tools that have become standard regardless of domain or programming language.

GitHub / GitLab / Bitbucket

GitHub, GitLab, and Bitbucket are cloud-based platforms for collaborative code hosting based on Git.

Each offers tools for convenient code review. On GitHub and Bitbucket, this is called a Pull Request, while on GitLab it’s a Merge Request.

Process:

  1. The developer creates a Pull/Merge Request, which gathers the code changes, commit history, and subsequent reviewer comments in one place.
  2. Reviewers leave inline comments and general feedback.
  3. After discussion, reviewers either approve the changes or request revisions.

Each platform also provides CI/CD tools for running automated tests:

  • GitHub Actions
  • GitLab CI/CD
  • Bitbucket Pipelines

These platforms are considered the main tools for code review. The choice depends on team preference: the tools are generally similar but differ in details.

Crucible

Atlassian Crucible is a specialized tool dedicated solely to code review. It supports various version control systems: Git, SVN, Mercurial, Perforce.

Crucible suits teams needing a more formalized review process, with detailed reports and customizable settings. It integrates tightly with Jira for project management.

Unlike GitHub/GitLab/Bitbucket, Crucible is a self-hosted solution. It runs on company servers or private clouds.

Pros and cons:

  Platform                       Deployment    Managed by        Maintenance Complexity
  GitHub / GitLab / Bitbucket    Cloud         Developer         Low
  Atlassian Crucible             On-premise    End user/admin    High

Crucible demands more setup but allows organizations to enforce internal security and data policies.

Other Tools

Each programming language has its own specialized tools for runtime and static code analysis:

  • C/C++: Valgrind for memory debugging
  • Java: JProfiler, YourKit for profiling; Checkstyle, PMD for syntax checking
  • Python: PyInstrument for performance profiling; Pylint and Flake8 for quality analysis (see the sketch after this list)
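
To give a feel for the profiling side, here is a minimal PyInstrument sketch (assuming the pyinstrument package is installed); the profiled function is a deliberately slow stand-in for real application code.

  from pyinstrument import Profiler

  def slow_sum(n):
      # Naive loop; the profiler report will show the time spent here,
      # pointing at code worth optimizing.
      total = 0
      for i in range(n):
          total += i
      return total

  profiler = Profiler()
  profiler.start()
  slow_sum(2_000_000)
  profiler.stop()
  print(profiler.output_text())  # human-readable call-tree report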

These tools often integrate into CI/CD pipelines run by systems like GitHub Actions, GitLab CI, CircleCI, Jenkins.

Thus, formal code review tools are best used within a unified CI/CD pipeline to automatically test and build code into a final product.

Best Practices and Tips for Code Review

1. Make atomic changes

Smaller changes are easier and faster to review. It’s better to submit multiple focused reviews than one large, unfocused one.

This aligns with the “Single Responsibility Principle” in SOLID. Each review should target a specific function so reviewers can focus deeply on one area.

2. Automate everything you can

Automation reduces human error. Static analyzers, linters, and unit tests catch issues faster and more reliably.

Automation also lowers developers’ cognitive load and allows them to focus on more complex coding tasks.

3. Review code, not the developer

Code reviews are about the code, not the person writing it. Criticism should target the work, not the author. Maintain professionalism and use constructive language.

A good review motivates and strengthens teamwork. A bad one causes stress and conflict.

4. Focus on architecture and logic

Beautiful code can still have flawed logic. Poor architecture makes maintenance and scaling difficult.

Pay attention to structure—an elegant algorithm means little in a badly designed system.

5. Use checklists for code reviews

Checklists help guide your review and ensure consistency. A basic checklist might include:

  • Is the code readable?
  • Is it maintainable?
  • Is there duplication?
  • Is it covered by tests?
  • Does it align with architectural principles?

You can create custom code review checklists for specific projects or teams.

6. Discuss complex changes in person

Sometimes it’s better to talk in person (or via call) than exchange messages—especially when dealing with broad architectural concerns.

For specific code lines, written comments might be more effective due to the ability to reference exact snippets.

7. Code should be self-explanatory

Good code speaks for itself. The simpler it is, the fewer bugs it tends to have.

When preparing code for review, remember that other developers will read it. The clarity of the code affects the quality of the review.

Put yourself in the reviewers’ shoes and ensure your decisions are easy to understand.

Conclusion

Code review is a set of practices to ensure code quality through analysis and subsequent revisions. It starts with syntax and architecture checks and ends with performance and security testing.

Reviews can be manual, automated, or both. Typically, new code undergoes automated tests first, then manual review—or the reverse.

If everything is in order, the code goes into production. If not, changes are requested, code is updated, and the process is repeated until the desired quality is achieved.
