Structure and Types of File Systems in Linux

Technical writer

Infrastructure

05.05.2025

11 min read

The Linux file system is a tree that begins at the root directory and branches into directories and subdirectories. Every file, including files on other mounted file systems, is reachable from that single tree. The layout follows the Filesystem Hierarchy Standard (FHS), a standard maintained by the Linux Foundation.

Features of File Systems

A file system is how files are named, stored, retrieved, and updated on a disk or storage partition. The file system's structure must have a predefined format that the operating system understands.

Organizing a file system involves partitioning the disk, formatting a partition, and choosing how data structures are laid out on the storage medium.

A file system stores two kinds of information: metadata (file name, creation date, size) and user data.

Your computer uses this file system to determine the location of files in your storage.

For example, the main file systems on Windows are NTFS and the FAT family (FAT16, FAT32). NTFS supports three types of file links: hard links, junction points, and symbolic links. Its structure is among the more complex and efficient of the commonly used file systems. FAT is simpler: each cluster on the medium has an entry in the file allocation table (FAT), and each entry records which cluster holds the next part of the file. A file's clusters therefore form a chain that starts at its first cluster. Early FAT versions limited file names to eight characters plus a three-character extension; later versions, FAT16 and then FAT32, raised the limits on volume and file size, and long file names were added as an extension.
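The cluster chain can be sketched in a few lines of Python. This is a toy model, not the on-disk FAT format: the table is a plain dictionary, and `END_OF_CHAIN`, `cluster_chain`, and the cluster numbers are names and values made up for illustration.

```python
# Toy file allocation table: key = cluster number,
# value = next cluster of the same file, or END_OF_CHAIN for the last one.
END_OF_CHAIN = -1  # hypothetical marker; real FAT variants use reserved values


def cluster_chain(fat, first_cluster):
    """Return the clusters that make up a file, in order."""
    chain = []
    seen = set()
    cluster = first_cluster
    while cluster != END_OF_CHAIN:
        if cluster in seen:
            raise ValueError("loop in cluster chain")
        seen.add(cluster)
        chain.append(cluster)
        cluster = fat[cluster]
    return chain


# A file that starts at cluster 2 and continues in clusters 5 and 3.
fat = {2: 5, 5: 3, 3: END_OF_CHAIN}
print(cluster_chain(fat, 2))  # [2, 5, 3]
```

The file's directory entry only needs to store the first cluster; everything else is found by following the table.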

Types of File Systems in Linux

File system types offered during the installation of a Linux-based OS include:

Ext
Ext2
Ext3
Ext4
JFS
XFS
Btrfs
Swap

These file system types have different functionalities and sets of predefined commands.

Ext: extended file system. It was introduced in 1992 and was the first file system created specifically for Linux.

Its functionality was partly developed based on the UNIX file system. The initial goal was to go beyond the file system used before it (MINIX) and overcome its limitations. Today it is hardly used.

Ext2: second extended file system. Introduced in 1993 as a replacement for ext.

It raised the storage limits and improved overall performance: with the common 4 KB block size, a single file can be up to 2 TB. Ext2 has no journal, so it is rarely chosen for new installations today.

Ext3: third extended file system. Introduced in 2001. Its main improvement over ext2 is journaling.

A journaling file system is one that writes changes (updates) to files and data in a separate journal before these actions are completed.

After a crash or power loss, the file system replays the journal the next time it is mounted, which restores a consistent state without a full disk check.
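The idea of writing the intent first and replaying it after a crash can be sketched at the application level. This is only an illustration of the principle, assuming a one-entry JSON journal; ext3 journals disk blocks inside the kernel, and the function names and file names below are hypothetical.

```python
import json
import os
import tempfile


def journaled_write(journal_path, data_path, text):
    """Record the intended change in the journal, then apply it."""
    # 1. Write the intent to the journal and force it to disk.
    with open(journal_path, "w") as journal:
        json.dump({"path": data_path, "text": text}, journal)
        journal.flush()
        os.fsync(journal.fileno())
    # 2. Apply the change to the real file.
    with open(data_path, "w") as data:
        data.write(text)
    # 3. The change is complete, so the journal entry can be dropped.
    os.remove(journal_path)


def recover(journal_path):
    """Replay a journal entry that was recorded but never completed."""
    if not os.path.exists(journal_path):
        return False  # nothing was pending
    with open(journal_path) as journal:
        entry = json.load(journal)
    with open(entry["path"], "w") as data:
        data.write(entry["text"])
    os.remove(journal_path)
    return True


with tempfile.TemporaryDirectory() as tmp:
    journal_file = os.path.join(tmp, "journal.json")
    data_file = os.path.join(tmp, "data.txt")
    # Simulate a crash after step 1: the journal exists, the data file does not.
    with open(journal_file, "w") as f:
        json.dump({"path": data_file, "text": "new contents"}, f)
    replayed = recover(journal_file)
    with open(data_file) as f:
        recovered_text = f.read()
    print(replayed, recovered_text)  # True new contents
```

Because the intent reaches stable storage before the data file is touched, an interrupted update can always be finished on the next start.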

Ext4: fourth extended file system. Development began in 2006, and it was declared stable in 2008. It overcame many limitations of the third version. It is widely used today and is the default file system in many Linux distributions.

Although it may not be the most advanced, it is reliable and stable, so it is used across a wide range of Linux systems.

Therefore, if you don’t want to overthink the pros and cons of the many file systems you can choose from, experts recommend sticking with this one.

Alternative File Systems

JFS: Journaled File System, created by IBM in 1990 for AIX and later ported to Linux. It recovers quickly after a power failure and is considered reliable. It is also known for low CPU usage compared with other file systems.

XFS: a high-performance file system created by Silicon Graphics. It was originally written for their IRIX OS and later ported to Linux. Windows has no native XFS support.

First released in 1994, XFS is a 64-bit journaling file system. It works well with large files but is less well suited to workloads with many small files.

Btrfs: a copy-on-write file system developed at Oracle starting in 2007 and merged into the mainline Linux kernel in 2009. It is often compared with ext4, which is generally regarded as the faster and more mature of the two. Btrfs, however, offers features that ext4 lacks, such as snapshots, subvolumes, data checksums, and built-in multi-device support.

Types of Linux Files

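Linux file types include:

regular file
directory
symbolic (soft) link
block device file
character device file
named pipe
socket

Some Unix references also list the door, but that type exists on Solaris, not on Linux.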

File type	Purpose
Regular files (`-`)	Storing text and binary data
Directories (`d`)	Organizing access to files
Symbolic links (`l`)	Pointing to a file or directory by path, on any media
Block devices (`b`)	Interface to hardware that transfers data in blocks
Character devices (`c`)	Interface to hardware that transfers data as a character stream
Pipes (`p`)	Organizing inter-process communication
Sockets (`s`)	Organizing inter-process communication
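The type letters in the table are the first character of each line in an `ls -l` listing. As a small sketch, the same letter can be obtained from Python's standard `stat` module; the helper name `file_type_char` and the temporary file names are hypothetical.

```python
import os
import stat
import tempfile


def file_type_char(path):
    """Return the type character that `ls -l` shows for path."""
    mode = os.lstat(path).st_mode  # lstat: do not follow symbolic links
    return stat.filemode(mode)[0]


with tempfile.TemporaryDirectory() as tmp:
    regular = os.path.join(tmp, "file.txt")
    link = os.path.join(tmp, "link")
    fifo = os.path.join(tmp, "pipe")
    open(regular, "w").close()
    os.symlink(regular, link)
    os.mkfifo(fifo)  # named pipe
    types = {
        "directory": file_type_char(tmp),
        "regular": file_type_char(regular),
        "link": file_type_char(link),
        "pipe": file_type_char(fifo),
    }
    print(types)
    # {'directory': 'd', 'regular': '-', 'link': 'l', 'pipe': 'p'}
```

On a typical Linux system, calling `file_type_char("/dev/null")` should return `c`, since `/dev/null` is a character device.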

A directory is a file that lists other files and directories and provides pointers to them. It works like a folder in a filing cabinet, grouping related files, except that a directory can also contain other directories (subdirectories).

A symbolic (soft) link stores the name and location of another file. Opening the link, for example to read or edit it, acts on the file it references; removing or renaming the link affects only the link itself.

A hard link works differently: it is an additional name for the same data. A hard link points to the actual data in the file just like a regular file name does. Apart from the name, there is no difference between the original file and a hard link pointing to the same data, and both are regular files. A hard link can only be distinguished from any other regular file by its link count, which is shown in the second column of the `ls -l` listing. If the number is greater than 1, additional hard links to the data exist.
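The difference between the two kinds of link shows up when the original name is removed. The following sketch uses only standard `os` calls on temporary files; the file names are arbitrary.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    original = os.path.join(tmp, "original.txt")
    hard = os.path.join(tmp, "hard.txt")
    soft = os.path.join(tmp, "soft.txt")
    with open(original, "w") as f:
        f.write("data")

    os.link(original, hard)     # hard link: a second name for the same data
    os.symlink(original, soft)  # symbolic link: a small file that stores a path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink  # the second column of `ls -l`
    soft_target = os.readlink(soft)

    # Remove the original name and see what each link still reaches.
    os.remove(original)
    with open(hard) as f:
        hard_content = f.read()
    soft_dangling = os.path.islink(soft) and not os.path.exists(soft)

    print(same_inode, link_count, hard_content, soft_dangling)
    # True 2 data True
```

The hard link keeps the data alive because the data is only freed when its last name is removed, while the symbolic link is left pointing at a path that no longer exists.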

All physical devices used by Linux are represented by device files. Device files are classified as character special or block special. Character special files represent devices that exchange data with Linux character by character, such as printers and terminals.

Block special files represent devices such as hard disks, floppy disks, and CD-ROM drives, which exchange data with the OS in blocks.

Device files are powerful because they let programs access hardware such as drives, modems, and printers through the same read and write operations used for ordinary files. Data can often be transferred to or from a device without special commands or syntax.

Linux OS Directories

The Linux directory structure is tree-shaped (branching). It’s important to highlight a characteristic specific to Unix-like systems: these OSes aim for simplicity and treat every object as a sequence of bytes. In Unix, these sequences are represented as files.

Unlike Windows, which has a separate root for each drive (C:\, D:\), Linux has a single root directory, denoted by a forward slash (/). All other directories and files, including those on other disks, reside beneath it.

Main Directories in the Root Directory

/home
This is where users' home directories live. Since Linux is a multi-user environment, each user is assigned a separate directory under /home, which is typically accessible only to that user and the superuser.
/bin and /sbin
bin stands for binary. This is where the OS stores essential user commands. Binary files are executables that contain compiled code.
sbin stands for system binary. This directory holds system administration programs, including those needed for booting, recovery, and repair; they are normally run by the superuser.
/opt
Stands for "optional". Add-on software packages that are not part of the base system, often installed manually, are stored here.
/usr
usr is commonly expanded as Unix System Resources, although the name historically referred to user directories. It holds the bulk of installed programs, libraries, and shared data, whereas /bin and /sbin hold the essential programs needed to boot and repair the system.
Subdirectories under /usr include:
- /usr/bin – most binary programs
- /usr/include – header files needed for source code compilation
- /usr/sbin – non-essential system administration programs
- /usr/lib – libraries
- /usr/src – kernel source code and header files
- /usr/share – architecture-independent files (documents, icons, fonts)
Originally intended for all user-related content, /usr has evolved into a location for software and data shared by all users.
/lib, /lib32, /lib64
These directories hold shared libraries: code that programs load at run time instead of carrying their own copy.
/boot
Holds the static files needed to start the system: the kernel image, the initial RAM disk, and the bootloader's configuration files.
/sys
This directory exposes kernel data to user space. It is mounted with a virtual file system called sysfs, which serves as the kernel interface for viewing and configuring connected devices and drivers.
/tmp
Temporary files needed by applications during a session are stored here.
/dev
Contains special device files that allow software to interact with peripherals. Device files are categorized into character and block devices.
A block device performs data input/output in blocks (e.g., an SSD), while a character device handles input/output as a stream of characters (e.g., a keyboard).
/proc
proc stands for process. This directory is a virtual file system whose pseudo-files provide information about running processes and system resources.
/run
This directory is mounted with a virtual tmpfs file system and holds runtime files related to active processes. These files exist in RAM and disappear when the system is shut down or rebooted.
/root
The home directory for the superuser (administrator).
/srv
Holds data for services provided by the system. If you run a web server, for example, you can store a site's files here.
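Several of the directories above (/sys, /proc, /run) are mount points of virtual file systems rather than locations on disk. On Linux, the kernel lists every mount and its file system type in /proc/mounts. The sketch below parses that layout; `parse_mounts` is a hypothetical helper, and the sample lines are illustrative values, not output from a real machine.

```python
def parse_mounts(text):
    """Map each mount point to its file system type."""
    mounts = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 3:
            # Layout: device, mount point, type, options, dump, pass
            mount_point, fs_type = fields[1], fields[2]
            mounts[mount_point] = fs_type
    return mounts


# Sample lines in the /proc/mounts layout (illustrative, not real output).
sample = """\
sysfs /sys sysfs rw,nosuid,nodev,noexec 0 0
proc /proc proc rw,nosuid,nodev,noexec 0 0
tmpfs /run tmpfs rw,nosuid,nodev,mode=755 0 0
/dev/sda1 / ext4 rw,relatime 0 0
"""

for mount_point, fs_type in parse_mounts(sample).items():
    print(mount_point, fs_type)
```

On a Linux machine, the same function can be fed the real data by reading the file first, for example `parse_mounts(open("/proc/mounts").read())`.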

File System and Data Storage Paths on Physical Disk

Directories map file names to inodes, and inodes map files to their locations on the physical disk.

Every file and directory has an inode (index node). An inode stores the file's attributes (owner, permissions, size, timestamps) together with pointers to the disk blocks that hold its data. It does not store the file's name. On ext file systems, inodes have a fixed size that is set when the file system is created.

A directory is itself a file whose contents are a list of entries, each pairing a name with an inode number.

Another note about inodes: inode numbers are unique within a file system, but names are not, because several names can refer to the same inode. This is why each inode keeps a count of the hard links that point to it.
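The split between names (kept in the directory) and attributes (kept in the inode) can be observed from user space. This sketch uses `os.scandir` and `os.stat` from the standard library; `directory_entries` and the file name are hypothetical.

```python
import os
import tempfile


def directory_entries(path):
    """Return {name: inode number} for the entries of a directory."""
    with os.scandir(path) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "notes.txt")
    with open(path, "w") as f:
        f.write("hello")

    entries = directory_entries(tmp)  # what the directory stores
    info = os.stat(path)              # what the inode stores

    print(entries)
    print(info.st_ino, info.st_size, info.st_nlink)
```

The directory listing yields only a name and an inode number, while size, link count, permissions, and timestamps all come from the inode via `os.stat`.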

Linux Architecture

The architecture of Linux consists of the hardware layer, the kernel, system libraries, and system utilities.

At the top is user space, where user applications run. Below this is the kernel space, where the OS kernel resides.

Between them sits the GNU C Library (glibc). It provides the system call interface that connects user applications to the kernel. User applications and the kernel operate in separate, protected address spaces: each user process has its own virtual address space, while the kernel has a single address space.

The kernel structure includes three main levels:

System Call Interface (SCI) – the top level, which handles system calls (e.g., writing to a file).
Core kernel code – architecture-independent code shared by all supported architectures.
Architecture-specific code – the part often called the Board Support Package (BSP), written for the processor and platform of a given architecture.

The architecture can be described from several perspectives; splitting it into layers like this is mainly a way to make it easier to understand.

Kernel Tasks

The kernel performs several functions:

Process management – determines which processes use the CPU, when, and for how long.
Memory management – tracks how much memory is in use, what it stores, and where.
Device drivers – serve as interpreters between hardware and processes.
System calls – handle service requests from active processes.

The kernel is invisible to the user and operates in its own realm (kernel space). What users see (browsers, files) exists in user space. These applications interact with the kernel through the System Call Interface.
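A user-space program reaches the kernel through calls such as open, write, read, and close. Python's `os` module exposes low-level functions that correspond closely to these calls, so they make a convenient sketch of the interaction. The file name is arbitrary, and the exact system calls issued underneath may differ between platforms and library versions.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.txt")

    # Ask the kernel to create the file and return a file descriptor.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o644)
    written = os.write(fd, b"hello kernel\n")  # returns the bytes written
    os.close(fd)

    # Ask the kernel to read the data back.
    fd = os.open(path, os.O_RDONLY)
    data = os.read(fd, 100)  # read at most 100 bytes
    os.close(fd)

    print(written, data)  # 13 b'hello kernel\n'
```

The program never touches the disk itself: it only holds a file descriptor, and every operation on it is a request that the kernel carries out in kernel space.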

Linux Operating Layers

Linux Kernel – OS software residing in memory that instructs the CPU.
Hardware – the physical machine consisting of RAM, CPU, and I/O devices like storage, network, and graphics. The CPU performs computations and reads from and writes to memory.
User Processes – running programs managed by the kernel, collectively forming user space. These processes interact with each other via inter-process communication (IPC).
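One of the simplest forms of IPC between user processes is a pipe. The sketch below, assuming a Unix-like system where `os.fork` is available, creates a child process that sends a message to its parent; `ask_child` is a hypothetical helper name.

```python
import os


def ask_child(message):
    """Fork a child that sends `message` back to the parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child process: write to the pipe, then exit without cleanup handlers.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
        finally:
            os._exit(0)
    # Parent process: read what the child wrote.
    os.close(write_fd)
    data = os.read(read_fd, 1024)
    os.close(read_fd)
    os.waitpid(pid, 0)  # collect the child's exit status
    return data


print(ask_child(b"ping"))  # b'ping'
```

The two processes share no memory here; the kernel owns the pipe buffer and copies the bytes from one process to the other.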

Code executes on the CPU in one of two modes: kernel mode and user mode. Code running in kernel mode has unrestricted access to the hardware, while user mode restricts access to the CPU and memory, permitting it only through the SCI. The same division applies to memory (kernel space vs. user space) and makes features such as privilege separation and virtual machines possible.

Linux Distributions

A Linux distribution combines the OS kernel with a collection of applications (typically open-source). A distribution may include server software, admin tools, documentation, and various desktop applications.

It aims to offer a consistent interface, safe and simple software management, and often a specific operational purpose.

Linux is freely distributed and accessible through multiple means. It is used by individuals and organizations and is often combined with free or proprietary software.

A distribution typically includes all software needed for installation and use.

Popular Linux distributions include:

Red Hat Enterprise Linux (RHEL)
Ubuntu
Debian
CentOS
Arch Linux
Linux Mint

These distributions can be used by beginners and system administrators. For example, Ubuntu is suitable for novices due to its user-friendly interface. Arch Linux is more suited to professionals, offering fewer pre-installed packages.
