The Linux file system is a complex tree-structured system that begins at the root. It consists of directories and subdirectories. Every file and file system is interconnected. This structure follows the scheme defined by the Filesystem Hierarchy Standard (FHS), a standard maintained by the Linux Foundation.
A file system is how files are named, stored, retrieved, and updated on a disk or storage partition. The file system's structure must have a predefined format that the operating system understands.
The organization of a file system involves formatting, partitioning, and the method of storing organized data structures on a hard (or floppy) disk.
A file system is divided into two segments: metadata (file name, creation date, size) and user data.
Your computer uses this file system to determine the location of files in your storage.
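The split between metadata and user data can be observed from a program. The following is a minimal sketch: the helper name `describe` and the file `note.txt` are made up for illustration, and it reports the modification time because the creation date is not exposed by every Linux file system.

```python
import os
import tempfile
import time


def describe(path):
    """Return a file's metadata (from stat) and its user data (the contents)."""
    info = os.stat(path)
    metadata = {
        "name": os.path.basename(path),
        "size_bytes": info.st_size,
        "modified": time.ctime(info.st_mtime),
    }
    with open(path, "rb") as handle:
        user_data = handle.read()
    return metadata, user_data


# Create a throwaway file so the example does not depend on existing paths.
with tempfile.TemporaryDirectory() as workdir:
    sample = os.path.join(workdir, "note.txt")
    with open(sample, "wb") as handle:
        handle.write(b"hello")
    metadata, user_data = describe(sample)
    print(metadata)
    print(user_data)
```

The metadata comes from the file system's bookkeeping, while the user data is read from the file itself.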
For example, Windows' main file systems are NTFS, FAT, and FAT32. NTFS supports three types of file links: hard links, junction points, and symbolic links (NTFS Links). The NTFS structure is one of the most efficient and complex to date. In FAT, each cluster on the medium has an entry in the file allocation table (FAT). Each entry records which part of a file occupies that cluster and points to the file's next cluster, so a file's entries form a chain starting from its first cluster. The first FAT systems could handle only eight-character filenames (plus a three-character extension); some of their limitations were lifted in FAT16 and then in FAT32.
File system types offered during the installation of a Linux-based OS include Ext, Ext2, Ext3, Ext4, JFS, XFS, and Btrfs.
These file system types have different functionalities and sets of predefined commands.
Ext (extended file system) was introduced in 1992 and is considered one of the first file systems created specifically for Linux.
Its functionality was partly based on the UNIX file system. The initial goal was to overcome the limitations of the MINIX file system used before it. Today it is hardly used.
Ext2 ("second extended file system") has been available since 1993. It was developed as a replacement for the previous file system.
It increased storage capacity and improved overall performance. It allows storing up to 2 TB of data. Like ext, it has little prospect, so it should be avoided.
Ext3 — third extended file system. Introduced in 2001. It surpasses the previous one in that it is journaled.
A journaling file system is one that writes changes (updates) to files and data in a separate journal before these actions are completed.
After a crash or power failure, the file system replays the journal on the next boot, which allows it to recover files and return to a consistent state.
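The journaling idea can be illustrated with a toy model. This is only a sketch of the write-ahead principle, not how ext3 works internally: the function names, the JSON journal format, and the `crash_before_apply` flag are all assumptions made for the example.

```python
import json
import os
import tempfile


def apply_and_clear(journal_path):
    """Apply the journaled change to its target, then remove the journal."""
    with open(journal_path) as journal:
        entry = json.load(journal)
    with open(entry["target"], "w") as target:
        target.write(entry["text"])
    os.remove(journal_path)


def journaled_write(journal_path, target_path, text, crash_before_apply=False):
    """Record the intended change in the journal first, then apply it."""
    with open(journal_path, "w") as journal:
        json.dump({"target": target_path, "text": text}, journal)
        journal.flush()
        os.fsync(journal.fileno())  # make sure the journal reaches the disk
    if crash_before_apply:
        return  # simulate a power failure after journaling
    apply_and_clear(journal_path)


def recover(journal_path):
    """On startup, replay a pending journal entry if one exists."""
    if os.path.exists(journal_path):
        apply_and_clear(journal_path)
        return True
    return False


with tempfile.TemporaryDirectory() as workdir:
    journal = os.path.join(workdir, "journal.json")
    data = os.path.join(workdir, "data.txt")
    journaled_write(journal, data, "new contents", crash_before_apply=True)
    print("data written before recovery:", os.path.exists(data))
    print("journal replayed:", recover(journal))
    with open(data) as handle:
        print("data after recovery:", handle.read())
```

Because the change is recorded before it is applied, an interrupted write can be completed later from the journal instead of being lost.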
Ext4 — fourth extended system. Created in 2006. It overcame many limitations of the third version. It is widely used today and is the default file system in most Linux distributions.
Although it may not be the most advanced, it is reliable and stable, so it is commonly used across a wide range of Linux systems.
Therefore, if you don’t want to overthink the pros and cons of the many file systems you can choose from, experts recommend sticking with this one.
JFS — created by IBM in 1990. The name JFS stands for Journaling File System. It easily restores data after a power failure and is quite reliable. Moreover, it consumes less processor power than other file systems.
XFS — high-performance file system. Created by Silicon Graphics. Originally intended for their IRIX OS, it was later ported to Linux. Today, XFS for Windows also exists.
Created in 1993, XFS is a 64-bit high-performance journaling file system. It works well with large files but is less efficient with many small ones.
Btrfs is an alternative file system developed at Oracle and merged into the mainline Linux kernel in 2009. It is considered a competing file system to Ext4, although the latter is generally regarded as the better version (faster data transfer, more stability). However, Btrfs has several unique advantages, such as copy-on-write snapshots and data checksums. Overall, it offers excellent performance.
Linux file types include:

| File type | Purpose |
| --- | --- |
| Regular files | Storing character and binary data |
| Directories | Organizing access to files |
| Symbolic links | Providing access to files located on any media |
| Block devices, character devices | Interface for interacting with computer hardware |
| Pipes, sockets | Organizing inter-process communication |
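The file types listed above can be told apart programmatically. The sketch below uses Python's standard `stat` module; the function name `file_type` and the temporary file names are illustrative only.

```python
import os
import stat
import tempfile


def file_type(path):
    """Name the file type of `path` without following symbolic links."""
    mode = os.lstat(path).st_mode
    checks = [
        (stat.S_ISREG, "regular file"),
        (stat.S_ISDIR, "directory"),
        (stat.S_ISLNK, "symbolic link"),
        (stat.S_ISBLK, "block device"),
        (stat.S_ISCHR, "character device"),
        (stat.S_ISFIFO, "pipe"),
        (stat.S_ISSOCK, "socket"),
    ]
    for predicate, label in checks:
        if predicate(mode):
            return label
    return "unknown"


with tempfile.TemporaryDirectory() as workdir:
    regular = os.path.join(workdir, "data.bin")
    open(regular, "wb").close()
    link = os.path.join(workdir, "data.link")
    os.symlink(regular, link)
    fifo = os.path.join(workdir, "queue")
    os.mkfifo(fifo)  # named pipe; available on Unix-like systems
    for path in (regular, workdir, link, fifo):
        print(os.path.basename(path), "->", file_type(path))
```

`os.lstat` is used instead of `os.stat` so that a symbolic link is reported as a link rather than as the file it points to.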
A directory is a file that contains entries for other files and directories and provides pointers to them. It acts like a folder in a filing cabinet, grouping related files. But while a cabinet folder contains only files, a directory may also contain other directories (subdirectories).
A symbolic (soft) link points to the name and location of a specific file. When a user reads, writes, or copies through the link, the operation is performed on the file it references; removing or renaming the link affects only the link itself.
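A short experiment shows this behavior. It is a sketch with made-up file names (`report.txt`, `latest`) run inside a temporary directory.

```python
import os
import tempfile


def symlink_demo(workdir):
    """Create a file and a symbolic link to it, then act on each."""
    target = os.path.join(workdir, "report.txt")
    link = os.path.join(workdir, "latest")
    with open(target, "w") as handle:
        handle.write("v1")
    os.symlink(target, link)

    # Writing through the link changes the referenced file.
    with open(link, "w") as handle:
        handle.write("v2")
    with open(target) as handle:
        content_after_write = handle.read()

    stored_path = os.readlink(link)  # the link itself only stores a path
    os.remove(link)                  # removes the link, not the target
    return content_after_write, stored_path == target, os.path.exists(target)


with tempfile.TemporaryDirectory() as workdir:
    print(symlink_demo(workdir))
```

The returned tuple reports the target's contents after writing through the link, whether the link stored the target's path, and whether the target survived removal of the link.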
Hard links work differently. A hard link points to the actual data in the file, just like a regular file name. Apart from the name, there is no difference between the original file and a hard link pointing to the same data; both are regular files. A hard link can only be distinguished from any other regular file by its link count, which is shown in the second column of the ls -l listing. If the number is greater than 1, then additional hard links to the data exist.
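The link count can also be read from a program. In this sketch, `st_nlink` is the same number that the listing shows in its second column; the file names are illustrative.

```python
import os
import tempfile


def hard_link_demo(workdir):
    """Create a file, add a hard link, and report the link counts."""
    original = os.path.join(workdir, "original.txt")
    alias = os.path.join(workdir, "alias.txt")
    with open(original, "w") as handle:
        handle.write("shared data")

    count_before = os.stat(original).st_nlink
    os.link(original, alias)  # second name for the same data
    count_after = os.stat(original).st_nlink

    # Both names refer to the same underlying file.
    same_file = os.path.samefile(original, alias)
    return count_before, count_after, same_file


with tempfile.TemporaryDirectory() as workdir:
    print(hard_link_demo(workdir))
```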
All physical devices used by Linux are represented by device files. Device files are classified as character special or block special files. Character special files represent devices that exchange data with Linux character by character. Printers are an example of such devices.
Block special files represent devices such as hard disks, floppy disks, and CD-ROMs, which exchange data with the OS in blocks.
Device files are extremely powerful because they allow users to access hardware devices such as drives, modems, and printers as if they were data files. They can be easily moved and copied, and data can be transferred between devices often without using special commands or syntax.
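Because device files behave like ordinary files, they can be opened with the usual file API. The sketch below assumes a Linux system that provides the common pseudo-devices `/dev/zero`, `/dev/urandom`, and `/dev/null`; the helper names are made up for the example.

```python
import os


def read_device(path, count):
    """Read `count` bytes from a device file using ordinary file I/O."""
    with open(path, "rb", buffering=0) as device:
        return device.read(count)


def discard(data, path="/dev/null"):
    """Write bytes to a device file and return how many were accepted."""
    with open(path, "wb", buffering=0) as device:
        return device.write(data)


# Guarded so the example does nothing on systems without these devices.
if os.path.exists("/dev/zero"):
    print(read_device("/dev/zero", 4))          # a stream of zero bytes
    print(len(read_device("/dev/urandom", 8)))  # random bytes from the kernel
    print(discard(b"ignored"))                  # /dev/null accepts and drops data
```

No device-specific call is needed: `open`, `read`, and `write` are the same functions used for regular files.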
The Linux directory structure is tree-shaped (branching). It’s important to highlight a characteristic specific to Unix-like systems: these OSes aim for simplicity and treat every object as a sequence of bytes. In Unix, these sequences are represented as files.
Unlike Windows OS, which has multiple roots, the Linux file system allows only one root. The root directory is where all other directories and OS files reside (denoted by a forward slash /).
The entire Linux folder structure is represented in a single directory called the root directory.
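The single-root property means that walking upward from any absolute path always ends at `/`. A minimal sketch using the standard `pathlib` module (the example path is arbitrary and does not need to exist):

```python
from pathlib import PurePosixPath


def ancestors(path):
    """List every directory above `path`, ending at the root."""
    return [str(parent) for parent in PurePosixPath(path).parents]


print(ancestors("/usr/share/doc/readme.txt"))
# ['/usr/share/doc', '/usr/share', '/usr', '/']
```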
/home
/bin and /sbin
/opt
/usr – unlike /bin or /sbin, which house system-level applications, /usr has evolved into a location for software and data used by users. Subdirectories of /usr include:
    /usr/bin – most binary programs
    /usr/include – header files needed for source code compilation
    /usr/sbin – directories for recurring tasks
    /usr/lib – libraries
    /usr/src – kernel source code and header files
    /usr/share – architecture-independent files (documents, icons, fonts)
/lib, /lib32, /lib64
/boot
/sys – holds the sysfs file system, serving as the kernel interface for accessing data about connected devices
/tmp
/dev
/proc
/run – uses the tmpfs file system and holds runtime files related to active processes. These files exist in RAM and disappear when the session ends.
/root
/srv
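For quick reference, the top-level directories named above can be paired with the roles the Filesystem Hierarchy Standard commonly assigns to them. The descriptions in this sketch are short paraphrases of the FHS rather than the wording of this article, and the names `FHS_ROLES` and `present_directories` are illustrative.

```python
import os

# Paraphrased FHS roles; treat these one-line summaries as a rough guide.
FHS_ROLES = {
    "/home": "users' home directories",
    "/bin": "essential user command binaries",
    "/sbin": "essential system administration binaries",
    "/opt": "add-on application software packages",
    "/usr": "shareable, mostly read-only programs and data",
    "/lib": "essential shared libraries and kernel modules",
    "/boot": "static files of the boot loader",
    "/sys": "sysfs view of devices and kernel objects",
    "/tmp": "temporary files",
    "/dev": "device files",
    "/proc": "virtual file system with process and kernel information",
    "/run": "runtime data of running processes",
    "/root": "home directory of the root user",
    "/srv": "data for services provided by the system",
}


def present_directories(roles=FHS_ROLES):
    """Return only those entries whose directory exists on this machine."""
    return {path: role for path, role in roles.items() if os.path.isdir(path)}


for path, role in present_directories().items():
    print(f"{path:8} {role}")
```

Not every distribution has every directory, which is why the sketch checks for existence before printing.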
Linux directories map file names to the locations of the files' data on the physical disk. Linux directories have a predefined size to store metadata.
Files in directories are identified by inodes (index nodes). An inode stores the file's attributes and the addresses of its disk blocks.
Every file and directory in Linux has an inode, and the inode itself holds a list of pointers referencing disk blocks.
A directory in the file system is itself an inode whose data lists the names of all the files and subdirectories it contains.
Another note about inodes: each inode is unique, but several names can point to the same inode. This is why an inode keeps a count of its hard links.
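The mapping from names to inodes can be inspected directly. The following sketch lists a directory's entries with their inode numbers; `names_to_inodes` and the file names are made up for the example.

```python
import os
import tempfile


def names_to_inodes(directory):
    """Map each name in `directory` to the inode number it points to."""
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}


with tempfile.TemporaryDirectory() as workdir:
    first = os.path.join(workdir, "first.txt")
    open(first, "w").close()
    os.link(first, os.path.join(workdir, "second.txt"))  # same inode, new name
    open(os.path.join(workdir, "other.txt"), "w").close()

    table = names_to_inodes(workdir)
    for name, inode in sorted(table.items()):
        print(name, inode)
    print("shared inode:", table["first.txt"] == table["second.txt"])
```

Two names that are hard links to the same file share one inode number, while an unrelated file gets its own.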
The architecture of Linux consists of the hardware layer, the kernel, system libraries, and system utilities.
At the top is user space, where user applications run. Below this is the kernel space, where the OS kernel resides.
There is also a specific library collection called the GNU C Library (glibc). This library provides the system call interface that bridges the kernel and user applications. Both user applications and the kernel operate in their own protected address spaces. Each user process has its own virtual address space, while the kernel has a single address space.
The kernel structure includes three main levels:
Linux architecture is examined from various perspectives. A key goal of architectural decomposition is to enhance understanding.
The kernel performs several functions:
The kernel is invisible to the user and operates in its own realm (kernel space). What users see (browsers, files) exists in user space. These applications interact with the kernel through the System Call Interface.
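From user space, system calls are usually reached through thin library wrappers. As a rough illustration, the functions in Python's `os` module used below correspond closely to the underlying calls (`getpid`, `pipe`, `write`, `read`, `close`); the helper name `roundtrip_through_pipe` is made up for the example.

```python
import os


def roundtrip_through_pipe(payload):
    """Send bytes through a kernel pipe and read them back."""
    read_fd, write_fd = os.pipe()        # the kernel creates the pipe
    try:
        os.write(write_fd, payload)      # user space hands data to the kernel
    finally:
        os.close(write_fd)
    try:
        return os.read(read_fd, len(payload))  # the kernel returns the data
    finally:
        os.close(read_fd)


print("process id reported by the kernel:", os.getpid())
print(roundtrip_through_pipe(b"ping"))
```

The program never touches the pipe's buffer directly; every step is a request to the kernel made through the system call interface.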
OS code executes on CPUs in two modes: kernel mode and user mode. Kernel mode has unrestricted hardware access, while user mode has restricted access to the CPU and memory and reaches kernel services only through the System Call Interface (SCI). This division also applies to memory (kernel space vs. user space) and enables complex operations like privilege separation and virtual machine creation.
A Linux distribution is a collection of applications (typically open-source) built on top of the OS kernel. A distribution may include server software, admin tools, documentation, and various desktop applications.
It aims to offer a consistent interface, safe and simple software management, and often a specific operational purpose.
Linux is freely distributed and accessible through multiple means. It is used by individuals and organizations and is often combined with free or proprietary software.
A distribution typically includes all software needed for installation and use.
Popular Linux distributions include:
These distributions can be used by beginners and system administrators. For example, Ubuntu is suitable for novices due to its user-friendly interface. Arch Linux is more suited to professionals, offering fewer pre-installed packages.