Manipulating file system paths in Python 3 is handled by several libraries, one of which is pathlib
. In this article, we'll show you how to use the pathlib
module and solve practical tasks with it.
Before diving into the practical use of the pathlib
library, we need to refresh some foundational concepts about the file system and its terminology.
Let's consider a file example.exe
located in C:\Program Files
(Windows file system). This file has four main characteristics from the perspective of the file system:
Path: This is the identifier of the file, determining its location in the file system based on a sequence of directory names. Paths can be divided into two types:
Absolute path: A full path that starts from the root directory, e.g., C:\Program Files\example.exe
.
Relative path: A path relative to another directory, e.g., Program Files\example.exe
.
Directory (C:\Program Files
): This is an object in the file system used to organize files hierarchically. It's often referred to as a directory or folder. The term "folder" perfectly describes this object’s practical use — for convenient storage and quick retrieval when needed.
Extension (.exe
): This indicates the file type. Extensions help users or programs determine the type of data stored in the file (video, text, music, etc.).
Let's import the pathlib
module to manipulate file system paths:
from pathlib import *
The classes in this module can be divided into two types: Pure
(or "abstract") and Concrete
. The Pure
classes are used for abstract computational work with paths, where no actual interaction with the OS file system occurs. The Concrete
classes, on the other hand, allow for direct interaction with the file system (e.g., creating and deleting directories, reading files, etc.).
If you're unsure which class to use, the Path
class is likely the right choice. It automatically adapts to the OS and allows interaction with the file system without restrictions. In this article, we'll cover all classes but focus specifically on Path
.
Let's start small by creating a variable of type Path
:
>>> PathExample = Path('Hostman', 'Cloud', 'Pathlib')
>>> PathExample
WindowsPath('Hostman/Cloud/Pathlib')
As we can see, the module automatically adapts to the operating system — in this case, Windows 10. The constructor Path(args)
creates a new Path
instance, accepting directories, files, or other paths as arguments.
The PathExample
is a relative path. We need to add the root directory to make it an absolute path.
Using the Path.home()
and Path.cwd()
methods, we can get the current user's home directory and the current working directory:
>>> Path.home()
WindowsPath('C:/Users/Blog')
>>> Path.cwd()
WindowsPath('C:/Users/Blog/AppData/Local/Programs/Python/Python310')
Let's make PathExample
an absolute path and explore other aspects of working with pathlib
:
>>> PathExample = Path.home() / PathExample
>>> PathExample
WindowsPath('C:/Users/Blog/Hostman/Cloud/Pathlib')
Using the /
operator, we can concatenate paths to create new ones.
The Python pathlib
module provides various methods and properties to retrieve information about file paths. For illustration, let's create a new variable AttributeExample
and append a file to the path:
>>> AttributeExample = PathExample / 'file.txt'
>>> AttributeExample
WindowsPath('C:/Users/Blog/Hostman/Cloud/Pathlib/file.txt')
To retrieve the drive letter or name, use the .drive
property:
>>> AttributeExample.drive
'C:'
You can get parent directories using two properties: .parent
and .parents[n]
.
The .parent
property returns the immediate parent directory:
>>> AttributeExample.parent
WindowsPath('C:/Users/Blog/Hostman/Cloud/Pathlib')
To get higher-level parent directories, you can either call .parent
multiple times:
>>> AttributeExample.parent.parent
WindowsPath('C:/Users/Blog/Hostman/Cloud')
Or use .parents[n]
to retrieve the nth ancestor:
>>> AttributeExample.parents[3]
WindowsPath('C:/Users/Blog')
To get the file name, use the .name
property:
>>> AttributeExample.name
'file.txt'
To get the file extension, use the .suffix
property (or .suffixes
for multiple extensions, such as .tar.gz
):
>>> AttributeExample.suffix
'.txt'
>>> Path('file.tar.gz').suffixes
['.tar', '.gz']
To check if the path is absolute, use the .is_absolute()
method:
>>> AttributeExample.is_absolute()
True
To split the path into its individual components, use the .parts
property:
>>> AttributeExample.parts
('C:\\', 'Users', 'Blog', 'Hostman', 'Cloud', 'Pathlib', 'file.txt')
You can compare paths using both comparison operators and various methods.
You can check if two paths are the same:
>>> Path('hostman') == Path('HOSTMAN')
True
Note: On UNIX-based systems, case sensitivity matters, so the result would be False:
>>> PurePosixPath('hostman') == PurePosixPath('HOSTMAN')
False
This is because Windows file systems are case-insensitive, unlike UNIX-based systems.
You can check if one path is part of another using .is_relative_to()
:
>>> CompareExample = AttributeExample
>>> CompareExample.is_relative_to('C:')
True
>>> CompareExample.is_relative_to('D:/')
False
You can also use patterns for matching with the .match()
method:
>>> CompareExample.match('*.txt')
True
To create directories using the pathlib
module, you can use the .mkdir(parents=True/False, exist_ok=True/False)
method. This method takes two boolean parameters in addition to the path where the folder is to be created:
parents
: If True, it creates any necessary parent directories. If False, it raises an error if they don't exist.
exist_ok
: Determines whether an error should be raised if the directory already exists.
Let's create a folder CreateExample
, but first, check if such a directory already exists using the .is_dir()
method:
>>> CreateExample = CompareExample
>>> CreateExample.is_dir()
False
Now, let's create the folder:
>>> CreateExample.mkdir(parents=True, exist_ok=True)
>>> CreateExample.is_dir()
True
If you check the result in Windows Explorer, you will see that a directory (not a file) was created. To create an empty file, use the .touch()
method. But first, let's remove the file.txt
directory using the .rmdir()
method:
>>> CreateExample.rmdir()
>>> CreateExample.touch()
>>> CreateExample.is_file()
True
To delete files, use the .unlink()
method.
Let's create a more complex directory structure based on the existing folder:
>>> SearchingExample = CreateExample
>>> Hosting = Path(SearchingExample.parents[2], 'hosting/host.txt')
>>> Hosting.parent.mkdir(parents=True, exist_ok=True)
>>> Hosting.touch()
>>> Docker = Path(SearchingExample.parents[1], 'Docker/desk.txt')
>>> Docker.parent.mkdir(parents=True, exist_ok=True)
>>> Docker.touch()
We now have the following structure (starting from C:\Users\Blog\Hostman
):
Cloud
|— Pathlib
| `— file.txt
`— Docker
`— desk.txt
Hosting
`— host.txt
To search for files using pathlib
, you can use a for loop along with the .glob()
method:
>>> for file_cloud in SearchingExample.parents[2].glob('*.txt'):
print(file_cloud)
This code doesn't find anything because it doesn't search in subfolders. To search recursively in subdirectories, modify it as follows:
>>> for file_cloud in SearchingExample.parents[2].glob('**/*.txt'):
print(file_cloud)
...
C:\Users\Blog\Hostman\Cloud\Docker\desk.txt
C:\Users\Blog\Hostman\Cloud\Pathlib\file.txt
C:\Users\Blog\Hostman\hosting\host.txt
You can perform both reading and writing operations in text or binary mode. We'll focus on text mode. The pathlib
module provides four methods:
Reading: .read_text() and .read_bytes()
Writing: .write_text() and .write_bytes()
Let's write some important information to a file, for example, "Hostman offers really cool cloud servers!". This is definitely something worth saving:
>>> WRExample = SearchingExample
>>> WRExample.is_file()
True
>>> WRExample.write_text('Hostman offers really cool cloud servers!')
55 # Length of the message
>>> WRExample.read_text()
'Hostman offers really cool cloud servers!'
In conclusion, the pathlib
module is an essential tool for efficient file system management in Python. It allows developers to easily create, manipulate, and query file paths, regardless of the operating system. pathlib
simplifies tasks such as file creation, deletion, and searching by supporting both absolute and relative paths and providing detailed metadata access. Its flexibility and ease of use make it a preferred choice for modern Python development, allowing for cleaner, more maintainable code. Embracing pathlib
can significantly enhance productivity in any file-related project.