Sign In
Sign In

Working with the pathlib Module in Python 3

Working with the pathlib Module in Python 3
Hostman Team
Technical writer
Python
23.10.2024
Reading time: 8 min

Manipulating file system paths in Python 3 is handled by several libraries, one of which is pathlib. In this article, we'll show you how to use the pathlib module and solve practical tasks with it.

File System Basics

Before diving into the practical use of the pathlib library, we need to refresh some foundational concepts about the file system and its terminology.

Let's consider a file example.exe located in C:\Program Files (Windows file system). This file has four main characteristics from the perspective of the file system:

  • Path: This is the identifier of the file, determining its location in the file system based on a sequence of directory names. Paths can be divided into two types:

    • Absolute path: A full path that starts from the root directory, e.g., C:\Program Files\example.exe.

    • Relative path: A path relative to another directory, e.g., Program Files\example.exe.

  • Directory (C:\Program Files): This is an object in the file system used to organize files hierarchically. It's often referred to as a directory or folder. The term "folder" perfectly describes this object’s practical use — for convenient storage and quick retrieval when needed.

  • Extension (.exe): This indicates the file type. Extensions help users or programs determine the type of data stored in the file (video, text, music, etc.).

Path Instances

Let's import the pathlib module to manipulate file system paths:

from pathlib import *

The classes in this module can be divided into two types: Pure (or "abstract") and Concrete. The Pure classes are used for abstract computational work with paths, where no actual interaction with the OS file system occurs. The Concrete classes, on the other hand, allow for direct interaction with the file system (e.g., creating and deleting directories, reading files, etc.).

If you're unsure which class to use, the Path class is likely the right choice. It automatically adapts to the OS and allows interaction with the file system without restrictions. In this article, we'll cover all classes but focus specifically on Path.

Let's start small by creating a variable of type Path:

>>> PathExample = Path('Hostman', 'Cloud', 'Pathlib')
>>> PathExample
WindowsPath('Hostman/Cloud/Pathlib')

As we can see, the module automatically adapts to the operating system — in this case, Windows 10. The constructor Path(args) creates a new Path instance, accepting directories, files, or other paths as arguments.

The PathExample is a relative path. We need to add the root directory to make it an absolute path.

Using the Path.home() and Path.cwd() methods, we can get the current user's home directory and the current working directory:

>>> Path.home()
WindowsPath('C:/Users/Blog')
>>> Path.cwd()
WindowsPath('C:/Users/Blog/AppData/Local/Programs/Python/Python310')

Let's make PathExample an absolute path and explore other aspects of working with pathlib:

>>> PathExample = Path.home() / PathExample
>>> PathExample
WindowsPath('C:/Users/Blog/Hostman/Cloud/Pathlib')

Using the / operator, we can concatenate paths to create new ones.

Path Attributes

The Python pathlib module provides various methods and properties to retrieve information about file paths. For illustration, let's create a new variable AttributeExample and append a file to the path:

>>> AttributeExample = PathExample / 'file.txt'
>>> AttributeExample
WindowsPath('C:/Users/Blog/Hostman/Cloud/Pathlib/file.txt')

Disk

To retrieve the drive letter or name, use the .drive property:

>>> AttributeExample.drive
'C:'

Parent Directories

You can get parent directories using two properties: .parent and .parents[n].

The .parent property returns the immediate parent directory:

>>> AttributeExample.parent
WindowsPath('C:/Users/Blog/Hostman/Cloud/Pathlib')

To get higher-level parent directories, you can either call .parent multiple times:

>>> AttributeExample.parent.parent
WindowsPath('C:/Users/Blog/Hostman/Cloud')

Or use .parents[n] to retrieve the nth ancestor:

>>> AttributeExample.parents[3]
WindowsPath('C:/Users/Blog')

Name

To get the file name, use the .name property:

>>> AttributeExample.name
'file.txt'

Extension

To get the file extension, use the .suffix property (or .suffixes for multiple extensions, such as .tar.gz):

>>> AttributeExample.suffix
'.txt'
>>> Path('file.tar.gz').suffixes
['.tar', '.gz']

Absolute or Relative Path

To check if the path is absolute, use the .is_absolute() method:

>>> AttributeExample.is_absolute()
True

Path Components

To split the path into its individual components, use the .parts property:

>>> AttributeExample.parts
('C:\\', 'Users', 'Blog', 'Hostman', 'Cloud', 'Pathlib', 'file.txt')

Paths Comparison

You can compare paths using both comparison operators and various methods.

Comparison Operators

You can check if two paths are the same:

>>> Path('hostman') == Path('HOSTMAN')
True

Note: On UNIX-based systems, case sensitivity matters, so the result would be False:

>>> PurePosixPath('hostman') == PurePosixPath('HOSTMAN')
False

This is because Windows file systems are case-insensitive, unlike UNIX-based systems.

Comparison Methods

You can check if one path is part of another using .is_relative_to():

>>> CompareExample = AttributeExample
>>> CompareExample.is_relative_to('C:')
True

>>> CompareExample.is_relative_to('D:/')
False

You can also use patterns for matching with the .match() method:

>>> CompareExample.match('*.txt')
True

Creating and Deleting Folders and Files

To create directories using the pathlib module, you can use the .mkdir(parents=True/False, exist_ok=True/False) method. This method takes two boolean parameters in addition to the path where the folder is to be created:

  • parents: If True, it creates any necessary parent directories. If False, it raises an error if they don't exist.

  • exist_ok: Determines whether an error should be raised if the directory already exists.

Let's create a folder CreateExample, but first, check if such a directory already exists using the .is_dir() method:

>>> CreateExample = CompareExample
>>> CreateExample.is_dir()
False

Now, let's create the folder:

>>> CreateExample.mkdir(parents=True, exist_ok=True)
>>> CreateExample.is_dir()
True

If you check the result in Windows Explorer, you will see that a directory (not a file) was created. To create an empty file, use the .touch() method. But first, let's remove the file.txt directory using the .rmdir() method:

>>> CreateExample.rmdir()
>>> CreateExample.touch()
>>> CreateExample.is_file()
True

To delete files, use the .unlink() method.

Searching for Files

Let's create a more complex directory structure based on the existing folder:

>>> SearchingExample = CreateExample

>>> Hosting = Path(SearchingExample.parents[2], 'hosting/host.txt')
>>> Hosting.parent.mkdir(parents=True, exist_ok=True)
>>> Hosting.touch()

>>> Docker = Path(SearchingExample.parents[1], 'Docker/desk.txt')
>>> Docker.parent.mkdir(parents=True, exist_ok=True)
>>> Docker.touch()

We now have the following structure (starting from C:\Users\Blog\Hostman):

Cloud
|— Pathlib
|    `— file.txt
`— Docker
        `— desk.txt
Hosting
`— host.txt

To search for files using pathlib, you can use a for loop along with the .glob() method:

>>> for file_cloud in SearchingExample.parents[2].glob('*.txt'):
   print(file_cloud)

This code doesn't find anything because it doesn't search in subfolders. To search recursively in subdirectories, modify it as follows:

>>> for file_cloud in SearchingExample.parents[2].glob('**/*.txt'):
    print(file_cloud)
...
C:\Users\Blog\Hostman\Cloud\Docker\desk.txt
C:\Users\Blog\Hostman\Cloud\Pathlib\file.txt
C:\Users\Blog\Hostman\hosting\host.txt

Reading and Writing to Files

You can perform both reading and writing operations in text or binary mode. We'll focus on text mode. The pathlib module provides four methods:

  • Reading: .read_text() and .read_bytes()

  • Writing: .write_text() and .write_bytes()

Let's write some important information to a file, for example, "Hostman offers really cool cloud servers!". This is definitely something worth saving:

>>> WRExample = SearchingExample

>>> WRExample.is_file()
True
>>> WRExample.write_text('Hostman offers really cool cloud servers!')
55  # Length of the message
>>> WRExample.read_text()
'Hostman offers really cool cloud servers!'

Conclusion

In conclusion, the pathlib module is an essential tool for efficient file system management in Python. It allows developers to easily create, manipulate, and query file paths, regardless of the operating system. pathlib simplifies tasks such as file creation, deletion, and searching by supporting both absolute and relative paths and providing detailed metadata access. Its flexibility and ease of use make it a preferred choice for modern Python development, allowing for cleaner, more maintainable code. Embracing pathlib can significantly enhance productivity in any file-related project.

Python
23.10.2024
Reading time: 8 min

Do you have questions,
comments, or concerns?

Our professionals are available to assist you at any moment,
whether you need help or are just unsure of where to start
Email us