
How to Safely Store Passwords Using PostgreSQL
Hostman Team
Technical writer
PostgreSQL
16.12.2024
Reading time: 11 min

PostgreSQL is a free, open-source object-relational database. Object-relational databases differ from purely relational ones: data is still stored in linked tables, but the system can also handle richer, user-defined data types. PostgreSQL complies with the ACID requirements (Atomicity, Consistency, Isolation, and Durability): operations on tables are applied atomically and consistently, and changes are written sequentially, so failures during writes are detected immediately.

PostgreSQL supports Multi-Version Concurrency Control (MVCC), a database feature that creates copies of records during modifications, protecting against loss and conflicts during simultaneous read or write operations.

The indexing system in PostgreSQL is more flexible than in many other databases, using tree structures and several index types, such as partial, hash, and expression indexes.

The syntax of PostgreSQL is similar to MySQL's, but PostgreSQL additionally supports constructs such as LIMIT and ALL inside subqueries, which MySQL historically does not allow.

Furthermore, PostgreSQL is compatible with a large number of programming languages. The most common ones are:

  • C/C++
  • Delphi
  • Erlang
  • Go
  • Java
  • Javascript
  • JSON (a data format rather than a language; natively supported since version 9.2)
  • Lisp
  • .NET
  • Python
  • R
  • Tcl

Projects involving user registration and authentication must store passwords on the server side, usually in protected (hashed) form rather than as plain text. Databases are the most suitable tool for this.

In this article, we will discuss how to properly store passwords in a database (writing and reading them) using PostgreSQL while ensuring all security measures are followed.

Why Protect Credentials

Before discussing storing passwords in databases, we should note that no serious project stores credentials in plaintext, that is, unencrypted, human-readable text. Credentials are always stored in protected form. Always.

Here are some brief reasons why:

  • Developer Breach

The developer's servers, or those of the application host, can be hacked, for example through an SQL injection that retrieves password strings from the database. Unencrypted data immediately falls into the attackers' hands and is compromised; the consequences after that are anyone's guess.

  • User Carelessness

Users almost always ignore security recommendations: they don't use password managers and reuse the same password across services. A password leak in one application can then lead to the compromise of accounts in other services.

  • Reputation and Trust

Users will easily accuse the service provider (the application developer) of dishonesty if they can read passwords. This can happen even if no illegal actions were taken by employees. It would be a blow to the reputation of the company or project.

Password Hashing

First, it's important to note that when dealing with passwords, what is done is not encryption but rather hashing.

It’s crucial to understand that if something is encrypted, it can always be decrypted. Encrypted information is the same information, just represented differently.

However, hashing works differently. A hash is completely new, unique information derived from some input data — in our case, from a password.

The key point is that retrieving the original data from a hash is practically impossible (theoretically possible, but computationally infeasible). In short, hashing is a one-way operation.
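As a quick illustration of both properties, here is a minimal sketch using Python's standard hashlib (not PostgreSQL itself): the same input always yields the same digest, while even a one-character change produces a completely different one, and nothing in the digest lets you walk back to the input.

```python
import hashlib

# Hashing is deterministic: the same input always yields the same digest.
digest = hashlib.sha256("password".encode("utf-8")).hexdigest()
print(digest)
# → 5e884898da28047151d0e56f8dc6292773603d0d6aabbdd62a11ef721d1542d8

# A different input, even by one character, produces an unrelated digest.
digest2 = hashlib.sha256("Password".encode("utf-8")).hexdigest()
print(digest == digest2)  # False
```

There is no inverse function: the only way to get from the digest back to "password" is to guess inputs and re-hash them, which is exactly what the brute-force attacks described below do.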

Here are some other less obvious drawbacks of encryption that have made hashing the preferred method over time:

  • Unlike a hash, encrypted output has a variable length that grows with the input, which isn't ideal for storage within a database or for sending server (or client) packets.

  • Generating an encryption key takes more computational time than generating a hash.

  • When using encryption, key management becomes necessary. This means the keys must be stored somewhere and must be kept secure to prevent unauthorized access.

What Does a Hash Look Like?

A hash is essentially a string of random characters — a set of symbols that has no inherent meaning. The algorithm that generates such a string is called a hash function.

Currently, hashing can realistically be attacked only by brute force. This method is quite crude and works only on initially weak passwords. Attackers take a long list of common passwords from a dictionary, hash each candidate, and compare the result with the stolen hash (or submit the candidate in authentication attempts against the targeted server). This continues until a match is found. So, no magic is involved.
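A toy version of such a dictionary attack can be sketched in a few lines. This is deliberately simplified, using unsalted SHA-256; real pgcrypto hashes also embed a salt and algorithm parameters, which is exactly what the next sections address.

```python
import hashlib

# Pretend this hash leaked from a database (the user chose a weak password).
stolen_hash = hashlib.sha256(b"qwerty123").hexdigest()

# A (tiny) dictionary of commonly used passwords.
dictionary = ["123456", "password", "qwerty123", "letmein"]

def dictionary_attack(target_hash, wordlist):
    # Hash each candidate and compare it with the stolen hash.
    for candidate in wordlist:
        if hashlib.sha256(candidate.encode("utf-8")).hexdigest() == target_hash:
            return candidate
    return None  # no candidate matched

print(dictionary_attack(stolen_hash, dictionary))  # qwerty123
```

A strong, uncommon password simply never appears in the wordlist, which is why brute force "only works on initially weak passwords."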

At the same time, hashing itself is a computationally complex task. Deriving information from the original data (the password) takes time. Some hash functions generate longer keys (for example, through repeated hashing) specifically to increase the time needed for generation. In this case, a dictionary brute-force attack takes more time, giving the security team or the user an advantage.
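The "repeated hashing" idea can be demonstrated with the standard library's PBKDF2 implementation. This is not what pgcrypto uses internally; it is just a well-known key-stretching construction that shows the same principle as gen_salt()'s iteration count: more iterations mean each attacker guess costs more CPU time.

```python
import hashlib

# Key stretching: repeat the hashing work many times so that each
# brute-force guess becomes proportionally more expensive.
fast = hashlib.pbkdf2_hmac("sha256", b"password", b"fixed-salt", 1)
slow = hashlib.pbkdf2_hmac("sha256", b"password", b"fixed-salt", 600_000)

print(len(slow))     # 32 — the derived key length in bytes stays fixed
print(fast != slow)  # True — the iteration count is part of the computation
```

Computing `slow` takes a noticeable fraction of a second on typical hardware, while `fast` is nearly instant; multiplied across millions of dictionary candidates, that difference is the defender's advantage.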

Hashing in PostgreSQL

Built-in Extension pgcrypto

PostgreSQL has a built-in extension specifically designed for password hashing, so it does not need to be loaded separately. To activate it, you need to run the following command:

CREATE EXTENSION pgcrypto;

This command will load the available extension into your current database. Essentially, this will execute the extension script that adds new SQL objects, such as functions, data types, operators, and indexing methods.

Adding Salt with gen_salt()

To make the hash even more secure, a salt is added during the hashing operation.

The issue is that a hash function always generates the same value for specific input data. This characteristic leads to several problems:

  • Two identical passwords for two different users will have the same hash, whereas it would be better if they were different.

  • Hackers use so-called rainbow tables — pre-hashed dictionaries with commonly used passwords to avoid unnecessary hash calculations during a brute-force attack.

The solution is simple: use not only the password but also an additional piece of text — the salt — as input during hashing.

A salt is a pseudorandom string that ensures the uniqueness of the resulting hash.
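The effect of a salt is easy to see in a short sketch (again with Python's hashlib standing in for the database's hash function): two users with the same password get different hashes, yet re-hashing with a stored salt still reproduces the original value.

```python
import hashlib
import os

def salted_hash(password, salt=None):
    # Generate a fresh pseudorandom salt if none is supplied.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.sha256(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

# Two users with the same password get different hashes thanks to the salt.
salt_a, hash_a = salted_hash("some_password")
salt_b, hash_b = salted_hash("some_password")
print(hash_a != hash_b)  # True

# Re-hashing with the stored salt reproduces the original hash.
print(salted_hash("some_password", salt_a)[1] == hash_a)  # True
```

The salt also defeats rainbow tables: a precomputed table would have to cover every possible salt value, which is infeasible.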

PostgreSQL has a gen_salt() function that allows you to pass the type of cryptographic algorithm as an argument:

  • md5 (MD5)
  • des (DES)
  • xdes (Extended DES)
  • bf (Blowfish)

For example, here’s how you can get a salt using the fairly popular MD5:

SELECT gen_salt('md5');

Note, however, that MD5 is considered cryptographically broken, and many developers no longer recommend it; for new systems, bf (Blowfish) is the usual choice.

Now that we've covered salt, let's look at the options for hashing itself.

Password Hashing with the crypt() Function

Whenever a user creates a new password or changes an existing one, PostgreSQL needs to store its hash.

The generation itself is done using the built-in crypt() function. It has two arguments:

  • the password string
  • the salt string

Thus, the process of generating the hash along with the salt looks as follows:

SELECT crypt('password', gen_salt('md5'));

By the way, if for any reason you don't want the hash to be unique, you can simply pass a constant value as the second argument:

SELECT crypt('password', 'nothing');

Verifying a Password Against a Previously Created Hash

Interestingly, verifying a password is done using the same hashing function. The only difference is the arguments.

For example, to check if the password "password" matches its hash, you would run:

SELECT crypt('password', hash);

Notice that the salt is not specified here; only the password and the stored hash are passed. This works because crypt() embeds the salt inside the hash string it produces, so it can recover the salt from the second argument. The output of this command will be the same hash that was created earlier and provided as the second argument.

However, if the password is incorrect, the hash will differ:

SELECT crypt('another_password', hash);

To reiterate, calling crypt with the password "password" and the hash of this password (hash) will return the same hash as hash. In any other case, the output will differ.
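The verification logic can be sketched outside the database as well. Here the salt is stored alongside the digest in an assumed "salt$digest" string (real crypt() output uses its own format that also encodes the algorithm); verification extracts the salt, re-hashes the candidate password, and compares.

```python
import hashlib
import os

def make_hash(password):
    # Store the salt together with the digest, separated by "$".
    salt = os.urandom(8).hex()
    digest = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return f"{salt}${digest}"

def verify(password, stored):
    # Extract the salt from the stored value, re-hash, and compare.
    salt, digest = stored.split("$")
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest() == digest

stored = make_hash("password")
print(verify("password", stored))          # True
print(verify("another_password", stored))  # False
```

This is exactly why crypt(entered_password, stored_hash) needs no separate salt argument: the stored hash carries everything required to repeat the computation.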

How to Use PostgreSQL Hashing in Practice?

Creating a Table for Passwords

In a real project, credentials are stored in tables and are read or written as needed.
Therefore, we will create a table accounts with three columns: identifier, login, and password hash:

CREATE TABLE accounts (identifier SERIAL, login VARCHAR(10), password VARCHAR(100));

Next, let's populate the table with some example credentials:

INSERT INTO accounts (login, password)
VALUES ('login_1', crypt('some_password', gen_salt('md5')));

This is roughly how you can store a password in a PostgreSQL database. The login field is also specified, which is typically the user's email or phone number.

Updating a Password in the Table

Whenever a user changes their password, the data in the table needs to be updated. In the most basic case, the query to update the hash of the new password looks as follows:

UPDATE accounts
SET password = crypt('new_password', gen_salt('md5'))
WHERE login = 'login_1';

Verifying the Entered Password Against the Saved One

During user authentication, the user's credentials are retrieved from the database, and the password hashes are compared:

SELECT (password = crypt(entered_password, password)) 
AS password_match
FROM accounts
WHERE login = 'login_1';

If password_match equals t (true), the passwords match. If it equals f (false), the passwords are different.

By the way, the gen_salt function has an additional argument — the number of iterations. This works only with the xdes and bf algorithms:

  • The number of iterations for xdes can be any odd number between 1 and 16777215. The default is 725.

  • The number of iterations for bf can be any integer between 4 and 31. The default is 6.

For example, here’s how to set the number of iterations for Extended DES:

SELECT crypt('password', gen_salt('xdes', 963));

Of course, the number of iterations affects the hashing time and increases password security. Choosing the right number is a balance between speed and security: a high value may reduce convenience for both users and developers, while a low value reduces resistance to brute-force attacks. On average, server hardware is expected to compute between 15 and 100 hashes per second.

Hashing on the Client-Side or Server-Side Application

One approach to avoid sending the password from the client to the server in plain text is to hash it on the application (client) side. This complicates the client-server communication mechanism and is used only in certain cases. Most web resources instead rely on HTTPS, which allows transmitting sensitive data in "plain" form over an encrypted connection.

Another option is to hash the password on the server-side application level (not on the client-side), rather than in the database itself. In this case, the pre-hashed password is placed in the database table, just like any other value.

Many popular programming languages provide ready-made modules or libraries designed for password hashing. These are not just hash functions but higher-level constructs that allow you to work directly with password hashes and authentication.

One such module is the bcrypt package based on the Blowfish algorithm. The interface may vary from language to language, but the functionality remains the same.

Here is a simple example of using bcrypt in Python:

import bcrypt

# Generate a hash before sending it to the DB
def generate_hash(password):
    # Convert string to bytes
    password_bytes = password.encode("utf-8")

    # Generate salt
    password_salt = bcrypt.gensalt()

    # Generate hash
    hash_bytes = bcrypt.hashpw(password_bytes, password_salt)

    # Convert bytes back to string
    hash_str = hash_bytes.decode("utf-8")

    return hash_str

# Authenticate by checking the password against the hash
def authenticate(password, stored_hash):
    # Convert both the password and the stored hash from string to bytes
    password_bytes = password.encode("utf-8")
    hash_bytes = stored_hash.encode("utf-8")

    # bcrypt embeds the salt in the hash string itself, so the salt does not
    # need to be supplied separately during verification
    result = bcrypt.checkpw(password_bytes, hash_bytes)

    return result

Subsequently, API function calls are added to either send the generated hash to the database or read it from the database during the authentication process.
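A minimal sketch of that flow is shown below, with sqlite3 standing in for a real PostgreSQL driver such as psycopg2 (an assumption made so the example is self-contained); the hash helpers mirror the bcrypt functions above but use only the standard library.

```python
import hashlib
import os
import sqlite3

# Standard-library stand-ins for the bcrypt helpers above.
def generate_hash(password):
    salt = os.urandom(8).hex()
    digest = hashlib.sha256((salt + password).encode("utf-8")).hexdigest()
    return f"{salt}${digest}"

def authenticate(password, stored):
    salt, digest = stored.split("$")
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest() == digest

# sqlite3 stands in here for the production database driver.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (login TEXT, password TEXT)")

# Registration: hash on the application side, store only the hash.
conn.execute("INSERT INTO accounts VALUES (?, ?)",
             ("login_1", generate_hash("some_password")))

# Login: read the stored hash back and verify the entered password.
row = conn.execute("SELECT password FROM accounts WHERE login = ?",
                   ("login_1",)).fetchone()
print(authenticate("some_password", row[0]))   # True
print(authenticate("wrong_password", row[0]))  # False
```

Note that with this approach the database only ever sees opaque hash strings; all hashing and verification happen in the application layer.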

In any case, each programming language has its own special libraries for working with hashes. Therefore, depending on which language you use, you should study the relevant documentation, standards, or open-source libraries that simplify working with passwords.

And most importantly — do not reinvent the wheel. Both PostgreSQL's built-in functions (extensions) and time-tested external library solutions are created by experienced developers who have gone through numerous iterations of bug fixes and vulnerability patches.

There is no sense in creating your own cryptographic "mashups," naively thinking that it would be a better solution. It will most likely lead to numerous internal issues and increase the chances of hacks.

Conclusion

Storing passwords on the server side places responsibility on the application's developer and system administrator, and it carries certain reputational risks in the event of failures or hacker attacks.

Authentication is a separate IT domain. Creating a reliable authentication system requires both experience and time. This is why there has been a growing trend toward "outsourcing" authentication. More and more services rely on external authentication systems, whose developers specialize mainly in security rather than business logic. This is a form of division of labor.

For example, there are protocols (standards) like OpenID and OAuth 2.0. The latter is used in the Google APIs for user authentication, so anyone can integrate sign-in with Google into their application or online service.

This is also beneficial for users, as they can log in using their familiar email, avoiding the creation of numerous credentials, which always come with the risk of being lost.

However, password-based authentication remains a reliable (albeit conservative) method. Therefore, securely storing user passwords in the database is essential to such an implementation.
