Sign In
Sign In

Managing PostgreSQL Extensions

Managing PostgreSQL Extensions
Hostman Team
Technical writer
PostgreSQL
20.12.2024
Reading time: 6 min

PostgreSQL offers a vast array of extensions designed to simplify solving complex and non-standard tasks. They allow you to enhance the capabilities of your database management system, bypass certain limitations, and streamline analysts' work.

There are two types of extensions:

  1. Bundled extensions: These come with PostgreSQL (e.g., in the contrib package).

  2. Custom extensions: These are created by users based on their specific needs. Once downloaded and installed, custom functions work just like standard ones.

Extensions enable the handling of temporal, spatial, and other data types. Remember: if you can't find a ready-made solution, you can always create one yourself.

If you're using a cloud PostgreSQL database on Hostman, you can easily install several popular extensions directly from the control panel. Simply open your database page, navigate to ConfigurationModify, and enable the extensions you need.

Installation and Management of Extensions

Since the contents of the contrib package differ for each PostgreSQL version, start by checking which functions are available in your version.

Viewing the List of Standard Extensions

To list the extensions available for installation, the default version, the version of the installed application, and a brief description of their functions, run the following command:

SELECT * FROM pg_available_extensions;

Note: Some features are only accessible with a superuser account (postgres) or an account with installation privileges. 

Installing Extensions

Let's break down the command used to install any PostgreSQL extensions:

CREATE EXTENSION IF NOT EXISTS extension_name
WITH SCHEMA schema_name
VERSION version
CASCADE;

The command includes optional but helpful parameters that you can use during installation:

  • IF NOT EXISTS: Checks if an extension with the specified name already exists.

  • WITH SCHEMA: Specifies the schema where the extension will be installed. If not provided, it will be installed in the current schema.

  • VERSION: Specifies the version to install. If not specified, the latest version will be installed.

  • CASCADE: Automatically installs all additional extensions required for proper functioning.

Important: After installation using this command, you need to make specific entries in the PostgreSQL configuration file and then restart the server.

Updating Extensions

A new version is almost always better than the old one, right? Developers refine the code, fix bugs, and introduce new features, making it important and beneficial to update extensions. To upgrade an extension to a specific version, use the following command:

ALTER EXTENSION extension_name
UPDATE TO version;

If we omit the version parameter, the latest version will be installed.

Removing Extensions

Sometimes, an extension is no longer needed, and you might want to free up memory for better use. You can remove an extension with the following command:

DROP EXTENSION IF EXISTS extension_name
CASCADE | RESTRICT;

Additional Parameters:

  • IF EXISTS: Checks whether the extension exists before attempting to remove it.

  • CASCADE: Automatically removes all objects that depend on the extension.

  • RESTRICT: Prevents removal if other objects depend on the extension.

Top Most Useful Extensions for PostgreSQL

pg_stat_statements

The pg_stat_statements extension helps identify queries that place a heavy load on the system, how often they are executed, and how long they take. This information is crucial for evaluating database performance, identifying bottlenecks, and optimizing processes. Given the large size of many databases, query execution time must be efficient. This extension provides the metrics to assess and improve performance.

Example Usage

The following command shows the SQL query (query), its total execution time in minutes (total_min), average execution time in milliseconds (avg_ms), and the number of times it was called (calls):

SELECT query, (total_exec_time / 1000 / 60) AS total_min, mean_exec_time AS avg_ms, calls
FROM pg_stat_statements
ORDER BY 1 DESC
LIMIT 10;

pgcrypto

If you’re interested in encrypting data in PostgreSQL, the pgcrypto extension is essential. It offers cryptographic functions for encrypting data, such as passwords. By default, it supports the following encryption algorithms: md5, sha1, sha224, sha256, sha384, and sha512. You can expand the list of supported algorithms by configuring the extension to use OpenSSL in its settings.

btree_gist

You need the btree_gist extension to leverage different types of PostgreSQL database indexes (B-tree and GiST). It is especially useful for databases containing spatial data, such as city or store coordinates.

  • B-tree Index: The default index type in PostgreSQL. It can index any sortable data, such as numbers and dates. B-tree is efficient and versatile but unsuitable for unsortable data.

  • GiST Index: Handles any type of data, including geospatial data.

Key Feature: In addition to the typical search operators for B-tree indexes, btree_gist also supports the PostgreSQL "not equal" operator (<>).

timescaledb

Time-series data tracks changes over time, such as application requests, sales volumes, or weather temperatures. While specialized databases like InfluxDB or ClickHouse are designed for time-series data, they may not handle other data types effectively. In such cases, the timescaledb extension for PostgreSQL offers a convenient alternative.

timescaledb enables the storage and processing of time-series data directly in PostgreSQL. To use it:

  1. Download the appropriate version from the official website.
  2. Follow the installation instructions.
  3. Add the extension with the CREATE EXTENSION command.

hstore

The hstore extension allows PostgreSQL to store key-value pairs in a single data field. This is similar to data structures found in object-oriented programming languages like Python.

With hstore, you can store grouped data without requiring additional database columns. For example, in a bookstore database, a single column could hold attributes such as the number of pages, genre, and illustration details for each book.

Example Usage:

Create a table with an hstore column:

CREATE TABLE books (
   id serial PRIMARY KEY,
   name varchar,
   attributes hstore
);

Insert data into the table:

INSERT INTO books (name, attributes) VALUES (
   'Harry Potter and the Philosopher''s Stone',
   'author => "J. K. Rowling", pages => 223, series => "Harry Potter"'
);

Query books in the "Harry Potter" series:

SELECT name, attributes->'author' AS author
FROM books
WHERE attributes->'series' = 'Harry Potter';

Result: The attributes for an individual book are displayed like this:

SELECT * FROM books WHERE attributes->'series' = 'Harry Potter';

Conclusion

PostgreSQL extensions significantly enhance database capabilities, enabling efficient handling of time-series data, encryption, indexing, key-value storage, and performance analysis. We went over a few popular tools like timescaledb, pgcrypto, and hstore; however, in reality, PostgreSQL supports many more extensions, offering solutions for a variety of use cases.

PostgreSQL
20.12.2024
Reading time: 6 min

Similar

PostgreSQL

How to Install PostgreSQL on Ubuntu

PostgreSQL is a well-known relational database management system that provides high-availability features. These are renowned for their functionalities, such as support for complex statements, views, triggers, and foreign keys. PostgreSQL is installed on Ubuntu to provide a secure as well as flexible DB infrastructure. By installing it on the Linux distribution, you can enable and deal with the vast data in an efficient and secure manner. This infrastructure is designed to deal with different workloads, from single-machine apps to large internet-facing apps with multiple concurrent users. In this tutorial, we will walk through installing and setting up PostgreSQL on Ubuntu. Installing PostgreSQL on Ubuntu Ubuntu comes with the PostgreSQL package in its default repositories. You can install it on a Linux system following the given steps. It downloads the stable package. This is important for DB analysts, administrators, and industries that require a dependable DB solution for managing a large volume of data.  Let’s move forward into each step of installing PostgreSQL with detailed explanations on Ubuntu. Step 1: Refresh Ubuntu Repository First, refresh the Ubuntu package indexes. It is useful for maintaining system stability and security: sudo apt update Step 2: PostgreSQL Installation  PostgreSQL is included in Ubuntu's default repositories, making installation straightforward. To install it with some additional tools, execute the below command: sudo apt install -y postgresql postgresql-contrib Here: postgresql: The core PostgreSQL DB server. postgresql-contrib: Additional utilities and extensions useful for DB. Step 3: Start and Enable Services It is essential to enable the PostgreSQL service after installation and ensure it starts automatically upon bootup. The installed package utilizes the systemd daemon, which deals with the DB server runtime operations.  Run the commands below to make sure the DB server automatically initializes when the server boots up. Let’s initialize the service: sudo systemctl start postgresql And enable it so that it launches automatically when the machine boots up: sudo systemctl enable postgresql Step 4: Verify Installation To verify if the installed package is operating properly, check out its status through the below: sudo systemctl status postgresql In the figure, you can see an active status. It represents the services that are correctly running. Step 5: Access Shell Finally, switch to the PostgreSQL user account and launch the CLI to interact with database operations. Here, the -i flag provides an interactive login shell: sudo -i -u postgres The above display gives access to a user account. Once you're in the user environment, launch the CLI through the given command: psql As you can see in the above screenshot, this command launchs the CLI, where users can perform multiple operations by executing commands. Step 6: Basic Database Commands The creation of a database and a new user for any application is a good practice rather than using the root. In this way, a particular user can access the shell interface and run basic statements on the particular database. Here are general commands with thorough explanations and examples: Creation of the New Database To prevent conflicts, ensure data integrity and control access privileges for different applications or users, you need to create a new database. For creation, use the given command within the shell: CREATE DATABASE hostmandb; The screenshot shows that a database has been created. Creation of the New User Create a customized user and set a strong password: CREATE USER hostman_user WITH ENCRYPTED PASSWORD 'Qwer1234'; We have created a hostman_user with the password Qwer1234. Give Privileges to the Created User You can grant hostman_user complete privileges to the hostmandb through the given command. This allows controlled accessibility to database objects, enhancing data integrity as well as security: GRANT ALL PRIVILEGES ON DATABASE hostmandb TO hostman_user; Exit Shell For closing the shell, you can utilize the \q option as below: \q The figure shows that the shell has been exited. Finally, you can exit the user environment by executing the given command: exit The output confirms that you have returned to the main system user account. Access a Database as the Particular User You can connect to the particular database from the specific user account through the -U flag: psql -U hostman_user -d hostmandb -h 127.0.0.1 -p 5432 Here: -U hostman_user: Indicates the user. -d hostmandb: Indicates the database name. -h 127.0.0.1: Indicates the host (local machine). -p 5432: Indicates the PostgreSQL default port. In the given output, you can see that the user hostman_user has accessed the hostmandb console. Listing of All Databases View all databases and check whether the hostmandb database is available or not: \l In the figure, you can see that the database hostmandb is listed. Switch/Connect to the Particular Database To switch to a specific database, such as hostmandb, within PostgreSQL, use the \c command: \c hostmandb The above output confirms that the postgres user has successfully connected to the hostmandb database. Creation of the Particular Table To create a customized table, use the below statement. In our example, we create a hostmantb table: CREATE TABLE hostmantb(    id SERIAL PRIMARY KEY,    name VARCHAR(100),    age INTEGER,    department_id INTEGER); Here: id SERIAL PRIMARY KEY: It creates an auto-incrementing primary key column with the name id. name VARCHAR(100): It creates a name column that can store up to 100 characters. age INTEGER: It creates an age column that stores integer values. department_id INTEGER: This line creates a department_id column that stores integer values. The outcome of the above screenshot confirms that the table hostmantb is successfully created with specific attributes. List All Tables in the Particular Database For listing all tables in the hostmandb database, use the \dt command: \dt It displays all tables as well as confirms that the hostmantb is listed above. Overview a Table's Structure You can view the structure of the created table through the \d command. Let’s display the structure of hostmantb table: \d hostmantb In the above figure, you can see the complete structure of hostmantb table with id, name, age, and department_id having specific types. Input Entries into a Particular Table To input entries into a hostmantb table, use the below statement. This way, we input the values Peter, 35, and 1 to the columns name, age, and department_id, respectively: INSERT INTO hostmantb(name, age, department_id) VALUES ('Peter', 35, 1); The output confirms that we have successfully input data into the hostmantb table. Select All Data from a Particular Table You can select all data from the hostmantb table through the below statement: SELECT * FROM hostmantb; Modify Data in Particular Table For modifying data in hostmantb, you can utilize the below statement. For example, set the age column to 40 for the row where the name column equals Peter: UPDATE hostmantb SET age = 40 WHERE name = 'Peter'; The output confirms that the age column is updated. Remove Data from the Particular Table In this section, you can remove data from the specific table through the DELETE statement. It deletes all entries from the hostmantb table where the value in the name column is Peter: DELETE FROM hostmantb WHERE name = 'Peter'; In the screenshot, one entry from the hostmantb table is removed. Exit CLI You can easily exit the CLI via the \q utility: \q This tutorial has given you thorough guidelines for every step involved in installing as well as setting PostgreSQL on Ubuntu.  Conclusion By installing PostgreSQL on Ubuntu, you can optimize the DB infrastructure. PostgreSQL is often chosen for its freely available nature, which allows for customization, the system's stability and security features. In addition, PostgreSQL's supporters' help and comprehensive guide make it simpler to enhance operations and troubleshoot problems.  This combination makes sure that the DB system is both powerful as well as adaptable to several needs. In this tutorial, there are a lot of fundamental DB commands to get you started with database administration. Therefore, you will be able to create databases, and users and perform basic tasks.
24 January 2025 · 7 min to read
Kubernetes

How to Deploy PostgreSQL on Kubernetes

PostgreSQL is a popular relational database management system (RDBMS) that provides high-availability features like streaming replication, logical replication, and failover solutions. Deploying PostgreSQL on Kubernetes allows organizations to build resilient systems that ensure minimal downtime and data availability. With Kubernetes StatefulSets, you can scale PostgreSQL deployment in response to demand. Kubernetes Environment Setup To get started, make sure you have the following: Kubernetes Cluster (Cloud or Local):  You can set up a Kubernetes cluster on Hostman within no time. To follow this tutorial with a local Kubernetes cluster, you can use one of these tools: k3s, minikube, microk8s, kind. Kubectl: Kubectl allows users to interact with a Kubernetes cluster. The kubectl needs a configuration YAML file which contains cluster details and is usually provided by your cloud provider.  From the Hostman control panel, you can simply download this configuration file with a click of a button as indicated in the below screenshot. To connect, you need to set KUBECONFIG environment variable accordingly. export KUBECONFIG=/absolute/path/to/file/k8s-cluster-config.yaml Helm: You need Helm CLI to install Helm charts. Helm version 3 is required. Deploy PostgreSQL Using a Helm Chart Helm is a package manager for Kubernetes just like apt for Ubuntu and Debian. Instead of manually creating multiple YAML files for Pods, Services, Persistent Volumes, Secrets, etc., the Helm chart simplifies this to a single command (e.g., helm install), streamlining the deployment process. Step 1: Add helm repository To add the Bitnami PostgreSQL Helm repo, run this command: helm repo add bitnami https://charts.bitnami.com/bitnami To sync your local Helm repository with the remote one: helm repo update Step 2: Manage Data Persistence PostgreSQL requires persistent storage to ensure that data is preserved even if a pod crashes or is rescheduled. When a Persistent Volume Claim (PVC) is combined with a Persistent Volume (PV), Kubernetes can allocate a desired chunk of storage either in disk or cloud storage. PVC requests the Kubernetes cluster for storage space. Kubernetes then looks at the available PVs and assigns one to it. Create a file named postgres-local-pv.yaml with the YAML manifest: apiVersion: v1 kind: PersistentVolume metadata: name: postgresql-local-pv spec: capacity: storage: 5Gi accessModes: - ReadWriteOnce persistentVolumeReclaimPolicy: Retain storageClassName: manual hostPath: path: /mnt/data/postgresql This manifest creates a PersistentVolume backed by a local directory (/mnt/data/postgresql) on a specific node. This means if the node goes down or becomes unavailable, the data stored in that PV will be inaccessible, which is a critical risk in production. Therefore, it’s highly recommended to use cloud-native storage solutions instead of hostPath to ensure reliability, scalability and data protection. This PV has a reclaim policy of Retain, ensuring that it is not deleted when no longer in use by a PVC. You can set storageClassName to ceph-storage, glusterfs, portworx-sc, or openebs-standard based on your needs. Create a file named postgres-local-pvc.yaml with this text: apiVersion: v1 kind: PersistentVolumeClaim metadata: name: postgresql-local-pvc spec: accessModes: - ReadWriteOnce resources: requests: storage: 5Gi storageClassName: manual The ReadWriteOnce config means the volume can be read-write by a single node at a time. You might think, replacing it with ReadWriteMany will make your application highly available. This isn’t the case. ReadWriteMany (RWX) access mode allows multiple pods to access the same PersistentVolume simultaneously, this can indeed create serious issues leading to potential race conditions, data corruption, or inconsistent state. Apply these manifests using kubectl and create new resources. kubectl apply -f postgres-local-pv.yamlkubectl apply -f postgres-local-pvc.yaml Step 3: Install PostgreSQL Helm Chart Run the following command to install the Helm chart. helm install tutorial-db bitnami/postgresql --set auth.username=bhuwan \ --set auth.password=”AeSeigh2gieshe” \ --set auth.database=k8s-tutorial \ --set auth.postgresPassword=”Ze4hahshez6dop9vaing” \ --set primary.persistence.existingClaim=postgresql-local-pvc \ --set volumePermissions.enabled=true After a couple of minutes, verify if things have worked successfully with this command: kubectl get all Step 4: Test and Connect The following command runs a temporary PostgreSQL client pod. The pod connects to the database named k8s-tutorial, using the username bhuwan and the password from the environment variable $POSTGRES_PASSWORD. export POSTGRES_PASSWORD=$(kubectl get secret --namespace default tutorial-db-postgresql -o jsonpath="{.data.password}" | base64 -d) kubectl run tutorial-db-postgresql-client --rm --tty -i --restart='Never' \ --image docker.io/bitnami/postgresql:17.2.0-debian-12-r6 \ --env="PGPASSWORD=$POSTGRES_PASSWORD" \ --command -- psql --host tutorial-db-postgresql \ -U bhuwan -d k8s-tutorial -p 5432 After the session ends, the pod will be deleted automatically due to the --rm flag. A quick reminder, if you have changed the Helm chart release name, users, or database name, adjust the above commands accordingly. Deploy Postgres on Kubernetes from scratch A StatefulSet is the best Kubernetes resource for deploying stateful applications like PostgreSQL. This way, every PostgreSQL pod gets its own stable network identities and persistent volumes. Note: you’ll be using a previously created Persistent Volume Claim (PVC) and Persistent Volume(PV). So, do some cleanup and recreate those resources. helm delete tutorial-db kubectl delete pvc postgresql-local-pvc kubectl delete pv postgresql-local-pv kubectl apply -f postgres-local-pv.yaml -f postgres-local-pvc.yaml Create a file named postgres-statefulset.yaml with the following text: apiVersion: apps/v1 kind: StatefulSet metadata: name: postgres-statefulset labels: app: postgres spec: serviceName: "postgresql-headless-svc" replicas: 1 selector: matchLabels: app: postgres template: metadata: labels: app: postgres spec: containers: - name: postgres image: postgres:17.2 envFrom: - secretRef: name: postgresql-secret ports: - containerPort: 5432 name: postgresdb volumeMounts: - name: pv-data mountPath: /var/lib/postgresql/db volumes: - name: pv-data persistentVolumeClaim: claimName: postgresql-local-pvc Before you can apply these changes, create a new Secret for handling sensitive details like passwords with kubectl. kubectl create secret generic postgresql-secret --from-literal=POSTGRES_PASSWORD=Ze4hahshez6dop9vaing kubectl apply -f postgres-statefulset.yaml If the pod gets stuck with Pending state, you can try creating a StorageClass with the following manifest. kind: StorageClass apiVersion: storage.k8s.io/v1 metadata: name: manual provisioner: kubernetes.io/no-provisioner volumeBindingMode: WaitForFirstConsumer To investigate any further issues with the pod, you can use the command: kubectl describe pod postgres-statefulset-0 This command will report any issues related to scheduling the pod to a node, mounting volumes, or resource constraints. Databases like PostgreSQL are typically accessed internally by other services or applications within the cluster, so it's better to create a Headless service for it. Create a file called postgres-service.yaml and include the following YAML manifest: apiVersion: v1 kind: Service metadata: name: postgresql-headless-svc spec: type: ClusterIP selector: app: postgres ports: - port: 5432 targetPort: 5432 clusterIP: None Finally, you can test the connection with kubectl run. kubectl run tutorial-db-postgresql-client --rm --tty -i --restart='Never' \ --image docker.io/bitnami/postgresql:17.2.0-debian-12-r6 \ --env="PGPASSWORD=Ze4hahshez6dop9vaing" \ --command -- psql --host postgres-statefulset-0.postgresql-headless-svc \ -U postgres -p 5432 Scale, Replication, and Backup To scale up a Statefulset, simply pass the number of replicas with --replicas flag.  kubectl scale statefulset postgres-statefulset --replicas=3  To reach replicas, you can make use of headless service. For instance, with hostname postgres-statefulset-1.postgresql-headless-svc you can send requests to pod 1. For handling backups, you can use CronJob with the pg_dump utility provided by PostgreSQL. Best Practices Throughout the tutorial, the decision to handle passwords via Kubernetes Secret, using StatefulSet instead of Deployment was a good move. To make this deployment even more secure, reliable, and highly available, here are some ideas: Set Resource Requests and Limits: Set appropriate CPU and memory requests and limits to avoid over-provisioning and under-provisioning. Backups: Use Kubernetes CronJobs to regularly back up your PostgreSQL data. Consider implementing Volume Snapshots as well. Monitoring and Log Postgresql: You can use tools like Prometheus and Grafana to collect and visualize PostgreSQL metrics, such as query performance, disk usage, and replication status. Use Pod Disruption Budgets (PDBs): If too many PostgreSQL pods are disrupted at once (e.g., during a rolling update), it can lead to database unavailability or replication issues. Conclusion Helm chart is the recommended way of complex and production deployment. Helm provides an automated version manager alongside hiding the complexities of configuring individual Kubernetes components. Using the Helm template command, you can even render the Helm chart locally and make necessary adjustments with its YAML Kubernetes manifests. Kubernetes provides scalability, flexibility, and ease of automation for PostgreSQL databases. By leveraging Kubernetes features like StatefulSets, PVCs, PDBs, and secrets management, you can ensure that your PostgreSQL database is tuned for the production environment.
24 January 2025 · 8 min to read
PostgreSQL

How to Set Up Physical Streaming Replication with PostgreSQL on Ubuntu

Streaming replication is a common method for horizontally scaling relational databases. It involves one or more copies of the same database cluster operating on different devices. The primary database cluster handles both read and write operations, while the replicas are read-only. We can also use streaming replication to provide high availability: if the primary database cluster or server fails unexpectedly, the replicas can continue handling read operations, or one of them can be promoted to become the new primary cluster. PostgreSQL, a popular relational database, supports both logical and physical replication: Logical replication streams high-level changes from the primary cluster to replicas, allowing you to replicate changes to a single database or table. Physical replication, on the other hand, streams changes from the Write-Ahead Log (WAL) files, copying the entire cluster's state rather than specific areas. This method ensures that all changes to the primary cluster are replicated. This guide will help you set up physical streaming replication with PostgreSQL on Ubuntu 22.04 across two separate devices, each running PostgreSQL 17 clusters. One device will host the primary cluster, and the other will serve as the replica. Hostman offers a cloud PostgreSQL for your projects.  Prerequisites To follow this tutorial, you will need: Two separate devices running Ubuntu 22.04: One will act as the primary server and the other as the replica. Firewall settings that allow HTTP/HTTPS traffic and traffic on port 5432 (the default port for PostgreSQL 17). PostgreSQL 17 installed and running on both servers. Step 1: Configuring the Primary Database to Accept Connections The first step is to configure the primary database to allow connections from the replica(s). By default, PostgreSQL only accepts connections from localhost (127.0.0.1). To change this behavior, you need to modify the listen_addresses configuration parameter in the primary database. On the primary server, open the PostgreSQL configuration file postgresql.conf, located in the /etc/postgresql/17/main/ directory: sudo nano /etc/postgresql/17/main/postgresql.conf Once the file is open, find the listen_addresses variable and change its value from localhost to the IP address of the primary server. Remove the # symbol at the beginning of the line as well: listen_addresses = 'your_primary_IP_address' Save the changes and exit the file. The primary database is now ready to accept connections from other devices using the specified IP address. Next, you need to create a user role with the appropriate permissions that the replica will use to connect to the primary database. Step 2: Creating a Replication Role with Permissions Next, you need to create a dedicated role in the primary database with permissions for database replication. The replica will use this role to connect to the primary database. Creating a specific role for replication is crucial for security, as the replica will only have permission to copy data, not modify it. Connect to the database cluster: Log in as the postgres user by running: sudo -u postgres psql Create a replication role: Use the CREATE ROLE command to set up a role for replication: CREATE ROLE test WITH REPLICATION PASSWORD 'testpassword' LOGIN; This will output: CREATE ROLE We have now created the test role with the password testpassword, which has replication permissions for the database cluster. Configure access for replication: PostgreSQL has a special pseudo-database, replication, which replicas use to connect. To allow access, edit the pg_hba.conf file. Exit the PostgreSQL prompt by typing: \q Then open the configuration file using nano or your preferred editor: sudo nano /etc/postgresql/17/main/pg_hba.conf Add a rule for the replica: Append the following line to the end of the pg_hba.conf file: host  replication   test  your-replica-IP/32  md5 host: Enables non-local connections over plain or SSL-encrypted TCP/IP sockets. replication: Specifies the special pseudo-database used for replication. test: Refers to the previously created replication role. your-replica-IP/32: Restricts access to the specific IP address of your replica. md5: Sets the authentication method to password-based. If you plan to create multiple replicas, repeat this step for each additional replica, specifying its IP address. Restart the primary database cluster: To apply these changes, restart the primary cluster: sudo systemctl restart postgresql@17-main If the primary cluster restarts successfully, it is properly configured and ready to stream data once the replica connects. Next, proceed with configuring the replica cluster. Step 3: Backing Up the Primary Cluster to the Replica During the setup of physical replication with PostgreSQL, you need to perform a physical backup of the primary cluster’s data directory to the replica’s data directory. Before doing this, you must clear the replica’s data directory of all existing files. On Ubuntu, the default data directory for PostgreSQL is /var/lib/postgresql/17/main/. To find the data directory, you can run the following command on the replica database: SHOW data_directory; Once you locate the data directory, run the following command to clear all files: sudo -u postgres rm -r /var/lib/postgresql/17/main/* Since the files in the default data directory are owned by the postgres user, you need to run the command as postgres using sudo -u postgres. Note: If a file in the directory is corrupted and the command does not work (this is very rare), you can remove the main directory entirely and recreate it with the correct permissions: sudo -u postgres rm -r /var/lib/postgresql/17/mainsudo -u postgres mkdir /var/lib/postgresql/17/mainsudo -u postgres chmod 700 /var/lib/postgresql/17/main Now that the replica’s data directory is cleared, you can physically back up the primary server’s data files. PostgreSQL provides a useful utility called pg_basebackup to simplify this process. It even allows you to promote the server to standby mode using the -R option. Run the following pg_basebackup command on the replica: sudo -u postgres pg_basebackup -h primary-ip-addr -p 5432 -U test -D /var/lib/postgresql/17/main/ -Fp -Xs -R -h: Specifies the remote host. Enter the IP address of your primary server. -p: Specifies the port number for connecting to the primary server. By default, PostgreSQL uses port 5432. -U: Specifies the user role to connect to the primary cluster (the role created in the previous step). -D: Specifies the backup's destination directory, which is your replica's cleared data directory. -Fp: Ensures the backup is output in plain format (instead of a tar file). -Xs: Streams the contents of the WAL file during the backup from the primary database. -R: Creates a file named standby.signal in the replica’s data directory, signaling that the replica should operate in standby mode. It also adds the connection information for the primary server to the postgresql.auto.conf file. This configuration file is read each time the standard postgresql.conf is read, but the values in the .auto.conf file override those in the regular configuration file. When you run this command, you will be prompted to enter the password for the replication role created earlier. The time required to copy all the files depends on the size of your primary database cluster. At this point, your replica now has all the necessary data files from the primary server to begin replication. Next, you need to configure the replica to start in standby mode and proceed with replication. Step 4: Restarting and Testing Clusters After successfully creating a backup of the primary cluster’s data files on the replica, you need to restart the replica database cluster and switch it to standby mode. To restart the replica, run the following command: sudo systemctl restart postgresql@17-main Once the replica has restarted in standby mode, it should automatically connect to the primary database cluster on the other machine. To check whether the replica is connected and receiving the stream from the primary server, connect to the primary database cluster with the following command: sudo -u postgres psql Next, query the pg_stat_replication table on the primary cluster as follows: SELECT client_addr, state FROM pg_stat_replication; The output should look something like this: client_addr  | state----------------+-----------your_replica_IP | streaming If you see this result, the streaming replication from the primary server to the replica is correctly set up. Conclusion You now have two Ubuntu 22.04 servers with PostgreSQL 17 clusters, and streaming replication is configured between the servers. Any changes made in the primary database cluster will be reflected in the replica cluster. You can add more replicas if your databases need to handle higher traffic. To learn more about physical streaming replication, including how to configure synchronous replication to prevent the loss of critical data, refer to the official PostgreSQL documentation.
20 December 2024 · 8 min to read

Do you have questions,
comments, or concerns?

Our professionals are available to assist you at any moment,
whether you need help or are just unsure of where to start.
Email us
Hostman's Support