PostgreSQL is a free, open-source object-relational database. Object-relational databases differ from regular relational ones. Data is still stored in tables, with columns that are linked to each other. However, PostgreSQL operates according to ACID standards (Atomicity, Consistency, Isolation, and Durability), ensuring data accuracy through consistency and atomicity of operations within the tables. Changes are made sequentially, allowing for the immediate detection of failures while writing values.
PostgreSQL supports Multi-Version Concurrency Control (MVCC), a database feature that creates copies of records during modifications, protecting against loss and conflicts during simultaneous read or write operations.
The indexing system in PostgreSQL is more complex and operates faster, using trees and various types of indexing, such as partial, hash, and expression indexing.
The syntax of PostgreSQL is similar to MySQL, but PostgreSQL supports additional subqueries like LIMIT
or ALL
.
Furthermore, PostgreSQL is compatible with a large number of programming languages. The most common ones are:
Projects involving user registration and authentication must store passwords on the server side, — usually encrypted. Databases are the most suitable tool for this.
In this article, we will discuss how to properly store passwords in a database (writing and reading them) using PostgreSQL while ensuring all security measures are followed.
Before discussing storing passwords in databases, we should note that no serious project should store credentials in plaintext, meaning unencrypted, human-readable text. Data is always encrypted. Always.
Here are some brief reasons why:
The developer's servers or the application host's servers can be hacked, for example, through an SQL injection that could retrieve a password string from the database. Unencrypted data will immediately fall into the hands of hackers and be compromised. From that point on, we can only guess what will happen next and what the consequences could be.
Users almost always ignore security recommendations — they don't use password managers or disregard password naming rules for different services. A password leak in one application can lead to the compromise of accounts in other services.
Users will easily accuse the service provider (the application developer) of dishonesty if they can read passwords. This can happen even if no illegal actions were taken by employees. It would be a blow to the reputation of the company or project.
First, it's important to note that when dealing with passwords, what is done is not encryption but rather hashing.
It’s crucial to understand that if something is encrypted, it can always be decrypted. Encrypted information is the same information, just represented differently.
However, hashing works differently. A hash is completely new, unique information derived from some input data — in our case, from a password.
The key point is that retrieving the original data from a hash is impossible (theoretically, possible but practically not feasible). In short, hashing is a one-way operation.
Here are some other less obvious drawbacks of encryption that have made hashing the preferred method over time:
Unlike a hash, an encryption method has a variable length, which isn’t ideal for storage within a database or for sending server (or client) packets.
Generating an encryption key takes more computational time than generating a hash.
When using encryption, key management becomes necessary. This means the keys must be stored somewhere and must be kept secure to prevent unauthorized access.
What Does a Hash Look Like?
A hash is essentially a string of random characters — a set of symbols that has no inherent meaning. The algorithm that generates such a string is called a hash function.
Currently, hashing can only be broken by brute-force attempts. This method is quite crude and only works on initially weak passwords. Hackers simply try a long list of common passwords based on a dictionary. Each password is hashed and sent to the targeted server for authentication attempts. This continues until a match is found. So, no magic is involved.
At the same time, hashing itself is a computationally complex task. Deriving information from the original data (the password) takes time. Some hash functions generate longer keys (for example, through repeated hashing) specifically to increase the time needed for generation. In this case, a dictionary brute-force attack takes more time, giving the security team or the user an advantage.
PostgreSQL has a built-in extension specifically designed for password hashing, so it does not need to be loaded separately. To activate it, you need to run the following command:
CREATE EXTENSION pgcrypto;
This command will load the available extension into your current database. Essentially, this will execute the extension script that adds new SQL objects, such as functions, data types, operators, and indexing methods.
To make the hash even more secure, a salt is added during the hashing operation.
The issue is that a hash function always generates the same value for specific input data. This characteristic leads to several problems:
Two identical passwords for two different users will have the same hash, whereas it would be better if they were different.
Hackers use so-called rainbow tables — pre-hashed dictionaries with commonly used passwords to avoid unnecessary hash calculations during a brute-force attack.
The solution is simple: use not only the password but also an additional piece of text — the salt — as input during hashing.
A salt is a pseudorandom string that ensures the uniqueness of the resulting hash.
PostgreSQL has a gen_salt()
function that allows you to pass the type of cryptographic algorithm as an argument:
md5
(MD5)des
(DES)xdes
(Extended DES)bf
(Blowfish)For example, here’s how you can get a salt using the fairly popular MD5:
SELECT gen_salt('md5');
In reality, many developers no longer recommend using MD5 as it is considered insecure.
Now that we've covered salt, let's look at the options for hashing itself.
Whenever a user creates a new password or changes an existing one, PostgreSQL needs to store its hash.
The generation itself is done using the built-in crypt()
function. It has two arguments:
Thus, the process of generating the hash along with the salt looks as follows:
SELECT crypt('password', gen_salt('md5'));
By the way, if for any reason you don't want the hash to be unique, you can simply pass a constant value as the second argument:
SELECT crypt('password', 'nothing');
Interestingly, verifying a password is done using the same hashing function. The only difference is the arguments.
For example, to check if the password "password" matches its hash, you would run:
SELECT crypt('password', hash);
Notice that the salt is not specified here; it is only the password and the hash. The output of this command will be the same hash created earlier and provided as the second argument.
However, if the password is incorrect, the hash will differ:
SELECT crypt('another_password', hash);
To reiterate, calling crypt
with the password "password" and the hash of this password (hash
) will return the same hash as hash
. In any other case, the output will differ.
In a real project, credentials are stored in tables and are read or written as needed.
Therefore, we will create a table accounts with three columns: identifier, login, and password hash:
CREATE table accounts (identifier SERIAL, login VARCHAR(10), password VARCHAR(100));
Next, let's populate the table with some example credentials:
INSERT INTO accounts (login, password)
VALUES ('login_1', crypt('some_password', gen_salt('md5')));
This is roughly how you can store a password in a PostgreSQL database. The login field is also specified, which is typically the user's email or phone number.
Whenever a user changes their password, the data in the table needs to be updated. In the most basic case, the query to update the hash of the new password looks as follows:
UPDATE accounts
SET password = crypt('new_password', gen_salt('md5'))
WHERE login = 'login_1';
During user authentication, the user's credentials are retrieved from the database, and the password hashes are compared:
SELECT (password = crypt(entered_password, password))
AS password_match
FROM accounts
WHERE login = 'login_1';
If password_match
equals t
(true), the passwords match. If it equals f
(false), the passwords are different.
By the way, the gen_salt
function has an additional argument — the number of iterations. This works only with the xdes
and bf
algorithms:
The number of iterations for xdes
can be any odd number between 1 and 16777215. The default is 725.
The number of iterations for bf
can be any integer between 4 and 31. The default is 6.
For example, here’s how to set the number of iterations for Extended DES:
SELECT crypt('password', gen_salt('xdes', 963));
Of course, the number of iterations affects the hashing time and increases password security. The choice of the right number is a balance between speed and security. A high value may reduce convenience for both users and developers, while a low value reduces resistance. On average, server hardware is expected to compute between 15 to 100 hashes per second.
One approach to prevent sending the password (from the client to the server) in plain text is to hash the password on the application side. This, in fact, complicates the client-server communication mechanism, but it is sometimes used in certain cases. However, most web resources use HTTPS encryption, which allows transmitting sensitive data in "plain" form over a secure connection.
Another option is to hash the password on the server-side application level (not on the client-side), rather than in the database itself. In this case, the pre-hashed password is placed in the database table, just like any other value.
Many popular programming languages provide ready-made modules or libraries designed for password hashing. These are not just hash functions but higher-level constructs that allow you to work directly with password hashes and authentication.
One such module is the bcrypt
package based on the Blowfish algorithm. The interface may vary from language to language, but the functionality remains the same.
Here is a simple example of using bcrypt
in Python:
import bcrypt
# Generate a hash before sending it to the DB
def generate_hash(password):
# Convert string to bytes
password_bytes = password.encode("utf-8")
# Generate salt
password_salt = bcrypt.gensalt()
# Generate hash
hash_bytes = bcrypt.hashpw(password_bytes, password_salt)
# Convert bytes back to string
hash_str = hash_bytes.decode("utf-8")
return hash_str
# Authenticate by checking the password against the hash
def authenticate(password, hash):
# Convert both password and hash from string to bytes
password_bytes = password.encode("utf-8")
hash_bytes = hash.encode("utf-8")
# The salt is automatically (cryptographically) "discovered" in the hash, so it's not needed separately during verification
result = bcrypt.checkpw(password_bytes, hash_bytes)
return result
Subsequently, API function calls are added to either send the generated hash to the database or read it from the database during the authentication process.
In any case, each programming language has its own special libraries for working with hashes. Therefore, depending on which language you use, you should study the relevant documentation, standards, or open-source libraries that simplify working with passwords.
And most importantly — do not reinvent the wheel. Both PostgreSQL's built-in functions (extensions) and time-tested external library solutions are created by experienced developers who have gone through numerous iterations of bug fixes and vulnerability patches.
There is no sense in creating your own cryptographic "mashups," naively thinking that it would be a better solution. It will most likely lead to numerous internal issues and increase the chances of hacks.
Storing passwords on the server side requires responsibility from the application's developer and system administrator. Therefore, it carries certain reputational risks in the event of failures or hacker attacks.
Authentication is a separate IT domain. Creating a reliable authentication system requires both experience and time. This is why there has been a growing trend toward "outsourcing" authentication. More and more services rely on external authentication systems, whose developers specialize mainly in security rather than business logic. This is a form of division of labor.
For example, there are protocols (standards) like OpenID and OAuth 2.0. The latter is used in Google API for user authentication, so anyone can integrate Google authentication into their application or online service.
This is also beneficial for users, as they can log in using their familiar email, avoiding the creation of numerous credentials, which always come with the risk of being lost.
However, password-based authentication remains a reliable (albeit conservative) method. Therefore, securely storing user passwords in the database is essential to such an implementation.