Indexes in SQL: Creation, Types, and How They Work

Technical writer

SQL

12.09.2024

Reading time: 6 min

When working with SQL (Structured Query Language), you may encounter an object called an "index." In this article, we'll discuss indexes and practical examples of working with them.

An index is an object created for one or more columns in a database table to enhance performance, specifically to speed up searching and retrieving the necessary data from the database.

To understand what an index is, consider a real-life analogy — a book. A book has a table of contents or an index, which helps us quickly find the section we need. Without an index, we'd have to search for the section manually, which would take much longer. Database indexing works similarly by enabling fast data retrieval.

How Indexes Work

An index in SQL is a structure built for table columns and view objects. It consists of keys constructed from one or more columns in a table. The keys are stored in a balanced tree structure (a tree-like structure designed for quick data access). This structure starts from a root node at the top of the hierarchy and continues to leaf nodes at the bottom.

The key advantage of indexes is the speed of data lookup, achieved because the index is built from a structure (a balanced tree) optimized for searching.

It's also important to note that whenever new data is added, or old data is removed, the tree structure of the index is recalculated. This means the more data and indexes stored in a database, the more trees need to be recalculated. For example, if a table has five indexes and 10,000 records, each new entry will trigger the recalculation of all five indexes.

Types of Indexes

There are several types of indexes:

Unique Index: Ensures all values are unique. Adding a duplicate value will cause an error.
Non-Unique Index: Allows duplicate values.
Simple Index:Built on a single column.
Composite Index: Built on multiple columns, and the column order matters.
B-tree Index: Represented by a root node and leaf nodes.
Partial Index: Created from a subset of table rows based on a specific condition.

Clustered and Non-Clustered Indexes

There are also clustered and non-clustered indexes. Let's examine these more closely using PostgreSQL 15 as an example.

Clustered Index

A clustered index in SQL sorts the data rows in a table and stores the data within the index's leaves. The key feature of a clustered index is that all the values are sorted in a specific order, either ascending or descending. Data in a table is physically sorted only if the table has a clustered index.

With a clustered index, data is physically organized on a disk, speeding up data retrieval when accessed sequentially. However, reorganizing the data can be costly and may require rebuilding. Unlike other indexes, a clustered index is created for the entire table, not just one or more columns. A table can have only one clustered index.

Non-Clustered Index

A non-clustered index is applied to non-key columns and does not physically order the data. Instead, it stores the index separately from the actual data. The leaves of a non-clustered index contain only the columns included in the index. This means additional operations are required to retrieve the necessary data. Non-clustered indexes can't be sorted like clustered ones, but multiple non-clustered indexes can be created for a table. You can also add unique constraints to a non-clustered index.

Comparison of Clustered and Non-Clustered Indexes

Clustered Index	Non-Clustered Index
Sorts and physically stores data according to the sort rule	Does not physically sort the data; uses pointers to access data
Data is stored in the leaf nodes of the index	Does not store data in the leaf nodes
Occupies more disk space	Occupies less disk space
Faster access to data	Slower access to data
No additional disk space needed	Requires additional disk space for index storage
Increases performance for data retrieval	Applied only to columns used in joins or queries

Creating Clustered and Non-Clustered Indexes

Here are examples using PostgreSQL 15. Let’s create a table named movies that contains information about films. The table includes two columns: id (the unique identifier for each film) and title (the film's name). The SQL code to create this table is as follows:

CREATE TABLE movies (
    id SERIAL PRIMARY KEY,
    title TEXT NOT NULL
);

To create an index for the id column, use:

CREATE INDEX cluster_id ON movies (id);

You can then create a clustered index using the following command:

CLUSTER movies USING cluster_id;

This clustered index speeds up data retrieval.

To create a non-clustered index for the title column:

CREATE INDEX non_clustered_index ON movies (title);

Database Structure Example

Consider a table orders that stores information about orders in an online store. We'll create this table with several necessary columns:

CREATE TABLE orders (             
    order_id INT PRIMARY KEY,
    client_id INT,
    client_name VARCHAR(100) NOT NULL,
    client_address VARCHAR(255) NOT NULL,
    client_city VARCHAR(100) NOT NULL,
    client_country VARCHAR(100) NOT NULL,
    client_ip_address inet
);

Creating SQL Indexes

To create an index, use the CREATE INDEX command. The general syntax is:

CREATE INDEX <index_name> ON <table_name> (<column1>, <column2>);

For example, to create an index for the order_id column in the orders table:

CREATE INDEX index_for_order ON orders (order_id);

You can create indexes for multiple columns:

CREATE INDEX index_for_order ON orders (order_id, client_id);

To create a unique index:

CREATE UNIQUE INDEX index_for_order ON orders (order_id);

To create a partial index:

CREATE INDEX clients_ip ON orders (client_ip_address)
WHERE NOT (client_ip_address > '10.26.74.1' AND client_ip_address < '10.26.74.2');

Deleting and Modifying Indexes

You can delete indexes with the DROP INDEX command:

DROP INDEX index_for_order;

To rename an index:

ALTER INDEX index_for_order RENAME TO new_index_for_order;

Best Practices for Indexes

While indexes are useful, there are rules to follow for their effective use:

Avoid indexes in small tables.
Avoid indexes in tables with frequent data changes (inserts/updates).
Avoid indexes on columns that will undergo complex data selection queries.
Avoid indexes on columns that frequently contain NULL values.
Use indexes on columns that are frequently searched.

Conclusion

In this article, we explored what an SQL index is and how it is used in databases. Proper use of indexing in SQL can significantly improve the performance of queries in your database.

Hostman provides a SQL database hosting for your needs.

SQL