How To Use Nested Queries in SQL

25 Dec 2024

Nested queries, usually referred to as subqueries, are a fundamental feature of SQL that empower users To execute advanced data retrieval and analysis. By embedding one query within another, tasks that might otherwise be challenging or unfeasible utilizing a unique query can be efficiently executed. This tutorial outlines the concept of nested queries through the use of a managed database and explores their various applications for extracting meaningful insights.

Creating the Sample Database

To demonstrate the potential of nested searches, assume a database called Company. It consists of two primary tables: Employees and Departments. The Employees table contains information about individual employees, while the Departments table provides data about the departments they are associated with. This structured setup serves as the foundation for demonstrating how several types of nested queries can address specific problems.

-- Create the database called Company
CREATE DATABASE Company ;
USE Company ;

-- Create the Departments table
CREATE TABLE Departments (
    department_id INT PRIMARY KEY,
    department_name VARCHAR(50),
    location VARCHAR(50)
);

-- Insert data into Departments
INSERT INTO Departments VALUES
(101, 'Sales', 'New York'),
(102, 'HR', 'Chicago'),
(103, 'IT', 'San Francisco');


-- Create the Employees table
CREATE TABLE Employees (
    employee_id INT PRIMARY KEY,
    name VARCHAR(50),
    department_id INT,
    salary DECIMAL(10, 2),
    hire_date DATE,
    FOREIGN KEY (department_id) REFERENCES Departments(department_id)
);

-- Insert data into Employees
INSERT INTO Employees VALUES
(1, 'Alice', 101, 60000, '2020-01-15'),
(2, 'Bob', 102, 55000, '2018-03-22'),
(3, 'Charlie', 101, 70000, '2019-11-01'),
(4, 'David', 103, 50000, '2021-06-10'),
(5, 'Eve', 102, 45000, '2017-07-19');

The tables should look like this:

The Departments table

The Employees table

Applications of Nested Queries

Single-Row Subqueries

A frequent scenario for single-row subqueries is extracting employees' wages that surpass the company's overall average. In this scenario, an inner query computes the overall average wage, while an outer query retrieves the employees earning above this benchmark. Such queries are particularly effective for leveraging aggregate functions like AVG, MAX, or MIN.

-- Select the name and salary of employees
SELECT name, salary
FROM Employees
-- Where the salary is greater than the average salary of all employees
WHERE salary > (SELECT AVG(salary) FROM Employees);

Multi-Row Subqueries

Another practical application involves listing employees who work in departments based in a specific location, such as New York. Here, the inner query identifies the relevant department IDs, and the outer query selects employees linked to these departments. Multi-row subqueries depend on operators like IN, ANY, or ALL to compare sets of values and reach the desired results..

-- Select the name of employees
SELECT name
FROM Employees
-- Where the department ID is in the list of department IDs from the Departments table
WHERE department_id IN (SELECT department_id FROM Departments WHERE location = 'New York');

Correlated Subqueries

Correlated subqueries support more dynamic comparisons by tying the inner query to each row of the outer query. For locating employees earning more than the average wage within their respective departments, the inner query computes the department-specific average, and the outer query selects employees based on this criterion. While highly flexible, correlated subqueries could be computationally intensive.

SELECT name
FROM Employees e1
-- Where the salary is greater than the average salary of employees in the same department
WHERE salary > (SELECT AVG(salary) FROM Employees e2 WHERE e1.department_id = e2.department_id);

Subqueries in the FROM Clause

Subqueries can be employed in the FROM clause to generate temporary result sets, often referred to as derived tables. For example, locating departments with average salaries above $50,000 entails calculating department-level salary averages in the inner query and filtering the results in the outer query. This approach is particularly useful for organizing intermediate data before applying further analysis.

SELECT department_name, avg_salary
FROM (SELECT department_id, AVG(salary) AS avg_salary FROM Employees GROUP BY department_id) AS avg_table
-- Join the average salary table with the Departments table on department ID
JOIN Departments ON avg_table.department_id = Departments.department_id
-- Filter the results to include only departments with an average salary greater than 50,000
WHERE avg_salary > 50000;

Data Validation with Nested Queries

Nested queries are valuable for validating data integrity. For example, identifying employees associated with non-existent departments involves comparing employee department IDs against a list of valid IDs retrieved by the inner query. This technique helps ensure referential accuracy and detect potential anomalies in the data.

SELECT name
FROM Employees
WHERE department_id NOT IN (SELECT department_id FROM Departments);

Conditional Logic with Subqueries

Combining subqueries with conditional logic allows for more nuanced insights. For example, to identify departments with at least one employee earning more than $60,000, the inner query selects relevant department IDs based on salary criteria, and the outer query gathers the related department names. This method highlights meaningful relationships between tables through filtering and comparison.

SELECT DISTINCT department_name
FROM Departments
WHERE department_id IN (
    -- Select the department ID from the Departments table
    SELECT department_id
    FROM Employees
    WHERE salary > 60000
);

Best Practices for Using Nested Queries

Optimize for Performance: When working with huge datasets, evaluate the performance of nested searches and consider indexing commonly used columns to increase efficiency.
Simplify Complex Queries: Prevent excessive nesting by leveraging common table expressions (CTEs) or temporary tables, which improve readability and simplify debugging.
Validate Inner Queries: Run inner queries independently to ensure they produce the expected results before integrating them into outer queries.
Utilize Joins Where Possible: In some scenarios, joins can achieve similar outcomes as nested queries but with better performance. For instance, filtering employees in specific departments can often be implemented using joins.
Reduce Correlated Subqueries: Since correlated subqueries execute for each row in the outer query, consider replacing them with joins or CTEs to improve performance.

And if you’re looking for a reliable, high-performance, and budget-friendly solution for your workflows, Hostman has you covered with Linux VPS Hosting options, including Debian VPS, Ubuntu VPS, and VPS CentOS.

Conclusion

Nested queries are a versatile tool in SQL, offering solutions to sophisticated data retrieval challenges through advanced filtering, aggregation, and comparison techniques. Using the Company database as a reference, this discussion has showcased the utility of various types of nested queries in solving real-world problems. By practicing these techniques and adhering to best practices, you can enhance your SQL proficiency and craft efficient, maintainable queries.

With Hostman, you can use pre-configured cloud SQL databases.

How To Use Nested Queries in SQL

Creating the Sample Database Copy link

Applications of Nested Queries Copy link

Single-Row Subqueries Copy link

Multi-Row Subqueries Copy link

Correlated Subqueries Copy link

Subqueries in the FROM Clause Copy link

Data Validation with Nested Queries Copy link

Conditional Logic with Subqueries Copy link

Best Practices for Using Nested Queries Copy link

Conclusion Copy link