Nested queries, usually referred to as subqueries, are a fundamental feature of SQL that empower users To execute advanced data retrieval and analysis. By embedding one query within another, tasks that might otherwise be challenging or unfeasible utilizing a unique query can be efficiently executed. This tutorial outlines the concept of nested queries through the use of a sample database and explores their various applications for extracting meaningful insights.
To demonstrate the potential of nested searches, assume a database called Company
. It consists of two primary tables: Employees
and Departments
. The Employees
table contains information about individual employees, while the Departments
table provides data about the departments they are associated with. This structured setup serves as the foundation for demonstrating how several types of nested queries can address specific problems.
-- Create the database called Company
CREATE DATABASE Company ;
USE Company ;
-- Create the Departments table
CREATE TABLE Departments (
department_id INT PRIMARY KEY,
department_name VARCHAR(50),
location VARCHAR(50)
);
-- Insert data into Departments
INSERT INTO Departments VALUES
(101, 'Sales', 'New York'),
(102, 'HR', 'Chicago'),
(103, 'IT', 'San Francisco');
-- Create the Employees table
CREATE TABLE Employees (
employee_id INT PRIMARY KEY,
name VARCHAR(50),
department_id INT,
salary DECIMAL(10, 2),
hire_date DATE,
FOREIGN KEY (department_id) REFERENCES Departments(department_id)
);
-- Insert data into Employees
INSERT INTO Employees VALUES
(1, 'Alice', 101, 60000, '2020-01-15'),
(2, 'Bob', 102, 55000, '2018-03-22'),
(3, 'Charlie', 101, 70000, '2019-11-01'),
(4, 'David', 103, 50000, '2021-06-10'),
(5, 'Eve', 102, 45000, '2017-07-19');
The tables should look like this:
The Departments table
The Employees table
A frequent scenario for single-row subqueries is extracting employees' wages that surpass the company's overall average. In this scenario, an inner query computes the overall average wage, while an outer query retrieves the employees earning above this benchmark. Such queries are particularly effective for leveraging aggregate functions like AVG
, MAX
, or MIN
.
-- Select the name and salary of employees
SELECT name, salary
FROM Employees
-- Where the salary is greater than the average salary of all employees
WHERE salary > (SELECT AVG(salary) FROM Employees);
Another practical application involves listing employees who work in departments based in a specific location, such as New York. Here, the inner query identifies the relevant department IDs, and the outer query selects employees linked to these departments. Multi-row subqueries depend on operators like IN
, ANY
, or ALL
to compare sets of values and reach the desired results..
-- Select the name of employees
SELECT name
FROM Employees
-- Where the department ID is in the list of department IDs from the Departments table
WHERE department_id IN (SELECT department_id FROM Departments WHERE location = 'New York');
Correlated subqueries support more dynamic comparisons by tying the inner query to each row of the outer query. For locating employees earning more than the average wage within their respective departments, the inner query computes the department-specific average, and the outer query selects employees based on this criterion. While highly flexible, correlated subqueries could be computationally intensive.
SELECT name
FROM Employees e1
-- Where the salary is greater than the average salary of employees in the same department
WHERE salary > (SELECT AVG(salary) FROM Employees e2 WHERE e1.department_id = e2.department_id);
Subqueries can be employed in the FROM
clause to generate temporary result sets, often referred to as derived tables. For example, locating departments with average salaries above $50,000 entails calculating department-level salary averages in the inner query and filtering the results in the outer query. This approach is particularly useful for organizing intermediate data before applying further analysis.
SELECT department_name, avg_salary
FROM (SELECT department_id, AVG(salary) AS avg_salary FROM Employees GROUP BY department_id) AS avg_table
-- Join the average salary table with the Departments table on department ID
JOIN Departments ON avg_table.department_id = Departments.department_id
-- Filter the results to include only departments with an average salary greater than 50,000
WHERE avg_salary > 50000;
Nested queries are valuable for validating data integrity. For example, identifying employees associated with non-existent departments involves comparing employee department IDs against a list of valid IDs retrieved by the inner query. This technique helps ensure referential accuracy and detect potential anomalies in the data.
SELECT name
FROM Employees
WHERE department_id NOT IN (SELECT department_id FROM Departments);
Combining subqueries with conditional logic allows for more nuanced insights. For example, to identify departments with at least one employee earning more than $60,000, the inner query selects relevant department IDs based on salary criteria, and the outer query gathers the related department names. This method highlights meaningful relationships between tables through filtering and comparison.
SELECT DISTINCT department_name
FROM Departments
WHERE department_id IN (
-- Select the department ID from the Departments table
SELECT department_id
FROM Employees
WHERE salary > 60000
);
Nested queries are a versatile tool in SQL, offering solutions to sophisticated data retrieval challenges through advanced filtering, aggregation, and comparison techniques. Using the Company database as a reference, this discussion has showcased the utility of various types of nested queries in solving real-world problems. By practicing these techniques and adhering to best practices, you can enhance your SQL proficiency and craft efficient, maintainable queries.
Hostman provides pre-configured and ready-to-use cloud SQL databases.