And look, plain single‑table queries are fine… if your data model was designed by a golden retriever. For actual systems, you’re going to need subqueries talking to other subqueries, sometimes about other subqueries. This lesson is where that madness becomes intentional.
After this chapter you should be able to:
- Write multiple‑column subqueries (pairwise vs non‑pairwise)
- Use scalar subqueries anywhere a single value can go
- Solve problems with correlated subqueries
- Use
EXISTS/NOT EXISTSlike a grown‑up - Use the
WITHclause (including recursiveWITH) to keep big queries sane
1. Subqueries as Data Sources
You’ve already used subqueries in WHERE and HAVING. Here we lean into them as full‑blown data sources.
Example pattern:
SELECT d.department_name,
v.avg_sal
FROM departments d
NATURAL JOIN (
SELECT department_id, AVG(salary) AS avg_sal
FROM employees
GROUP BY department_id
) v;
Why do this?
- To prototype the query that will become a view
- To factor complex logic into a readable “inner result set”
- To join base tables to pre‑aggregated or filtered subsets
Think of the inner SELECT as a disposable view you didn’t have to CREATE.
2. Multiple‑Column Subqueries: Pairwise vs Non‑Pairwise
Sometimes your condition depends on combinations of columns, not just one. That’s where multiple‑column subqueries show up.
2.1 The “John and that John’s manager” problem
Goal: > “Find employees who work with John and are managed by John’s manager – but don’t show John.”
Interpretation 1 – “any John, any manager” (non‑pairwise):
SELECT last_name, department_id, manager_id
FROM employees
WHERE department_id IN (
SELECT department_id
FROM employees
WHERE first_name = 'John'
)
AND manager_id IN (
SELECT manager_id
FROM employees
WHERE first_name = 'John'
)
AND first_name <> 'John';
Here you separately:
- Grab all departments containing a “John”
- Grab all managers of any “John”
- Combine them like a Cartesian dating app
Result: potentially more rows than intended, because any “John department” + any “John manager” is allowed.
2.2 Pairwise comparison – “that John’s manager”
Use a multi‑column subquery to keep department and manager paired:
SELECT last_name, department_id, manager_id
FROM employees e
WHERE (department_id, manager_id) IN (
SELECT department_id, manager_id
FROM employees
WHERE first_name = 'John'
)
AND first_name <> 'John';
Key idea: (department_id, manager_id) on the outer row must match one of the (department_id, manager_id) pairs returned by the subquery. No cross‑mixing.
Takeaway:
(col1, col2) IN (SELECT col1, col2 …)→ pairwisecol1 IN (SELECT col1 …) AND col2 IN (SELECT col2 …)→ non‑pairwise, more permissive
3. Scalar Subqueries: Tiny Queries in Weird Places
A scalar subquery returns exactly one row, one column. Oracle will then happily treat it like a single value.
You can use scalar subqueries:
- Inside
CASE/DECODE - In
SELECT,WHERE,ORDER BY,HAVING - In
UPDATESETandWHEREclauses - Basically anywhere an expression is allowed (except
GROUP BY)
3.1 Scalar subquery in a CASE expression
SELECT employee_id,
last_name,
CASE
WHEN department_id = (
SELECT department_id
FROM departments
WHERE department_name = 'Sales'
)
THEN 'Canada'
ELSE 'USA'
END AS country_guess
FROM employees;
Here the inner query returns a single department_id (for Sales). The CASE reads like:
> if this employee’s department_id equals that value, label ‘Canada’, else ‘USA’.
3.2 Scalar subquery as a “derived column”
Count employees per department without a JOIN:
SELECT d.department_id,
d.department_name,
(SELECT COUNT(*)
FROM employees e
WHERE e.department_id = d.department_id) AS emp_count
FROM departments d;
For each department row, the scalar subquery:
- Uses
d.department_idfrom the outer query - Counts employees in that department
- Returns a single number as
emp_count
Yes, it’s technically a correlated subquery too – welcome to overlapping terminology.
4. Correlated Subqueries: The Ping‑Pong Pattern
Normal subquery: inner query runs once, outer query uses the result. Correlated subquery: outer row feeds into the inner query, over and over.
Process:
- Take a candidate row from the outer query
- Plug its values into the subquery
- Use subquery result to accept/reject the row
- Repeat for the next outer row
4.1 Classic: managers vs non‑managers
Non‑correlated version (manager list first):
-- Managers
SELECT last_name
FROM employees
WHERE employee_id IN (
SELECT DISTINCT manager_id
FROM employees
WHERE manager_id IS NOT NULL
);
Correlated + EXISTS version:
-- Managers
SELECT e.last_name
FROM employees e
WHERE EXISTS (
SELECT 1
FROM employees m
WHERE m.manager_id = e.employee_id
);
Here:
- Outer row = candidate employee
e - Inner query asks, “does anyone have
e.employee_idas theirmanager_id?” - If yes → row qualifies.
Switch to NOT EXISTS for non‑managers:
SELECT e.last_name
FROM employees e
WHERE NOT EXISTS (
SELECT 1
FROM employees m
WHERE m.manager_id = e.employee_id
);
Unlike NOT IN, NOT EXISTS doesn’t lose its mind because of a NULL in the subquery.
4.2 “More than the average salary for their department”
SELECT e.employee_id,
e.last_name,
e.department_id,
e.salary
FROM employees e
WHERE e.salary >
(SELECT AVG(salary)
FROM employees i
WHERE i.department_id = e.department_id);
Per outer row:
- Grab that employee’s
department_id - Compute the average salary for that department in the inner query
- Return the row if
salaryexceeds that average
This pattern generalises to “top earners per group”, “rows above group average”, and basically any “compare to group metric” logic without a self‑join.
5. EXISTS and NOT EXISTS: Boolean Subqueries
EXISTS doesn’t care what the subquery returns, only whether it returns at least one row.
Basic form:
SELECT ...
FROM outer_table o
WHERE EXISTS (
SELECT 1
FROM inner_table i
WHERE i.some_col = o.some_col
);
Notes:
SELECT 1could beSELECT 'x'orSELECT NULL– it’s ignored.EXISTSstops at the first match, so it can be more efficient thanINon complex subqueries.
Example from the lesson:
EXISTS→ employees who are managersNOT EXISTS→ employees who are not managers
And crucially, NOT EXISTS behaves even if the inner query returns NULLs. NOT IN with a NULL in the subquery, on the other hand, is where logic goes to die.
6. The WITH Clause: CTEs So Your Query Doesn’t Look Like Fan Fiction
And look, once your query gets past about 20 lines, you either:
- Break it up with
WITH(common table expressions), or - End up with a single
SELECTthat no one will ever maintain again.
6.1 Non‑recursive WITH
Use WITH to name subqueries you want to reuse or just stop from visually melting your brain.
WITH avg_sal_tab AS (
SELECT AVG(salary) AS avg_sal
FROM employees
)
SELECT last_name,
salary,
(SELECT avg_sal FROM avg_sal_tab) AS avg_sal
FROM employees
WHERE salary >
(SELECT avg_sal FROM avg_sal_tab);
What actually happens:
avg_sal_tabis materialised in your session’s temp area (or cache).- Main query joins / references it like a tiny, temporary table.
- Oracle can reuse that result instead of recomputing the average multiple times.
6.2 CTE with grouping per department
From the example:
WITH cnt_dept AS (
SELECT department_id,
COUNT(*) AS dept_count
FROM employees
GROUP BY department_id
)
SELECT e.employee_id,
e.salary / d.dept_count AS salary_per_head
FROM employees e
JOIN cnt_dept d
ON e.department_id = d.department_id;
For each employee, you divide their salary by the number of employees in their department (because nothing says “motivation” like seeing your salary diluted by your coworkers).
6.3 Recursive WITH
Recursive CTEs let you walk hierarchies or paths without writing procedural loops.
Structure:
- Anchor query: base rows
UNION ALL- Recursive query: reuses the CTE name to build further levels
Flight‑time example (simplified):
WITH reachable_from (source, destination, total_time) AS (
-- Anchor: direct flights
SELECT source, destination, flight_time
FROM flights
UNION ALL
-- Recursive: extend routes
SELECT rf.source,
f.destination,
rf.total_time + f.flight_time
FROM reachable_from rf
JOIN flights f
ON rf.destination = f.source
)
SELECT *
FROM reachable_from;
The result includes:
- Direct paths from
flights - Multi‑hop paths with accumulated
total_time
This lets you answer questions like “how can I get from San Jose to Boston, and how long will it take, assuming planes connect and no one drops my luggage in Denver?”
What You Should Be Able To Do Now
By the end of this lesson, you should be able to:
- Write multiple‑column subqueries, and know when to use pairwise vs non‑pairwise comparisons
- Drop scalar subqueries into
SELECT,CASE,UPDATE, and enjoy the power - Use correlated subqueries to compare each row to group metrics or related data
- Reach for
EXISTS/NOT EXISTSinstead ofIN/NOT INwhen dealing with complex orNULL‑laden subqueries - Use the
WITHclause – including recursiveWITH– to structure and speed up ugly queries
In other words, you can now write the sort of queries that cause less experienced developers to say “wait, that’s legal?” – and yes, yes it is. Just don’t forget to format them.