In SQL, itās common to calculate aggregates such as counts, sums, or averages grouped by some key. But sometimes you also want to enrich those grouped results with statistics calculated across all rows in the result setāfor example, adding the overall average as a reference point.
Letās look at a practical example.
Suppose you have two tables:
orders
order_items
(storing the items linked to each order)
You want to count how many items each order has:
select
order_id,
count(*) as item_count
from
order_items
join orders on order_items.order_id = orders.id
group by
order_id
order by
item_count desc;
This query returns the number of items per order, sorted from highest to lowest.
Now letās say you want to add the average item count across all orders as an extra column. Every row should show the same average, making it easy to compare each orderās count against the global mean.
You might think you need a subquery or CTE, but thereās a much cleaner way: window functions.
select
order_id,
count(*) as item_count,
avg(count(*)) over () as avg_item_count
from
order_items
join orders on order_items.order_id = orders.id
group by
order_id
order by
item_count desc;
This is how it works:
count(*)
gives the number of items per order.avg(count(*)) over ()
applies the aggregateavg()
over the entire result set.over ()
with empty parentheses means "no partitioning, no ordering"ājust compute the average across all grouped rows.
The result might look like this:
order_id | item_count | avg_item_count |
---|---|---|
42 | 15 | 7.3 |
37 | 12 | 7.3 |
91 | 10 | 7.3 |
⦠| ⦠| 7.3 |
Every row now includes the global average, making it straightforward to see whether an order is above or below it.
Sometimes the global average isnāt enoughāyou might also want to see how an order compares within its own customerās orders. For that, you can use partitioned window functions.
Letās assume orders
has a customer_id
column. We can adjust the query like this:
select
o.customer_id,
oi.order_id,
count(*) as item_count,
avg(count(*)) over () as global_avg_item_count,
avg(count(*)) over (partition by o.customer_id) as customer_avg_item_count
from
order_items oi
join orders o on oi.order_id = o.id
group by
o.customer_id,
oi.order_id
order by
item_count desc;
Here's what happens:
avg(count(*)) over ()
ā global average across all orders.avg(count(*)) over (partition by o.customer_id)
ā average item count within each customerās orders only.
Now the result contains both reference points:
customer_id | order_id | item_count | global_avg_item_count | customer_avg_item_count |
---|---|---|---|---|
101 | 42 | 15 | 7.3 | 12.5 |
101 | 37 | 10 | 7.3 | 12.5 |
202 | 91 | 9 | 7.3 | 7.0 |
This makes it much easier to compare orders both globally and within each customerās context.
What you should remember:
- Use
aggregate(...) over ()
to add global statistics. - Use
aggregate(...) over (partition by some_column)
to add per-group statistics. - Window functions let you enrich grouped results without subqueries or joins, keeping SQL clean and expressive.
If this post was enjoyable or useful for you, please share it! If you have comments, questions, or feedback, you can email my personal email. To get new posts, subscribe use the RSS feed.