- WHERE lets you limit the rows returned.
- GROUP BY lets you perform aggregations (like count, sum, average, minimum, and maximum).
- HAVING lets you limit the rows returned, based on aggregated values.
But that doesn’t really sink in well, so we’ll walk through some examples. We have a simple table (“Livestock”) that tracks livestock on different farms:
Farm | AnimalType | AnimalName | AnimalWeightLbs |
The Rocking J | Sheep | Andrew | 165 |
The Rocking J | Sheep | Bailey | 180 |
The Rocking J | Sheep | Caroline | 122 |
The Rocking J | Cow | Huberta | 1620 |
Lazy River | Sheep | Murray | 194 |
Lazy River | Cow | Zeke | 1704 |
Lazy River | Goat | Bill | 52 |
We can get different things out of it by using WHERE, GROUP BY, and/or HAVING.
IMPORTANT: You can play along at home with this script; it has the table definition, data, and all of the examples below (plus a few). For that matter, if you can’t stand long blog posts, just grab the script and go. I don’t mind.
WHERE
To limit the data I pull back – i.e., to limit what rows I get back – I use WHERE. Maybe I only want data from The Rocking J. I can do that with WHERE:
SELECT Farm, AnimalType, AnimalName, AnimalWeightLbs FROM dbo.Livestock WHERE Farm = 'The Rocking J';
Result:
Farm | AnimalType | AnimalName | AnimalWeightLbs |
The Rocking J | Sheep | Andrew | 165 |
The Rocking J | Sheep | Bailey | 180 |
The Rocking J | Sheep | Caroline | 122 |
The Rocking J | Cow | Huberta | 1620 |
If I only want to see sheep, from all farms:
SELECT Farm, AnimalType, AnimalName, AnimalWeightLbs FROM dbo.Livestock WHERE AnimalType = 'Sheep';
Result:
Farm | AnimalType | AnimalName | AnimalWeightLbs |
The Rocking J | Sheep | Andrew | 165 |
The Rocking J | Sheep | Bailey | 180 |
The Rocking J | Sheep | Caroline | 122 |
Lazy River | Sheep | Murray | 194 |
And if I only want to see data on sheep from The Rocking J:
SELECT Farm, AnimalType, AnimalName, AnimalWeightLbs FROM dbo.Livestock WHERE Farm = 'The Rocking J' AND AnimalType = 'Sheep';
Result:
Farm | AnimalType | AnimalName | AnimalWeightLbs |
The Rocking J | Sheep | Andrew | 165 |
The Rocking J | Sheep | Bailey | 180 |
The Rocking J | Sheep | Caroline | 122 |
Conclusion: WHERE lets you limit the rows returned.
GROUP BY
Group by is very different. It lets you divide your data up into groups. We can group our data and not bother to do any aggregations:
SELECT Farm, AnimalType FROM dbo.Livestock GROUP BY Farm, AnimalType;
All this does for us is to get a distinct list of Farms, and the animals that each farm happens to have:
Farm | AnimalType |
The Rocking J | Sheep |
The Rocking J | Cow |
Lazy River | Sheep |
Lazy River | Cow |
Lazy River | Goat |
(There are other ways to get this, too, but that’s not what we’re talking about right now. And this is a perfectly reasonable use for GROUP BY.)
But, what if we want a COUNT of each kind of animal at each farm? Same thing, but add in the built-in function COUNT():
SELECT Farm, AnimalType, COUNT(*) AS AnimalCount FROM dbo.Livestock GROUP BY Farm, AnimalType;
Result:
Farm | AnimalType | AnimalCount |
The Rocking J | Sheep | 3 |
The Rocking J | Cow | 1 |
Lazy River | Sheep | 1 |
Lazy River | Cow | 1 |
Lazy River | Goat | 1 |
There! We were able to tell SQL, “I want to make groups, defined by the farm and animal type. And then I want a count of each group’s members.”
We could do any aggregation, or any bunch of aggregations, on this group. In addition to the count, let’s get the average weight, maximum weight, and minimum weight of the animals in each group:
SELECT Farm, AnimalType, COUNT(*) AS AnimalCount , AVG(AnimalWeightLbs) AS AvgWeightLbs , MIN(AnimalWeightLbs) AS MinWeightLbs , MAX(AnimalWeightLbs) AS MaxWeightLbs FROM dbo.Livestock GROUP BY Farm, AnimalType;
Results:
Farm | AnimalType | AnimalCount | AvgWeightLbs | MinWeightLbs | MaxWeightLbs |
The Rocking J | Sheep | 3 | 155 | 122 | 180 |
The Rocking J | Cow | 1 | 1620 | 1620 | 1620 |
Lazy River | Sheep | 1 | 194 | 194 | 194 |
Lazy River | Cow | 1 | 1704 | 1704 | 1704 |
Lazy River | Goat | 1 | 52 | 52 | 52 |
Side note: GROUP BY and WHERE
A quick example:
SELECT Farm, AnimalType, COUNT(*) AS AnimalCount , AVG(AnimalWeightLbs) AS AvgWeightLbs , MIN(AnimalWeightLbs) AS MinWeightLbs , MAX(AnimalWeightLbs) AS MaxWeightLbs FROM dbo.Livestock WHERE Farm = 'The Rocking J' GROUP BY Farm, AnimalType;
Don’t get freaked out by GROUP BY and WHERE. Take it step by step: if you had to do this by hand, the simplest thing is to narrow down your data first (WHERE Farm = The Rocking J), and THEN divide the remaining data into groups (GROUP BY Farm, AnimalType).
The resultset from our query above:
Farm | AnimalType | AnimalCount | AvgWeightLbs | MinWeightLbs | MaxWeightLbs |
The Rocking J | Cow | 1 | 1620 | 1620 | 1620 |
The Rocking J | Sheep | 3 | 155 | 122 | 180 |
HAVING
“Having” works on aggregations.
A query to pull back data on animals that weigh less than 200# would use a WHERE clause: “WHERE AnimalWeightLbs < 200”.
But if I want to pull back data on groups with an average weight less than 200#, I can’t use WHERE, because WHERE won’t work on aggregates. I have to use GROUP BY and HAVING:
SELECT Farm, AnimalType, COUNT(*) AS AnimalCount , AVG(AnimalWeightLbs) AS AvgWeightLbs FROM dbo.Livestock GROUP BY Farm, AnimalType HAVING AVG(AnimalWeightLbs) < 200;
Result:
Farm | AnimalType | AnimalCount | AvgWeightLbs |
The Rocking J | Sheep | 3 | 155 |
Lazy River | Sheep | 1 | 194 |
Lazy River | Goat | 1 | 52 |
Bonus and Conclusion
There are other things to know and play with, of course. (For example, you can GROUP BY and perform HAVING on columns that aren’t actually in your SELECT list.) But that’s info for another time.
So, like it says at the top:
- WHERE lets you limit the rows returned.
- GROUP BY lets you perform aggregations (like count, sum, average, minimum, and maximum).
- HAVING lets you limit the rows returned, based on aggregated values.
I have a blog that expands on these topics – SELECT, Deconstructed – but we really need a targeted introduction here.
Remember to download the script with all the examples, and play with the code yourself. That’s the best way to learn!
Happy days,
Jen McCown
www.MidnightDBA.com/Jen