
Session: T-SQL’s Hidden Support Feature

Today I presented one of my favorite sessions – T-SQL’s Hidden Support Feature – for the DBA Fundamentals group! They’ll put up the recording shortly, but in the meantime I thought I’d post the slide deck and header template:

Download Goodies

Also, here are the resources I point to at the end of the session:

Oh heck, here’s the session abstract, too:

The most effective T-SQL support feature comes installed with every edition of SQL Server, is enabled by default, and costs no overhead. Yet, the vast majority of database administrators underutilize or completely neglect it. That feature’s name is “comments”.

In this session, Microsoft Certified Master Jennifer McCown will demonstrate the various commenting methods that make code supportable. Attendees will learn what’s important in a header comment, use code blocking to edit code, build a comprehensive help system, and explore alternative comment methods in stored procedures, SSIS packages, SSRS reports, and beyond. These methods help prevent errors and reduce troubleshooting.

Thanks for having me, DBA Fundamentals!

Compare column names for case sensitivity

I’m reviewing the code for the upcoming Minion CheckDB, and one of the things we’re checking for is case consistency in column names. For example, if Table1 has a column named Col1, and Table2 has COL1, well that’s no good.

But, how do we easily find those mismatches on a system that’s not case sensitive? Easy: collations.

This query compares all columns with the same name (speaking case insensitively) for case equality:

SELECT  OBJECT_NAME(c1.object_id) AS TableName1
      , OBJECT_NAME(c.object_id) AS TableName2
      , c1.name AS ColName1
      , c.name AS ColName2
FROM    sys.columns AS c1
INNER JOIN sys.columns AS c ON c1.object_id > c.object_id
WHERE   UPPER(c1.name) = UPPER(c.name)
        AND c1.name COLLATE Latin1_General_CS_AS <> c.name COLLATE Latin1_General_CS_AS
ORDER BY ColName1
      , TableName1
      , ColName2;

Notice that we’re joining on c1’s object_id GREATER THAN c’s object_id. If we did <> (not equals), then we’d double our results (we’d see T1 C1 | T2 c1, and another row for T2 c1 | T1 C1).

We also have, in the where clause, UPPER(c1.name) = UPPER(c.name). We want column names that match, except for case.

And the “except for case” part comes from that last AND, which compares the two names under a case sensitive collation: Latin1_General_CS_AS.
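To see what the collation does on its own, here is a minimal sketch that compares two literal strings both ways. The column aliases are purely illustrative, and Latin1_General_CI_AS is assumed here as the case insensitive counterpart to the collation used in the query:

```sql
-- Same two strings, compared under a case insensitive
-- and then a case sensitive collation.
SELECT CASE WHEN 'Col1' = 'COL1' COLLATE Latin1_General_CI_AS
            THEN 'match' ELSE 'no match' END AS CaseInsensitiveCompare
     , CASE WHEN 'Col1' = 'COL1' COLLATE Latin1_General_CS_AS
            THEN 'match' ELSE 'no match' END AS CaseSensitiveCompare;
-- CaseInsensitiveCompare comes back as 'match';
-- CaseSensitiveCompare comes back as 'no match'.
```

The explicit COLLATE on one side of the comparison is what decides how both strings are compared, which is why the query can find case mismatches even on a case insensitive system.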

Easy. Done. Off you go.

-Jen

Step by Step: WHERE, GROUP BY, and HAVING

People new to SQL often have trouble with the difference between WHERE, GROUP BY, and HAVING…three separate clauses in the SELECT statement. The summary of each of these is simple to tell:

  • WHERE lets you limit the rows returned.
  • GROUP BY lets you perform aggregations (like count, sum, average, minimum, and maximum).
  • HAVING lets you limit the rows returned, based on aggregated values.

But that doesn’t really sink in well, so we’ll walk through some examples.  We have a simple table (“Livestock”) that tracks livestock on different farms:

Farm AnimalType AnimalName AnimalWeightLbs
The Rocking J Sheep Andrew 165
The Rocking J Sheep Bailey 180
The Rocking J Sheep Caroline 122
The Rocking J Cow Huberta 1620
Lazy River Sheep Murray 194
Lazy River Cow Zeke 1704
Lazy River Goat Bill 52

We can get different things out of it by using WHERE, GROUP BY, and/or HAVING.

IMPORTANT: You can play along at home with this script; it has the table definition, data, and all of the examples below (plus a few). For that matter, if you can’t stand long blog posts, just grab the script and go. I don’t mind.
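If the script isn't within reach, a minimal stand-in for the table can be sketched from the data shown above. The column data types are assumptions (the original script may define them differently); INT is chosen for the weight so that AVG returns whole numbers like the results in this post.

```sql
-- Assumed table definition: column names come from the post,
-- data types and lengths are guesses for illustration only.
CREATE TABLE dbo.Livestock
(
    Farm            VARCHAR(50) NOT NULL,
    AnimalType      VARCHAR(20) NOT NULL,
    AnimalName      VARCHAR(50) NOT NULL,
    AnimalWeightLbs INT         NOT NULL
);

-- The seven rows from the table shown in the post.
INSERT INTO dbo.Livestock (Farm, AnimalType, AnimalName, AnimalWeightLbs)
VALUES  ('The Rocking J', 'Sheep', 'Andrew',   165)
      , ('The Rocking J', 'Sheep', 'Bailey',   180)
      , ('The Rocking J', 'Sheep', 'Caroline', 122)
      , ('The Rocking J', 'Cow',   'Huberta', 1620)
      , ('Lazy River',    'Sheep', 'Murray',   194)
      , ('Lazy River',    'Cow',   'Zeke',    1704)
      , ('Lazy River',    'Goat',  'Bill',      52);
```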

WHERE

To limit the data I pull back – i.e., to limit what rows I get back – I use WHERE.  Maybe I only want data from The Rocking J. I can do that with WHERE:

SELECT  Farm, AnimalType, AnimalName, AnimalWeightLbs
FROM    dbo.Livestock
WHERE   Farm = 'The Rocking J';

Result:

Farm AnimalType AnimalName AnimalWeightLbs
The Rocking J Sheep Andrew 165
The Rocking J Sheep Bailey 180
The Rocking J Sheep Caroline 122
The Rocking J Cow Huberta 1620

If I only want to see sheep, from all farms:

SELECT  Farm, AnimalType, AnimalName, AnimalWeightLbs
FROM    dbo.Livestock
WHERE   AnimalType = 'Sheep';

Result:

Farm AnimalType AnimalName AnimalWeightLbs
The Rocking J Sheep Andrew 165
The Rocking J Sheep Bailey 180
The Rocking J Sheep Caroline 122
Lazy River Sheep Murray 194

And if I only want to see data on sheep from The Rocking J:

SELECT  Farm, AnimalType, AnimalName, AnimalWeightLbs
FROM    dbo.Livestock
WHERE   Farm = 'The Rocking J'
        AND AnimalType = 'Sheep';

Result:

Farm AnimalType AnimalName AnimalWeightLbs
The Rocking J Sheep Andrew 165
The Rocking J Sheep Bailey 180
The Rocking J Sheep Caroline 122

 

Conclusion: WHERE lets you limit the rows returned.

GROUP BY

Group by is very different. It lets you divide your data up into groups. We can group our data and not bother to do any aggregations:

SELECT  Farm, AnimalType
FROM    dbo.Livestock
GROUP BY Farm, AnimalType;

All this does for us is to get a distinct list of Farms, and the animals that each farm happens to have:

Farm AnimalType
The Rocking J Sheep
The Rocking J Cow
Lazy River Sheep
Lazy River Cow
Lazy River Goat

(There are other ways to get this, too, but that’s not what we’re talking about right now. And this is a perfectly reasonable use for GROUP BY.)
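One of those other ways, sketched here against the same dbo.Livestock table, is SELECT DISTINCT. It returns the same five Farm and AnimalType combinations:

```sql
-- DISTINCT removes duplicate Farm/AnimalType pairs,
-- giving the same list as the GROUP BY query above.
SELECT  DISTINCT Farm, AnimalType
FROM    dbo.Livestock;
```

The difference shows up once you want aggregates: DISTINCT only removes duplicates, while GROUP BY gives you groups to count, sum, or average.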


 

But, what if we want a COUNT of each kind of animal at each farm? Same thing, but add in the built-in function COUNT():

SELECT  Farm, AnimalType, COUNT(*) AS AnimalCount
FROM    dbo.Livestock
GROUP BY Farm, AnimalType;

Result:

Farm AnimalType AnimalCount
The Rocking J Sheep 3
The Rocking J Cow 1
Lazy River Sheep 1
Lazy River Cow 1
Lazy River Goat 1

 

There! We were able to tell SQL, “I want to make groups, defined by the farm and animal type. And then I want a count of each group’s members.”


 

We could do any aggregation, or any bunch of aggregations, on this group. In addition to the count, let’s get the average weight, maximum weight, and minimum weight of the animals in each group:

SELECT  Farm, AnimalType, COUNT(*) AS AnimalCount
                , AVG(AnimalWeightLbs) AS AvgWeightLbs
                , MIN(AnimalWeightLbs) AS MinWeightLbs
                , MAX(AnimalWeightLbs) AS MaxWeightLbs
FROM    dbo.Livestock
GROUP BY Farm, AnimalType;

Results:

Farm AnimalType AnimalCount AvgWeightLbs MinWeightLbs MaxWeightLbs
The Rocking J Sheep 3 155 122 180
The Rocking J Cow 1 1620 1620 1620
Lazy River Sheep 1 194 194 194
Lazy River Cow 1 1704 1704 1704
Lazy River Goat 1 52 52 52
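The groups are defined entirely by the columns in the GROUP BY. As a sketch of that, here is the same kind of query grouped by Farm alone, using SUM (which the summary mentions but the examples so far haven't used). The alias TotalWeightLbs is made up for this illustration:

```sql
-- One group per farm instead of one per farm and animal type.
SELECT  Farm, COUNT(*) AS AnimalCount
                , SUM(AnimalWeightLbs) AS TotalWeightLbs
FROM    dbo.Livestock
GROUP BY Farm;
-- The Rocking J: 4 animals, 2087 lbs total
-- Lazy River:    3 animals, 1950 lbs total
```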

Side note: GROUP BY and WHERE

A quick example:

SELECT  Farm, AnimalType, COUNT(*) AS AnimalCount
                , AVG(AnimalWeightLbs) AS AvgWeightLbs
                , MIN(AnimalWeightLbs) AS MinWeightLbs
                , MAX(AnimalWeightLbs) AS MaxWeightLbs
FROM    dbo.Livestock
WHERE Farm = 'The Rocking J'
GROUP BY Farm, AnimalType;

Don’t get freaked out by GROUP BY and WHERE. Take it step by step: if you had to do this by hand, the simplest thing is to narrow down your data first (WHERE Farm = 'The Rocking J'), and THEN divide the remaining data into groups (GROUP BY Farm, AnimalType).

The resultset from our query above:

Farm AnimalType AnimalCount AvgWeightLbs MinWeightLbs MaxWeightLbs
The Rocking J Cow 1 1620 1620 1620
The Rocking J Sheep 3 155 122 180
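If it helps to see the two steps pulled apart, one way to sketch it is with a common table expression that does the narrowing first, followed by a query that groups what's left. The name RockingJOnly is invented for this illustration; the result should be the same two groups as the single query above:

```sql
-- Step 1: narrow down the data (the WHERE part).
WITH RockingJOnly AS
(
    SELECT  Farm, AnimalType, AnimalWeightLbs
    FROM    dbo.Livestock
    WHERE   Farm = 'The Rocking J'
)
-- Step 2: divide the remaining rows into groups and aggregate.
SELECT  Farm, AnimalType, COUNT(*) AS AnimalCount
                , AVG(AnimalWeightLbs) AS AvgWeightLbs
                , MIN(AnimalWeightLbs) AS MinWeightLbs
                , MAX(AnimalWeightLbs) AS MaxWeightLbs
FROM    RockingJOnly
GROUP BY Farm, AnimalType;
```

This is only a way of reading the query; there's no need to write it this way in practice.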

HAVING

“Having” works on aggregations.

A query to pull back data on animals that weigh less than 200 lbs would use a WHERE clause: “WHERE AnimalWeightLbs < 200”.
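Written out in full against the Livestock table, that row-level filter would look something like this:

```sql
-- Filters individual rows: each animal is checked on its own weight.
SELECT  Farm, AnimalType, AnimalName, AnimalWeightLbs
FROM    dbo.Livestock
WHERE   AnimalWeightLbs < 200;
-- Returns five rows: every animal except the two cows.
```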

But if I want to pull back data on groups with an average weight less than 200 lbs, I can’t use WHERE, because WHERE won’t work on aggregates. I have to use GROUP BY and HAVING:

SELECT  Farm, AnimalType, COUNT(*) AS AnimalCount
                , AVG(AnimalWeightLbs) AS AvgWeightLbs
FROM    dbo.Livestock
GROUP BY Farm, AnimalType
HAVING AVG(AnimalWeightLbs) < 200;

Result:

Farm AnimalType AnimalCount AvgWeightLbs
The Rocking J Sheep 3 155
Lazy River Sheep 1 194
Lazy River Goat 1 52
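All three clauses can appear in one query. As a sketch (the particular filter values are picked just for illustration): WHERE removes the cows row by row, GROUP BY makes one group per farm from what's left, and HAVING keeps only the farms whose remaining animals average under 150 lbs.

```sql
SELECT  Farm, COUNT(*) AS AnimalCount
                , AVG(AnimalWeightLbs) AS AvgWeightLbs
FROM    dbo.Livestock
WHERE   AnimalType <> 'Cow'          -- row filter, applied first
GROUP BY Farm                        -- then one group per farm
HAVING  AVG(AnimalWeightLbs) < 150;  -- group filter, applied last
-- Returns one row: Lazy River, 2 animals, average 123.
```

The Rocking J drops out because its three sheep average more than 150 lbs.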

Bonus and Conclusion

There are other things to know and play with, of course. (For example, you can GROUP BY and perform HAVING on columns that aren’t actually in your SELECT list.) But that’s info for another time.
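A quick sketch of what that parenthetical means, again against dbo.Livestock. Neither query is from the original script; they're just small illustrations:

```sql
-- Farm and AnimalType define the groups, but only the count is returned.
SELECT  COUNT(*) AS AnimalCount
FROM    dbo.Livestock
GROUP BY Farm, AnimalType;

-- MAX(AnimalWeightLbs) filters the groups without appearing in the SELECT list.
SELECT  Farm
FROM    dbo.Livestock
GROUP BY Farm
HAVING  MAX(AnimalWeightLbs) > 1700;
-- Returns Lazy River (its heaviest animal weighs 1704 lbs).
```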

So, like it says at the top:

  • WHERE lets you limit the rows returned.
  • GROUP BY lets you perform aggregations (like count, sum, average, minimum, and maximum).
  • HAVING lets you limit the rows returned, based on aggregated values.

I have a blog post that expands on these topics – SELECT, Deconstructed – but we really needed a targeted introduction here.

Remember to download the script with all the examples, and play with the code yourself. That’s the best way to learn!

Happy days,
Jen McCown
www.MidnightDBA.com/Jen

Practice!