Here's the accompanying video – same material, different format.
A solution is ever so much more elegant if you know the path that led to it. Which is another way of saying, you only know how totally smart I am if you see how much work I had to do 🙂
I had to do a lot of work today – we have a stored procedure that's being called a few times a second, and maxing out 8 processors on our beefy test server during a moderate load test. So I dug in and had some fun. When it came to the final fine tuning, I was looking at a query full of date maninpulation. That sort of thing is performance death, so I stepped through it. We started with something like this, a query that allows the user to get a SUM by hour, day, week, or month. Here we'll just show week and month for berevity:
SELECT SUM(o.Sale) AS Sales
FROM Orders o
WHERE o.BizID = @BizID
AND (
— Interval: Week
( @Interval = 'wk'
AND DATEPART(wk, o.OrderDate) = DATEPART(wk, GetDate())
AND DATEPART(yyyy, o.OrderDate) = DATEPART(yyyy, GetDate())
)
— Interval: Month
OR ( @Interval = 'mm'
AND DATEPART(mm, o.OrderDate) = DATEPART(mm, GetDate())
AND DATEPART(yyyy, o.OrderDate) = DATEPART(yyyy, GetDate())
)
)
GROUP BY O.BizID
Principle #1: Free the Functions
Right now we're looking disdainfully at all those DATEPARTs and GetDates(). Principle #1: Get functions out of the WHERE clause, if possible. That's easy enough, we can declare @wk @mm and @yyyy variables and set them before our query. Let's also change this from OR to IF (for more on that, see my previous blog IF is better than OR), :
SELECT @wk = DATEPART(wk, GetDate()) , @mm = DATEPART(mm, GetDate()) , @yyyy = DATEPART(yyyy, GetDate())
— Interval: Week
IF @Interval = 'wk'
SELECT SUM(o.Sale) AS Sales
FROM Orders o
WHERE o.BizID = @BizID
AND DATEPART(wk, o.OrderDate) = @wk
AND DATEPART(yyyy, o.OrderDate) = @yyyy
GROUP BY O.BizID
— Interval: Month
ELSE IF @Interval = 'mm'
SELECT SUM(o.Sale) AS Sales
FROM Orders o
WHERE o.BizID = @BizID
AND DATEPART(mm, o.OrderDate) = @mm
AND DATEPART(yyyy, o.OrderDate) = @yyyy
GROUP BY O.BizID
Principle #2: Free the Optimizer
That'll help a lot, but we're not through yet. It occurrs to me that this still isn't an elegant solution. The real brainstorm comes: If we compare date part to date part, we'll never get away from DATEPART. What's better than that? How about comparing the whole date to another whole date, allowing the optimizer to use that column's index? Booya! Principle #2, let the optimizer do its job (and that includes making indexes availble). Use @DateStart and @DateEnd instead…see:
SET @date = CONVERT(VARCHAR(10), GETDATE(), 101) — This gives us a clean date with time set at midnight, 00:00:00
— Interval: Week
IF @Interval = 'wk'
BEGIN
SELECT @dateStart = DATEADD(dd,-1 * ( DATEPART(dw,GETDATE()) ) + 1,@date)
SELECT @dateEnd = DATEADD(dd, 7, @dateStart)
SELECT SUM(o.Sale) AS Sales
FROM Orders o
WHERE o.BizID = @BizID
AND o.OrderDate BETWEEN @dateStart AND @DateEnd
GROUP BY O.BizID
END
— Interval: Month
ELSE IF @Interval = 'mm'
BEGIN
SELECT @dateStart = DATEADD(dd,-1 * ( DATEPART(dd,@date) ) + 1,@date)
SELECT @dateEnd = DATEADD(m, 1, @dateStart)
SELECT SUM(o.Sale) AS Sales
FROM Orders o
WHERE o.BizID = @BizID
AND o.OrderDate BETWEEN @dateStart AND @DateEnd
GROUP BY O.BizID
END
Principle #3: Happify Your Code
We'll go through all that date math here in a minute, but first things first. Notice we've gotten all the funky functions and date manipulation out of our WHERE clause, where it'd mess up our friend the SQL optimizer, and placed it where it belongs: as a one-shot above the query itself. Oh and hey, see how the two SELECT statements are identical? Let's pull those out of the IF/ELSE and make the code more compact, readable, and easy to manage (oh hai, principle #3!):
SET @date = CONVERT(VARCHAR(10), GETDATE(), 101) — This gives us a clean date with time set at midnight, 00:00:00
— Interval: Week
IF @Interval = 'wk'
BEGIN
SELECT @dateStart = DATEADD(dd,-1 * ( DATEPART(dw,GETDATE()) ) + 1,@date)
SELECT @dateEnd = DATEADD(dd, 7, @dateStart)
END
— Interval: Month
ELSE IF @Interval = 'mm'
BEGIN
SELECT @dateStart = DATEADD(dd,-1 * ( DATEPART(dd,@date) ) + 1,@date)
SELECT @dateEnd = DATEADD(mm, 1, @dateStart)
END
SELECT SUM(o.Sale) AS Sales
FROM Orders o
WHERE o.BizID = @BizID
AND o.OrderDate BETWEEN @dateStart AND @DateEnd
GROUP BY O.BizID
Now, let's sit back and have a plate of #bacon. Oh yeah, and let's go over that date math.
Date Math
We were originally just thinking of matching month to month and year to year, for our first example. But that's not set based logic. To compare a date to a date, we needed a range to compare the column value to. In our example we're always going for the current date range (e.g., THIS month, THIS week, etc). So start with a clean, no-time @date:
SET @date = CONVERT(VARCHAR(10), GETDATE(), 101) — This gives us a clean date with time set at midnight, 00:00:00
Today that gives us 11/16/2009. Pseudocode: Use that to get the first of the month: Today's date, minus number of days (16), plus one = @date – datepart(dd,@date) + 1 = @date + (-1 * datepart(dd,@date)) + 1 =
SELECT @dateStart = DATEADD(dd,-1 * ( DATEPART(dd,@date) ) + 1,@date)
And DateEnd is just midnight on the first day of next month, or DATEADD(m,1,@dateStart). You can use this same kind of logic for hour:
SELECT @dateStart = DATEADD(hh, DATEPART(hh, GETDATE()),@date) — Clean date at midnight, plus the current hour
SELECT @dateEnd = DATEADD(hh, 1, @dateStart) — Start date plus 1 hour
And day is even simpler:
SELECT @dateStart = @date
SELECT @dateEnd = DATEADD(dd, 1, @dateStart)
For me, week was the tricky part, as we didn't wind up using the "week" date part at all. Pseudocode: start with the day of the week (Monday is 2, for today's example), and subtract that from today's date to get 0, add 1 to get Sunday of this week = today – day of week(today) + 1 =
SELECT @dateStart = DATEADD(dd,-1 * ( DATEPART(dw,@date) ) + 1,@date)
SELECT @dateEnd = DATEADD(dd, 7, @dateStart)
Nice, huh?
Review
Principles:
-
Principle #1: Get functions out of the WHERE clause, if possible. They're performance death.
-
Principle #2: Let the optimizer do its job. Make sure indexes are available, and the query can use them.
-
Principle #3: If all else remains equal, use fewer lines of code….make code more compact, readable, and easy to manage.
The major 2 things we did for this query were to get rid of the ORs (see IF is better than OR), and change our thinking from comparing date parts, to comparing the whole OrderDate to a range of dates.
Eat some #bacon. The end.
-Jen McCown, http://www.MidnightDBA.com