In last week's "Correlating Data in SQL Trace, Part 1", we took a trace to diagnose lock timeouts, and developed a query to correlate lines in the trace: this let us see what code was generating what lock timeouts.
The trick for the query is to use a correlated subquery to find the statement rownumber, thusly (simplified):
SELECT [Timeout].* , Statement.* FROM Trace AS [Timeout] INNER JOIN Trace AS [Statement] ON Statement.spid = [Timeout].spid AND Statement.DatabaseID = [Timeout].DatabaseID AND statement.rownumber = ( SELECT TOP 1 rownumber FROM trace WHERE spid = [Timeout].spid AND rownumber < [Timeout].rownumber AND eventclass = 11 ORDER BY rownumber DESC ) WHERE [Timeout].EventClass = 189A little translation: "WHERE EventClass = 189" gives us all our timeouts. "statement.rownumber = (subquery)" finds the closest (to the timeout) previous RPC:Starting row with the same spid…in other words, the timeout's associated RPC call.
Another use for this same type of correlation query is to find all "RPC starting" with no associated "RPC:complete" in a given timeframe. Okay, it's a pretty specific need, but it certainly illustrates that this can be a very versatile query.
Instead of correlating timeouts to queries, we want to correlate RPC start with RPC complete, and then display only those RPC:Starting rows with no associated RPC:Complete.
To achieve this, we have to make some changes to our query.
-
For one thing, we need a left outer join; this lets us return rows with NULL matches on the right side (meaning, no RPC:Complete)
-
In the rownumber=(subquery), we change the less than to greater than…before, we found timeouts and then went back in time to find their origin. Here, we're finding RPC:Starting and going forward in time to find RPC:Complete.
-
We change the subquery's ORDER BY to ASC for the same reason.
-
And of course, change the subquery eventclass = 10, as we're looking to return RPC:Completed.
-
Introducing an outer join to the query adds a new element: the possibility of matching up rows that don't actually go together. SPIDs get reused…if we keep our current criteria – an RPC:Completed with the same SPID and DBID with a greater rownumber – we're very likely to get false matches. The way to fix this is to match start times in the subquery.
So our final query looks like this:
SELECT TOP 1 [RPCStart].rownumber , [RPCStart].objectname , [RPCStart].textdata, [RPCStart].loginname , [RPCStart].spid , [RPCStart].starttime , [RPCStart].endtime , 'COMPLETE: ' AS spacer, [RPCComplete].duration , [RPCComplete].rownumber , [RPCComplete].objectname , [RPCComplete].textdata, [RPCComplete].loginname , [RPCComplete].spid , [RPCComplete].starttime , [RPCComplete].endtime FROM Trace AS [RPCStart] LEFT OUTER JOIN Trace AS [RPCComplete] ON [RPCComplete].spid = [RPCStart].spid AND [RPCComplete].DatabaseID = [RPCStart].DatabaseID AND [RPCComplete].rownumber = ( SELECT TOP 1 rownumber FROM Trace WHERE spid = [RPCStart].spid AND rownumber > [RPCStart].rownumber AND startTime > [RPCStart]. startTime AND eventclass = 10 — rpc completed ORDER BY rownumber ASC ) WHERE [RPCStart].EventClass = 11 — rpc startingWhat have we learned? That it's always best to tell the truth, little Johnny. Oh, and with a little careful restructuring, a good query can be recycled for another, equally cool use.
-Jen McCown, who is really posting this at midnight-thirty local time.