
Dropping DBs in Powershell

As long as I’m on a roll here with server-level ops, I thought I’d go ahead and cover dropping DBs in PS. This should be a fairly short post as there’s not much to this, but it’s worth talking about anyway.

Dropping DBs isn’t any big deal in T-SQL:

DROP DATABASE SeansDB

And the PS method for doing this is of course a 1-liner, but it’s not as easy as T-SQL. Again, there’s more than one method for doing it and I’ll show you both of them. Whether you connect through sqlps or through PS proper, you want to be at the Databases node. I’ll be doing it from sqlps here, but again, it really doesn’t matter.

Method 1

dir | %{$_.Drop()}

There are a couple things to note here. First of all, this is typical PS, so if you run this cmd as it’s written here, you’ll drop every DB on your server… totally uncool. And 2nd, I’m always talking about how if you see an Alter() method on the get-member list then that usually means it wants you to use it before the changes are pushed to the server itself. Well, this is one of those times that makes me put the ‘usually’ in there because while that’s a good rule of thumb, PS is nice enough to drop objects when you ask it to. So anyway, unless you want to lose everything, I wouldn’t run the code above. I just wanted to have a basis for future syntax.

So all we have to do now is limit our result list to the DB we’re interested in:

dir | ?{$_.Name -eq "SeansDB"} | %{$_.Drop()}

It just doesn’t get any easier than that. Now, at this point T-SQL is still ahead, cause even I would still much rather use T-SQL for dropping a single DB. Powershell is going to pull ahead pretty fast when it comes to dropping several DBs, or even a single DB on multiple servers.

Let’s say you’ve got several test DBs, and they all have the word ‘test’ in them somewhere. Since different devs created them, they don’t have a solid naming convention. Hell, some of them may even just have ‘tst’ in them, who knows, right?
At this point it’s just a matter of altering the above script so that it can accommodate your criteria.

dir | ?{$_.Name -match "test" -or $_.Name -match "tst"} | %{$_.Drop()}

T-SQL would require you to code a cursor for this, and while the for-each is technically a cursor in PS, it takes next to no coding for us. PS is starting to pull ahead a little now. And by simply changing the where-object criteria, you can easily change this script to do things that are more difficult in T-SQL like dropping DBs that are all owned by a certain login, or were created on a certain date, or are under or over a certain size, or even haven’t been backed up recently. Of course, some of that criteria you’d never use, but it’s all there for you. And again, you can find out what you can use by doing a get-member and anything that comes up as a property is usable.

dir | gm
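For example, here are a couple of hedged variations (the property names come straight off that get-member list, but the login name, date, and size numbers are just made up for illustration):

dir | ?{$_.Owner -eq "DOMAIN\OldDevGuy"} | %{$_.Drop()}

dir | ?{$_.LastBackupDate -lt (Get-Date).AddDays(-30) -and $_.Size -lt 5120} | %{$_.Drop()}

The first one drops every DB owned by a particular login, and the second drops anything that hasn’t had a full backup in over 30 days and is under 5GB (Size is in MB).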

Method 2
Now let’s look at a shorter method for dropping a single DB.

(dir SeansDB).Drop()

That’s pretty easy, and to do multiple DBs, it could look something like this:

(dir | ?{$_.Name -match "test" -or $_.Name -match "tst"}).Drop()

Now, if you have a DB that exists across multiple boxes then you can drop all of them like this:

$a = "server1", "server2", "server3"
$a | %{
$ServerName = $_; ## Just setting this to make the script easier to read.
cd sqlserver:\sql\$ServerName
(dir SeansDB).Drop()}

And that’s it. Going against multiple boxes is so easy. And since I’m setting the $a var to the server list, you can easily populate $a any way you like and you don’t have to change the rest of the script. So make it the results of a SQL query, or a txt file, etc. It doesn’t matter.
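Just as a sketch, here are a couple of ways you might fill $a (the file path, server, and table names are made up for the example):

$a = Get-Content "C:\Scripts\servers.txt"

$a = Invoke-Sqlcmd -ServerInstance "MyCMSServer" -Query "SELECT ServerName FROM dbo.ServerList" | %{$_.ServerName}

The first pulls the list from a plain txt file with one server per line, and the second pulls it from a table, assuming you’ve got Invoke-Sqlcmd loaded along with the rest of the SQL snap-ins.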

OK, so that’s it this time. Be careful with this or you’ll find yourself doing a recovery on all your DBs. Man, it would be really easy to be evil with this huh? In less than 60secs you could kill every DB in your company if you wanted to… say if they fired you for no reason and you wanted to get back at them. I’m not saying you should, I’m just saying it’s possible.
And after something like that I’m compelled to say (again) that I’m not responsible for your (mis)use of any of these methods.

Checking for Active cluster node

Ok, I was on this blog today and it showed a cool Powershell method for checking which cluster node is active.  And while there’s nothing wrong with the script, it does go to show the thing I dislike the most about how many DBAs work with Powershell in SQL Server.  Now, I can’t stress enough that I’m not picking on this guy, and I’m not saying he’s wrong.  I’m just saying that I prefer to rely on the built-in methods of the provider because it’s just simpler.  Here’s what I mean.

I’m going to borrow his code for a moment just to show you an example.  But I encourage you to visit his blog yourself as he’s got some cool stuff out there.  Anyway, here’s his method for doing this in PS:

# Set cluster name
$cluster_name = "ClusterName";
# Load SMO extension
[System.Reflection.Assembly]::LoadWithPartialName("Microsoft.SqlServer.Smo") | out-null;
$srv = New-Object "Microsoft.SqlServer.Management.Smo.Server" $cluster_name;
# Get server properties
$properties = $srv.Properties
$owner_node = $properties.Item("ComputerNamePhysicalNetBIOS").Value;
$is_clustered = $properties.Item("IsClustered").Value
if($is_clustered)
{
	Write-Host "The current active node of $cluster_name is $owner_node.";
}
else
{
	Write-Host "$cluster_name is not a clustered instance of SQL Server.";
}

Overall, there’s nothing wrong with that except it’s just too long.  See, too many DBAs forget what Powershell is all about so they’re constantly re-inventing the wheel.  I like to use the kind of wheels I’m given whenever possible.

Here’s how I would tackle this same task (there are actually 2 ways):

Method 1

>$a = dir

>$a.ComputerNamePhysicalNetBIOS

Method 2

>(dir).ComputerNamePhysicalNetBIOS

And that’s it.  Why would you want to type all that other stuff when you can so easily type just a few characters… 33 to be exact. 

All you have to do is make sure you’re in the server node of the provider tree. 

So if you connect through sqlps, then you’ll right-click on the server itself and that’ll take you to the Default instance.  Then you just need to go one level up by typing ‘cd ..’  From there, just type one of the lines above and you’re golden.

Oh yeah, his script also told you whether the box was even clustered to begin with.  We can handle that with another line.

>$a.IsClustered

That returns ‘True’ so if you want to make it pretty for yourself then just a quick change can make that happen.

>IF ($a.IsClustered) {“It’s clustered alright.”} ELSE {“No it isn’t.”}

And strictly speaking you don’t need the ELSE in there.

Now, using both methods you can easily cycle through a ton of boxes so there are no worries there.  
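Just to show you what that might look like, here’s a rough sketch (it assumes you’re in sqlps or have the provider loaded, and the server names are obviously made up):

>$a = "server1", "server2", "server3" | %{cd SQLSERVER:\SQL\$_; dir}

>$a | %{"$($_.Name): Active node = $($_.ComputerNamePhysicalNetBIOS), Clustered = $($_.IsClustered)"}

That collects the instances off each box and prints the active node and whether the instance is even clustered.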

I’m gonna get on my soapbox for a minute and say that this is the bulk of the PS I see being taught and to me it shows a lack of fundamental understanding of what PS is supposed to do for us.  I just prefer to use what the provider gives me.  And it really matters too.  If you find yourself somewhere without your scripts, which happens to me all the time when I’m working from a user’s box, or at another shop, then you’ve got to remember all that SMO provider info.  My method is not only much easier to remember, it’s much easier to investigate because you can easily do a GM against $a to see all the properties it has.  But if you go even just a couple weeks without typing in that code to load the SMO provider, you can forget nuances and you’ll find yourself looking something up.  And that’s not efficient DBA work.  PS is supposed to make our jobs easier, not complicate them.  PS isn’t just a replacement for VBScript.  It’s an entirely new way of thinking.  And there are tons of guys out there teaching PS who haven’t switched to the PS way of thinking.  To me this is the exact same thing as those guys who use those really complicated cursor methods for finding log space usage when they could just type DBCC sqlperf(logspace).  Can you do it the other way, sure, but why would you when the SQL team has given you such an easy method?  This is why most of the time whenever I see a PS blog somewhere, I typically have to translate it into real PS.

So guys, let’s get rid of all that needless SMO code and bring PS back to what it’s supposed to be… simple one-liners that make our job easier.

I also did a video last week about changing server-level properties that walks you through these same methods.  Take a look.

http://midnightdba.itbookworm.com/VidPages/PowershellServerProps/PowershellServerProps.aspx

Getting a good look at deadlocks

One of the things that’s a major concern for DBAs is monitoring deadlocks.  And that’s really easy.  The more difficult part can be to sift through all the info once you’ve got it.  The deadlock graph in Profiler is my favorite tool for gathering them, and I prefer to shred the XML results into a table so I have all the info right there in front of me.

So here I’ve got a script I’ve written that shreds your deadlock XML into a relational table.  Actually, I modified this from the one Trevor Barkhouse did for his deadlocking presentation.  I just wanted some friendlier cols, and more info in the table.  Once you get the data into this format though, you can see how easy it is to turn it into a real process. 

What I love the most about this is how easy it is to go through a high number of deadlocks and query them for anything you like.  You can easily get a picture of your entire environment with very little effort.  Now, for this, you’ll have to get the deadlock graphs into XML to begin with and while I could show it to you here, Trevor does a good job of it so I’ll just point you to his blog on the topic.  However, once you have the XML, then here’s my query to mine it.

Here’s the code:
USE [tempdb];
GO

DECLARE @DeadlockCount INT;
DECLARE @DeadlockGraph XML;
DECLARE @DeadlockGraphFileName NVARCHAR(255);
DECLARE @DynamicSQL NVARCHAR(1024);

SET @DeadlockGraphFileName = 'G:\DeadlockFiles\DeleteCV3OrderGenerationQueue\AllEvents.xml';
--'G:\DeadlockFiles\20110405-20110418_1.xdl';

-- The OpenRowset() T-SQL function requires a string literal for the file
-- name, so use dynamic SQL to avoid hardcoding a path (other than in the
-- above assignment to the @DeadlockGraphFileName variable... but that's
-- a little clearer).
SET @DynamicSQL = N'SELECT @XMLDocument = CAST([BulkColumn] AS XML)' + NCHAR(10) +
    N'FROM OpenRowset' + NCHAR(10) +
    N'(' + NCHAR(10) +
    N'    BULK ''' + @DeadlockGraphFileName + N''',' + NCHAR(10) +
    N'    SINGLE_BLOB' + NCHAR(10) +
    N') AS [XMLDocument];';

PRINT @DynamicSQL;

-- Execute the dynamic SQL and pull the result into the @DeadlockGraph
-- local variable.
EXECUTE [master].[sys].[sp_executesql]
    @stmt = @DynamicSQL,
    @params = N'@XMLDocument XML OUTPUT',
    @XMLDocument = @DeadlockGraph OUTPUT;

-- Count the number of deadlock graphs in the XDL file.
SELECT @DeadlockCount = DeadlockList.Graphs.value('count(deadlock)', 'Int')
FROM @DeadlockGraph.nodes('/deadlock-list') AS DeadlockList(Graphs);

PRINT 'File            :   ' + @DeadlockGraphFileName + CHAR(10) +
      'Deadlock count  :   ' + CAST(@DeadlockCount AS VARCHAR);

-- Extract the parties to the deadlocks.
DECLARE @PartiesToTheDeadlocks TABLE
(
    [VictimProcessID] VARCHAR(25),
    [VictimLastBatchStarted] DATETIME2,
    [VictimLockMode] VARCHAR(15),
    [VictimXActID] VARCHAR(15),
    [VictimClientApp] VARCHAR(50),

    [LiveProcessID] VARCHAR(25),
    [LiveLastBatchStarted] NVARCHAR(23),
    [LiveLockMode] VARCHAR(15),
    [LiveXActID] VARCHAR(15),
    [LiveClientApp] VARCHAR(50),

    [LiveFileID] VARCHAR(15),
    [LivePageID] VARCHAR(15),
    [LiveObjName] VARCHAR(50),
    [LiveLockModeHeld] VARCHAR(50),
    [VictimLockModeRequest] VARCHAR(50),

    [VictimFileID] VARCHAR(15),
    [VictimPageID] VARCHAR(15),
    [VictimObjName] VARCHAR(50),
    [VictimLockModeHeld] VARCHAR(50),
    [LiveLockModeRequest] VARCHAR(50),

    [VictimProcName] VARCHAR(100),
    [VictimExecStack] VARCHAR(MAX),
    [LiveProcName] VARCHAR(100),
    [LiveExecStack] VARCHAR(MAX),
    [VictimInputBuffer] VARCHAR(2048),
    [LiveInputBuffer] VARCHAR(2048)
);

INSERT INTO @PartiesToTheDeadlocks
SELECT VictimProcessID,
       VictimLastBatchStarted,
       VictimLockMode,
       VictimXActID,
       VictimClientApp,
       LiveProcessID,
       LiveLastBatchStarted,
       LiveLockMode,
       LiveXActID,
       LiveClientApp,
       LiveFileID,
       LivePageID,
       LiveObjName,
       LiveLockModeHeld,
       VictimLockModeRequest,
       VictimFileID,
       VictimPageID,
       VictimObjName,
       VictimLockModeHeld,
       LiveLockModeRequest,
       VictimProcName,
       VictimExecStack,
       LiveProcName,
       LiveExecStack,
       VictimInputBuffer,
       LiveInputBuffer
FROM
(
    SELECT
        --Victim
        DeadlockList.Graphs.value('(process-list/process[1]/@spid)[1]', 'NVarChar(15)') AS VictimProcessID,
        CAST(REPLACE(DeadlockList.Graphs.value('(process-list/process[1]/@lastbatchstarted)[1]', 'NChar(23)'), N'T', N' ') AS DATETIME) AS VictimLastBatchStarted,
        DeadlockList.Graphs.value('(process-list/process[1]/@lockMode)[1]', 'NVarChar(15)') AS VictimLockMode,
        DeadlockList.Graphs.value('(process-list/process[1]/@xactid)[1]', 'NVarChar(15)') AS VictimXActID,
        DeadlockList.Graphs.value('(process-list/process[1]/@clientapp)[1]', 'NVarChar(50)') AS VictimClientApp,
        --Live
        DeadlockList.Graphs.value('(process-list/process[2]/@spid)[1]', 'NVarChar(15)') AS LiveProcessID,
        CAST(REPLACE(DeadlockList.Graphs.value('(process-list/process[2]/@lastbatchstarted)[1]', 'NChar(23)'), N'T', N' ') AS DATETIME) AS LiveLastBatchStarted,
        DeadlockList.Graphs.value('(process-list/process[2]/@lockMode)[1]', 'NVarChar(15)') AS LiveLockMode,
        DeadlockList.Graphs.value('(process-list/process[2]/@xactid)[1]', 'NVarChar(15)') AS LiveXActID,
        DeadlockList.Graphs.value('(process-list/process[2]/@clientapp)[1]', 'NVarChar(50)') AS LiveClientApp,
        --Live resource.
        DeadlockList.Graphs.value('(resource-list/pagelock[1]/@fileid)[1]', 'NVarChar(15)') AS LiveFileID,
        DeadlockList.Graphs.value('(resource-list/pagelock[1]/@pageid)[1]', 'NVarChar(15)') AS LivePageID,
        DeadlockList.Graphs.value('(resource-list/pagelock[1]/@objectname)[1]', 'NVarChar(50)') AS LiveObjName,
        DeadlockList.Graphs.value('(resource-list/pagelock[1]/@mode)[1]', 'NVarChar(50)') AS LiveLockModeHeld,
        DeadlockList.Graphs.value('(resource-list/pagelock[1]/waiter-list/waiter/@mode)[1]', 'NVarChar(50)') AS VictimLockModeRequest,
        --Victim resource.
        DeadlockList.Graphs.value('(resource-list/pagelock[2]/@fileid)[1]', 'NVarChar(15)') AS VictimFileID,
        DeadlockList.Graphs.value('(resource-list/pagelock[2]/@pageid)[1]', 'NVarChar(15)') AS VictimPageID,
        DeadlockList.Graphs.value('(resource-list/pagelock[2]/@objectname)[1]', 'NVarChar(50)') AS VictimObjName,
        DeadlockList.Graphs.value('(resource-list/pagelock[2]/@mode)[1]', 'NVarChar(50)') AS VictimLockModeHeld,
        DeadlockList.Graphs.value('(resource-list/pagelock[2]/waiter-list/waiter/@mode)[1]', 'NVarChar(50)') AS LiveLockModeRequest,
        --Proc names, exec stacks, and input buffers.
        DeadlockList.Graphs.value('(process-list/process[1]/executionStack/frame/@procname)[1]', 'NVarChar(100)') AS VictimProcName,
        DeadlockList.Graphs.value('(process-list/process[1]/executionStack/frame)[1]', 'VarChar(max)') AS VictimExecStack,
        DeadlockList.Graphs.value('(process-list/process[2]/executionStack/frame/@procname)[1]', 'NVarChar(100)') AS LiveProcName,
        DeadlockList.Graphs.value('(process-list/process[2]/executionStack/frame)[1]', 'VarChar(max)') AS LiveExecStack,
        RTRIM(LTRIM(REPLACE(DeadlockList.Graphs.value('(process-list/process[1]/inputbuf)[1]', 'NVarChar(2048)'), NCHAR(10), N''))) AS VictimInputBuffer,
        RTRIM(LTRIM(REPLACE(DeadlockList.Graphs.value('(process-list/process[2]/inputbuf)[1]', 'NVarChar(2048)'), NCHAR(10), N''))) AS LiveInputBuffer
    FROM @DeadlockGraph.nodes('/deadlock-list/deadlock') AS DeadlockList(Graphs)
) AS [_ProcessDetails];

--  Display the parties to the deadlocks.
SELECT *
FROM @PartiesToTheDeadlocks
ORDER BY VictimLastBatchStarted, VictimXActID;

Now, the code formatter messed up the coloring, but the code works. 

Now, I’ll list the cols and explain them cause I know that’s the next thing you’re gonna want.  It’s broken up into 2 categories… the Victim process and the Live process.  I didn’t know what else to call it other than live and still keep it short enough to prefix the cols with.

VictimProcessID – spid of victim.

VictimLastBatchStarted – date/time of last batch for victim.

VictimLockMode – lock held by victim.

VictimXActID – trans ID for victim.

VictimClientApp – Victim client app.

LiveProcessID – spid of live process.

LiveLastBatchStarted – date/time of last batch for live process.

LiveLockMode – lock held by live process.

LiveXActID – trans ID within the batch for the live process (I think).

LiveClientApp – client app for live process.

LiveFileID – file ID where the locked resource for live process is held.

LivePageID – page ID the processes are fighting over.  The live process currently holds the lock on this one.

LiveObjName – object the processes are fighting over.  The live process currently holds the lock on this one.

LiveLockModeHeld – lock held on the above object by the live process.

VictimLockModeRequest – lock being requested  on the above object by the victim.

VictimFileID – file ID where the object the victim process is holding is held.

VictimPageID – Page ID the processes are fighting over.  The victim currently holds the lock on this one.

VictimObjName – object the processes are fighting over.  the victim currently holds the lock on this one.

VictimLockModeHeld – lock held on the above object by the victim.

LiveLockModeRequest – lock being requested on the above object by the live process.

VictimProcName – the SP called by the victim.

VictimExecStack – the stmt in the SP that caused the deadlock.

LiveProcName – the SP called by the live process.

LiveExecStack – the stmt in the SP that caused the deadlock.

VictimInputBuffer – the exact SQL the victim sent to the SQL engine.

LiveInputBuffer – the exact SQL the live process sent to the SQL engine.

Now you can see at a glance where your problem deadlocks are, how often they’re occurring, and which resources are under the most contention.
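Just to give you an idea, here are a couple of quick examples of the kind of mining I’m talking about (remember it’s a table variable, so run these in the same batch as the load, or swap it out for a real table):

SELECT VictimObjName, COUNT(*) AS Deadlocks
FROM @PartiesToTheDeadlocks
GROUP BY VictimObjName
ORDER BY Deadlocks DESC;

SELECT LiveProcName, VictimProcName, COUNT(*) AS Deadlocks
FROM @PartiesToTheDeadlocks
GROUP BY LiveProcName, VictimProcName
ORDER BY Deadlocks DESC;

The first shows which objects are under the most contention, and the second shows which pairs of procs keep deadlocking with each other.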

Good stuff that.

A Lovely Statistics Query

Most of you know that you can set SQL to auto update stats.  And I’m sure that most of you know that it invalidates stats at about 20%.  What that means is that when about 20% of the data in the table has changed, the stats are invalidated.  It’s actually 500 rows + 20% of the table if you wanna be specific.  And notice I say they’re invalidated and not updated.  Stats aren’t updated when that 20% mark is reached, they’re only invalidated.  They’re not rebuilt until they’re needed.  Well, for really large tables that can result in stats being outdated longer than they should be, because the amount of change needed to invalidate them is so much higher.  On a 100-million-row table, for instance, that’s 500 + 20 million rows that have to change before the stats are even invalidated.  The same issue can occur with smaller tables where the changes are varied enough to throw the distribution out of whack.  Some tables are just more sensitive to change than others and that 20% may be too much.  You may find that you need to update stats more often… say at 10% or in extreme cases even 5%.

Here’s a sexy little query I put together that will give you the percentage of change an index has had.  You can use it to not only run a job to update stats on a different percentage, but you can also just query it and see what percentage your tables are at and how fast they change.  It’s an excellent tool for troubleshooting possible stats problems.  All the same, it’s pretty handy. 

--Get stats percentage of change for tables.

SELECT
so.SCHEMA_ID,
ss.NAME AS [Schema],
OBJECT_NAME(id) AS TableName, si.NAME AS IndexName,
CAST((CAST(rowmodctr AS float)/CAST(rowcnt AS float))*100 AS int)  AS Pct,
rowcnt,
STATS_DATE([id], indid) AS [StatsDate]
FROM sys.sysindexes si
INNER JOIN sys.objects so
ON si.id = so.object_id
INNER JOIN sys.schemas ss
ON so.schema_id = ss.schema_id
WHERE rowcnt >= 1000000 --500
AND rowmodctr > 0
--AND OBJECT_NAME(id) = 'TableName'
ORDER BY Pct DESC, rowcnt DESC, rowmodctr

Here are a couple notes on the query:

1.  Notice there’s a way to query for a specific table.

2.  The StatsDate col shows you the last time the stats were updated.

3.  I’m also limiting the rowcount to indexes with more than 1mill rows.  Feel free to lower that if you like.

4.  rowmodctr > 0 is something I threw in there to filter out indexes that haven’t changed at all, and it also filters out most of the system-created stats.  (The rowcnt filter is what actually keeps the divide-by-zero error out of there, since rowcnt is the divisor.)

OK, I hope you guys like this one.  I personally love it.  It’s allowed me to build a process to update stats on some of my tables more aggressively than 20% and keep my server running at its peak.
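My actual job is a little more involved than this, but just to sketch the idea, here’s one way you could drive UPDATE STATISTICS off that query (the 10% threshold and the filters are just for illustration):

DECLARE @cmd NVARCHAR(1000);
DECLARE c CURSOR FAST_FORWARD FOR
SELECT 'UPDATE STATISTICS ' + QUOTENAME(ss.NAME) + '.' + QUOTENAME(OBJECT_NAME(si.id)) + ' ' + QUOTENAME(si.NAME)
FROM sys.sysindexes si
INNER JOIN sys.objects so ON si.id = so.object_id
INNER JOIN sys.schemas ss ON so.schema_id = ss.schema_id
WHERE si.rowcnt >= 1000000
AND si.rowmodctr > 0
AND si.NAME IS NOT NULL
AND CAST((CAST(si.rowmodctr AS float)/CAST(si.rowcnt AS float))*100 AS int) >= 10;

OPEN c;
FETCH NEXT FROM c INTO @cmd;
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC sp_executesql @cmd;
FETCH NEXT FROM c INTO @cmd;
END
CLOSE c;
DEALLOCATE c;

Throw something like that in a job on whatever schedule makes sense for you and you’ve got your more aggressive stats updates.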

Avoiding xp_cmdshell

This is gonna be a quick post, but I wanted to throw out a couple easy ways to avoid using xp_cmdshell.

Unfortunately this xp has gotten a bad rep, but only because it’s been abused.  If you use it properly there’s no reason why you can’t continue to get some good use out of it for many years to come.  Actually, I use it for my backup routines for a couple reasons… first, when I wrote them there really wasn’t a better choice.  And second, while I could architect a solution around it quite easily, it’s ridiculous to go so far out of my way to code around something that isn’t a huge problem.  The solution would be more complicated than the original problem, so why bother?  Oh yeah, and since I’m in a mixed shop there’s also the problem of having something like Powershell on every box, and I don’t wanna have a different routine for my SQL2K boxes than I do for all the rest.  So using xp_cmdshell is still a good fit for me.

Anyway though, there are some good reasons to avoid it when it’s not necessary (that’s part of using it responsibly).  So we’re going to talk about 2 XPs today that will help you do just that.

First, there’s xp_FixedDrives.  This will report all the drives on your server and the amount of free space in MB.  Now, this can’t replace a well-formed PS script that can give you a lot more info, but it’s a nice start when you just need to look at drive space real quick.
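So checking space is just:

EXEC master..xp_fixeddrives;

And you get back a row per drive with the drive letter and the MB free.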

Second, we have xp_create_subdir.  This handy little xp will let you create a directory.  You just pass it the path you want to create and you’re golden.  So it would look something like this:

xp_create_subdir 'C:\Mydir\Anotherdir\theFinalDir'

And just like in DOS it’ll build the whole path for you so none of those dirs have to exist ahead of time.

Anyway, hope this finds someone well.

Get DB sums with Powershell

Here’s a cute little piece of code to add up the sizes of all your DBs on a server.  It comes in handy when you’re trying to see how much space you’ll need to duplicate the server either for DR, or reporting, or even just to see how much backup space you’ll need.

Make sure you’re at the DB context of PS:

>cd SQLSERVER:\sql\ServerName\default\databases

Then type this code:

>dir | %{$Total += $_.size}

>$Total

Ok, that’s it.  Not too bad, huh?  Now, let’s work on making it look like something you can actually work with. 

Ordinarily you would prettify this by just dividing it by the measure you want to convert it to.  So if you wanted to convert it to GB, then that would look like this:

>$Total/1GB

However, you can’t do that here.  Here’s what you need to know.

This output is in MB, and the “/1GB” trick assumes that the output is in bytes.  So let’s assume that your $Total is 4189811.4375.  Remember now, that’s in MB.  So here’s what the 1GB trick will yield.

>$Total/1GB

0.00390206597512588

That’s not even close to right.  And again, that’s because the 1GB trick assumes your output will be in bytes.  So here the best way to convert this to GB is to do the math yourself like this:

>$Total/1024

4091.61273193359

Of course, you can still use one of the tricks to get the number you’re looking for, but since you’re already in MB you have to go 1 measure down to make it come out right. 

>$Total/1KB

4091.61273193359

So why is that?  Well, because 1KB is only a shortcut for 1024, and from MB, the conversion to GB is /1024.  And the problem of course with doing it this way is that it’s deceiving as hell.  Anyone else looking at this will think you’ve converted to KB because they may not realize that the original output is in MB.  So don’t use this one, I only threw it in there to show you how to come up with the right number.  It also shows you something about how the trick works.

Now, you can still use the 1GB trick, you just have to massage the data first.  So here we’re going to put it back into bytes and then convert it to GB.  This is really just to prove to you that the output for this trick really is assumed to be in bytes.

>($Total*1024*1024)/1GB

4091.61273193359

Personally I’d never do it this way.  Sure, it shows you and someone else what’s going on so you can read the code, but that’s also what comments are for.

>$Total/1024 # Convert to GB.  Can’t use /1GB because this output is in MB and the /1GB assumes the output to be in bytes.

See, what’s wrong with comments?
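And if you want it all in a couple of lines you can hand to someone else, here’s one hedged way to wrap it up (still assuming you’re sitting at the databases node):

>$TotalGB = (dir | Measure-Object -Property Size -Sum).Sum/1024

>"{0:N2} GB" -f $TotalGB

Same math, just with the conversion and the formatting done for you.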

Keynote or Breakout Session?

From time to time you run across something that demonstrates perfectly what the industry needs.  As I sit here at the PASS day-3 keynote I’m watching Dr. DeWitt talk about how the query optimizer finds query plans.  This is a very complicated topic under the covers and he’s covering some of the really complicated aspects.  He’s already explained the different types of histograms and how they work (I’m sure only at a high level), and he got into selectivity, etc.  And the thing is, he’s explaining this stuff in such simple terms and making it so easy to understand that it just hit me like a ton of bricks: why isn’t this material being taught everywhere?

This is one of the things I’m always talking about when I say there’s no really good training out there.  There are plenty of people out there teaching queries and tuning techniques, but nobody is bothering to break this stuff down so that people actually understand the terms and what they really mean.  Most of the time what they do is just define some terms briefly and then move on and expect everyone to be able to go back home and apply this stuff.  But without this background of how this stuff actually works and how the terms really fit together and how you get bad plans, etc, then people aren’t going to be nearly as successful as they could be.  Dr. DeWitt is clearly concerned with teaching people how to think instead of just what to think.

This is always the type of training I move towards because I find I never commit anything to memory unless I understand how it all works and fits together.  So what I’d like to see happen is for someone to put something like this together in a video series, or in a pre-con… something.  That’s a pre-con I’d actually pay for out of my own pocket.  A whole day of someone dedicated to making sure I understand how this stuff works… Hell Yeah!  But seriously, where does someone go to learn stuff like this?  Because this info isn’t out there in any human readable form.  And it’s apparently not just me either, cause he’s been brought back by overwhelming request, so I’d say the better part of the room is also screaming for this type of info.

Trace flag 1807

This is something I haven’t had a use for in quite some time but it actually came in handy today.  Of course, this isn’t something you’d want to make a living doing, but you’ll love to keep it in your hat and pull it out every couple yrs or so when the situation arises.

This is all about solving problems.  Here’s the scenario.  A vendor is here and he’s trying to upgrade a DB schema.  The way the thing works is it basically copies the entire DB to new tables and then drops the old ones.  So at the apex of the operation the DB is about double in size.  The problem is that the server doesn’t have enough space to hold double the DB.  So what do you do?

Well, one of your options is to add space to the disk.  The problem there is that you can never just add space, right?  You have to provision it, etc.  So in this scenario this isn’t a good option at all.

Next you could find another SQL box on the same build and move the DBs over there.  You could then perform the upgrade on the other server, and then move the files back when you’re done.  Of course, the problem there is finding another box that doesn’t encroach on someone else’s production.  So you may or may not have this type of box lying around.

You could also think about just moving the files over to like a SAN or another server that has enough space and just attaching them across the wire.  So the way that would look would be for you to move the files to the other server (assume it doesn’t have SQL installed because you don’t need it), then attach the DB from the original server and point it back to the UNC path where the files are now.  Then once the conversion is done, all you have to do is move the files back to the original server and you’re golden.

But wait… there’s a problem.  If you attach files across the network, SQL will check that they’re local and throw an error when they’re not.  And even if you map a drive you’ll get the same result.  So how can we get around SQL checking for network locations?  That’s where trace flag 1807 comes into play.  Just running a simple DBCC TRACEON (1807) and then attaching your files will disable the network check and let the attach go through.
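So the whole thing would look something like this (the UNC paths and DB name here are made up just for the example):

DBCC TRACEON (1807);

CREATE DATABASE SeansDB
ON (FILENAME = '\\OtherServer\SQLShare\SeansDB.mdf'),
(FILENAME = '\\OtherServer\SQLShare\SeansDB_log.ldf')
FOR ATTACH;

Run the trace flag and the attach in the same session and SQL will quit complaining about the files being on a network path.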

Pretty cool huh?

And for those of you who like to see things in action, I’ve got this video of how this all works.

http://midnightdba.itbookworm.com/VidPages/Trace1807/Trace1807.aspx

What does a bad query look like?

In my SQL Marklar blog today I discussed troubleshooting DB processes.  And I’m not going to re-hash all of it here but I did want to tell you about a use case that describes perfectly what I was talking about.

Not so long ago I got a call from one team and they told me they had some server issues.  Everything was moving slow, they said.  Ok, so I got on and took a look and nothing really jumped out at me.  So I put a profiler trace on it to see if anything stood out.  And of course, I knew nothing about the app or the processes so I really didn’t know what I was looking for, but you’ve gotta start somewhere, huh?

So there I am in profiler and I’m just looking for long-running queries.  The problem is there were lots of queries I would consider long-running.  For some reason I focused in on a single SP that was taking like 5mins.  I pulled up the text of the SP and started looking through it.  It all seemed fairly standard.  I mean, it was long and everything wasn’t perfect, but there was nothing out of the ordinary. 

I contacted the app guy again and asked about it.  Does this SP typically take this long to run?  No, he says (and those of you who have seen My Cousin Vinny know where this is going).  So I thought eureka, I actually found something that may fix the issue.  So I got a couple valid params from him and ran the SP with them.  I also made sure to turn on execution plans and statistics io.  The query plan had some dings in it that experience has told me could easily have caused this kind of spike in resource usage.  The problem is that there was no fragmentation, and stats were up to date.  And in talking with the app guy he told me that they just archived a bunch of the data so it was down to like 200mill rows now.  So why would this thing be taking so long to return?  Moving on.

I found a copy of his QA system that had been copied over from prod the previous week and he assured me that they had changed nothing.  I could see the extra rows in the tables (copied before the archival), and the indexes were the same as in prod so that wasn’t the issue.  They had the same fill factor, everything.  In fact, everything I checked was identical except for the amount of data.  So why would having less data cause such a huge performance issue?  Moving on.

I decided that running this thing again and again on prod was probably a bad idea.  I was just adding to the issue.  So I started doing the rest of my work on his QA box where I was the only spid.  And the hardware was similar as well (I love it when it works out that way).  So I ran the SP on this box and 5mins passed.  Then 10mins.  Then 15mins.  Then 20mins.  And sometime soon after that, the query returned.  I had collected all my stats along the way so I was golden.  It was getting the same execution plan as the prod version.  The results weren’t what I expected at all.  Why was the prod version performing so much better in comparison?  So I called the app guy again and explained the situation.  Here’s more or less how the conversation went:

Me:  You know, I just don’t know what’s going on here.  I’ve been looking at this for a long time now and I’m getting further into a hole.  The prod version of this SP takes 5mins, and that’s even after the archival.  But when I run it on QA with all the data there it takes even longer.  If the prod query is acting up then I would expect the QA query to be a shorter time even with the extra data.

Guy:  Yeah that sounds about right to me.

Me:  What sounds right to you?  (I just got a bad feeling that something horrible had gone wrong)  (You know how you can instantly drop all the pieces into place and remember key words that make everything all of a sudden fit together?  Well, I just got that, but I wanted to hear him say it.)

Guy:  This SP usually takes about that much time, but since the archival it went down to 5mins.  We’ve been very pleased.

Me:  So you mean to tell me that when I came to you with this you didn’t find it necessary to tell me that the 5mins was an improvement?

Guy:  Well, I don’t know anything about DBs so I figured you could see that kinda thing already.

Me:  I see.  Well that clears up that mystery.  Now I’ve gotta go back and start over with this whole process.

Guy:  Well I can tell you the one that’s probably causing the issue.

Me:  Oh yeah?  How’s that?

Guy:  Because the slowness is only in our billing section and that’s controlled by just a few queries.  I can give you the names of the SPs and you can look at those.  There are only like 5 of them and since we’re having a hard time pulling up a list of clients it’s likely going to be the one that controls the search on that.

Me:  I see.  So you were just never going to tell me that?  I’ve been messing with this for 2hrs and you could have given me all this info before and saved me tons of time.

Guy:  Well, again, I don’t know anything about DBs and I figured you could see all that.

Me:  You thought I could see the web app from the DB?

Guy:  You can’t?

Me:  Kill me.

So ok, it turned out to be one of the 5 he gave me.  It had a bad query plan.  I wasn’t able to determine that all on my own, btw.  I had to recompile each one of them until I found the bad one.  And that’s because I didn’t have a perf baseline like I discussed on Marklar.

So there are a couple lessons to learn here but I think you can glean them for yourself.  The whole point of this though is that making assumptions about processes is bad and no matter what looks like a long-running query to you, it may in fact be performing better than usual.

The stupid have fallen

This is a followup to my previous post here.

Wow, what a day.  Ok, so I just got off the phone with the tech lead for that company.  It was enlightening to say the least.  It was really a longer call than I expected or that I even wanted, but my blood pressure’s gone down a bit now.  I obviously can’t recount the entire conversation for you, but here are the highlights:

V = Vendor

M = Me

V: Hi Sean, this is Josh(not his real name) with X and I was told you wanted to talk to me.

M: No, the last thing in the world I wanna do is talk to you, but I feel in order to get our issue resolved I have no choice.

V: Yeah, Cornhole(not his real name) told me you got upset with him when he was just trying to explain to you what your issue was and then you insisted on speaking with me.

M: This is just incredible.  No I didn’t get upset with him for explaining the issue to me.  I got upset with him when I told him his explanation was flat-out wrong and then he called me an idiot.  He didn’t even hint at it.  He called me an idiot outright.

V: Really?  Cause that’s not the story I got at all.  I heard that he was calmly explaining that your backups were the issue and you just exploded on him.

M: No, that’s NOT what happened at all.  That little turd actually called me stupid for not buying his asinine explanation for our issue, and I only exploded after he did that a couple times.

V:  Well what don’t you believe about it then?  Because we’ve seen that many times where backups will cause processes to block.

M: No you haven’t.  Listen, I realize that everyone can’t spend their days studying SQL, but one simple fact is that backups don’t block processes.  Period.  I’ve been doing nothing but DBs in SQL Server for over 15yrs against TBs of data and nowhere have I ever seen a backup block a process.  The problem is that you’ve got like 3 tables with TABLOCK in that sp and that’s what’s causing your blocking.  Don’t you think it’s funny that it only clears up AFTER you kill that process?

V: Where did you get the code?  Those sps are encrypted.

M: Encryption only stops amateurs.  And while we’re at it, what’s this script I hear about that you’ve got to unconfuse SQL to give you the real spid for the process?

V: Where did you hear that backups don’t block processes?  It’s well-known here that it’s one of our biggest problems.

M: No, your biggest problem is your TABLOCKS.  And I heard it from the truth. 

V: What is a tablock?

M: A table-level lock.  You’re locking the entire table just to read data.

V: I don’t know about that, I’ve never seen the code because it’s encrypted.

M: Well I do and that’s your problem.  Now what about this script I kept hearing about from Cornhole?

M: Did you get my email?

V: Yeah.

M: And you can see the place in the code where you’re locking the whole table?

V: Yeah.

M: So how is it that you lock the entire table and you claim that it’s the backup blocking everything?  Haven’t you noticed that the backup only decides to block stuff when this process is running?

V: That does sound a little suspicious when you put it that way.  What can we do?

M: Let me take those TABLOCKs out of there so I can query these tables without locking everything up.

V: I just saw your email signature and I’ve never heard of an MVP before, but I just looked it up.  Are you really one of those guys?

M: Yeah.

V: Wow, so I guess you really know what you’re doing then.

M: I’ve hung a few DBs in my day, sure.

V: Do you mind if I write you again if I have any questions?

M: I suppose that depends on whether you’ll let me change that stupid sp or not.

V: Yeah, go ahead.  I’ll have to have a talk with our devs tomorrow and try to convince them to put this in our next build.  Honestly, I’m only a tech lead here because I’ve been here a few months longer than anyone else on the support team.  We get all this stuff from the devs and they won’t tell us anything.

M: So what about this script I keep hearing about.  The one that tells you how to unconfuse SQL and give you the right spid for a process?

V: That’s a script the devs have and they won’t show it to me either.  I just tell my guys something else so they won’t know I haven’t seen it.

M: Wow, that sucks man.  You do know though that a script like that doesn’t exist, right?  It’s completely bullshit.

V: Yeah, I hate this place.  I’ve been suspecting for a while now that they were lying to me, but what can I do?  I need this job.  I’m just lucky to be working.  I have no idea what I’m doing here.

M: I don’t know what to say to that.  You’re right, you are lucky to be working these days.  But pick up a couple books on your own and learn SQL.  Don’t rely on them.  There are websites, blogs, etc.  I myself run a site.

So in short, I really started to feel for this guy.  He ended the call apologizing 100x for the grief they’ve caused me and he’s going to go out right away and start watching my vids and trying to learn more.  He asked if he could write me with questions, and I said, of course dude, anytime.  That’s what I do.

And yeah, he sounded about 22 as well.  Turns out I was right.  I asked and he was 21 and this was only his 2nd job and they brought him in from nothing and “trained” him.  I’d say he’s got some work to do.  But I personally consider this exchange a success.