
Why won’t the log shrink?

I know that everybody always says you shouldn’t shrink your log files, but let’s face it, sometimes it really is necessary.  Maybe you had a big op that was out of the norm and it grew your log bigger than it usually runs, or maybe (and more commonly) your log backups were failing for a few days before you noticed and the log is now huge and you need to get it back down to a more manageable size.  For whatever reason, you need to shrink your log.  Now, I’m gonna show you the T-SQL way because it’s what I’ve got handy and I’ve used this script for years.

So here’s the script and then we’ll discuss a couple things… because the script isn’t the meat of this post.  But this one will do all DBs on the server, so just use the logic if you don’t like the cursor.  And it has no effect on DBs that can’t have their logs shrunk so don’t worry about causing damage.

Declare @curDBName sysname

Declare DBName Cursor For
Select Name from sysDatabases

Open DBName
Fetch Next From DBName into @curDBName

While @@Fetch_Status = 0
Begin
      DBCC ShrinkDataBase (@curDBName, TRUNCATEONLY)
      Fetch Next From DBName into @curDBName
End

Close DBName
DeAllocate DBName

OK, so that’s fairly standard admin stuff.  Now, if you wanna check that you’ve had the effect you expected, you can run this command both before and after to make sure you’ve shrunk it.

DBCC SQLPERF(LOGSPACE)

And what you’re going to get from that is a result set that gives you the DB name, the log file size, and the % used.  If the size of the file is really big and the % used is really small, that means your log file is a lot bigger than it needs to be.  So an example would be a log file that’s 52000MB and 3% used.  That’s a lot of empty space, so you’d definitely want to use the above cmd to shrink it.

But what happens if you shrink it with that cmd and it doesn’t shrink?  Or maybe it just shrinks a little bit?  Well, now you’ve got to troubleshoot why it’s not shrinking.  This is one of my standard interview questions and you’d be surprised how many people get it wrong.  As a matter of fact, I’ve never had anybody get it right.  So here’s what you do… pick up a book and become a DBA.  Ok, seriously though, it’s extremely easy.

Run this query in master to see what your log may be waiting on to truncate.

SELECT Name, log_reuse_wait_desc FROM sys.databases

What you’ll get here is a result that tells you what the log is waiting for before it can reuse the VLFs and, in this case, clear them out so you can shrink the file.  There are any number of things that can show up here as the reason: you could be waiting on a long-running transaction, a log backup, replication, etc.  So next time you have a problem getting a log to shrink, come here first and find out what it’s waiting for before you go guessing at what the problem might be.
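For instance, if the reason comes back as LOG_BACKUP, a log backup followed by a file-level shrink will usually clear it up.  Here’s a minimal sketch; the database name, backup path, and target size are just made-up examples:

BACKUP LOG MyDB TO DISK = N'F:\Backups\MyDB_log.trn'
GO
USE MyDB
GO
-- now that the VLFs can be cleared, shrink the log file itself down to 1024MB
DBCC SHRINKFILE (N'MyDB_log', 1024)
GO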

That’s all I’ve got.

[EDIT]
Ok, I got a comment below that says BOL states that TRUNCATEONLY only works for data files. I checked and he’s right, BOL does say that. However, I’ve been using this method for years and I just ran a test on it individually using the code I pasted out of this post so BOL is wrong on this one. I ran my test on SQL2K8 but I don’t see any reason why it would change significantly with a lower version.

Powershell to recycle error log

Here’s a really good one. I noticed that very few of my boxes had a job to recycle the sql error logs. This is important because unless specifically done, the error log will only recycle when the service is burped in some way. For those of you who aren’t up on the lingo, recycling the error log is when you close down one log and create a new one. The error logs are just txt files kept on the server. So recycling creates a new physical txt file. And the reason you care about this is because some servers don’t get bounced very often at all and the files can get quite large. I have one server with a 40GB error log. And when I try to open it, it sits there for like 2hrs and then finally dies cause there aren’t enough system resources to deal with a file that big. So you can see why you want to keep the size of the files down.

And like I said, the error log only recycles when you burp the service or do it manually. And the way you do it manually is by running this code:

EXEC master.dbo.sp_cycle_errorlog

But doing it manually isn’t good enough because it has to be done on a regular basis. I prefer to recycle my logs every day. It’s a good generic schedule and honestly I don’t ever remember needing anything more granular. OK, so a new log every day is what we’re after. That means we need to put this code in a job and have it run every day at the same time.

So ok, that set the stage for what we’re wanting to do here. Now we can get into what we actually wanna talk about… and that’s how we push our new job to all of our boxes. Now, I’m going to do this a specific way and I’m doing it like this for a reason. I’ll explain once I’m done.

Here are the steps I want you to take to prep for this venture.

1. Create a job to occur every day (Midnight is preferable) and include the above code in the job step.
2. Be sure to set the job owner to sa.
3. Script the job and save it as a file somewhere.
4. You’ll have to script the job for different versions of SQL. SQL2K scripts job schedules differently than the newer versions (schedules became shared objects in 2K5), so a script generated on a newer box blows up on 2000. So at least have one script for SQL2K and below, and one for 2K5 and above.
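If you want a rough idea of what that scripted job file boils down to on 2K5 and up, here’s a minimal sketch. The job name and the midnight schedule are just examples, and a real scripted job will have a lot more options filled in:

USE msdb
GO
EXEC dbo.sp_add_job @job_name = N'DBA - Recycle Error Log', @owner_login_name = N'sa'
EXEC dbo.sp_add_jobstep @job_name = N'DBA - Recycle Error Log', @step_name = N'Cycle error log',
    @subsystem = N'TSQL', @database_name = N'master', @command = N'EXEC master.dbo.sp_cycle_errorlog;'
-- freq_type 4 = daily, every 1 day, starting at 00:00:00
EXEC dbo.sp_add_jobschedule @job_name = N'DBA - Recycle Error Log', @name = N'Daily at midnight',
    @freq_type = 4, @freq_interval = 1, @active_start_time = 0
EXEC dbo.sp_add_jobserver @job_name = N'DBA - Recycle Error Log', @server_name = N'(local)'
GO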

 Now here’s the powershell script you’ll run.

# $Server and $StatsDB come from the larger solution this is cut from -- point them at the box and DB that hold your server list
$SourceSQLScript1 = "\\Server01\F$\SQLServerDBA\Scripts\Collector\SQLQueries\ServerListAllSQL2000.txt";

$SqlCmd1 = Invoke-Sqlcmd -ServerInstance $Server -Database $StatsDB -InputFile $SourceSQLScript1

$SqlCmd1 | % {
    [System.String]$ServerName = $_.ServerName;
    $ServerName;
    Invoke-Sqlcmd -ServerInstance $ServerName -Database "MSDB" -InputFile "\\Server01\F$\SQLServerDBA\Scripts\ServerConfig\RecycleSQLErrorLogJOB-SQL2K.sql" -SuppressProviderContextWarning
}

Now I’m going to explain what’s going on here and why:

1. $SourceSQLScript1 — This is the var that holds the location of the txt file where I store the query for the list of servers to run against. Inside, the file looks like this: select ServerID as InstanceID, ServerName from dbo.Servers (nolock) where IsSQL = 1 and SQLVersion = 2000. The reason I do it this way is that if I ever need to change the query, all of the scripts that call it will automatically pick up the change. So putting the query in a file is a good thing.

2. $SqlCmd1 — here’s where I’m actually running the query in the step above and setting the results to this var. Now I’ve got an array of ServerNames I can cycle through to run my script on.

3. [System.String]$ServerName = $_.ServerName; — Here I’m just setting the current cursor iteration of the ServerName to a var ($ServerName). It just makes it easier to read, and if something comes up where I need to make a minor adjustment to how the ServerName looks, then I can mess with the var and not have to touch the rest of the script. Again, reducing effects on the script if I need to make changes to the value. So it’s like later on down the line if I discover that I need to put [] around it or something like that. This could also be done in the query file, but it’s nice to have the option here as well should I need something special for a single script.

4. $ServerName; — Here I’m just printing out the ServerName so I can watch my script cycle through the boxes. It’s my way of knowing what’s going on. I hate that blinking cursor.

5. invoke-sqlcmd — This is where the real magic happens. I’m connecting to each box and running the job script I saved above.

Here are some finer points of the logic for why I did things the way I did:

1. I put the server list query results into a var. This is to allow you (or me) to get the server list any way we like w/o disturbing the rest of the script. You could have a hardcoded list of ServerNames, or have them in a file… whatever you like. Here’s a really good example.

Instead of this:
$SqlCmd1 = Invoke-Sqlcmd -ServerInstance $Server -Database $StatsDB -inputfile $SourceSQLScript1
You can have this:
$SqlCmd1 = "Server1", "Server2", "Server3", "Server4", "Server5"

 And the rest of the script stays pretty much the same because the cursor is run off of the var. It doesn’t care how that var gets populated. (The one tweak: with a plain list of strings like that there’s no ServerName column, so you’d set $ServerName = $_ instead of $_.ServerName.)

 2. I chose to run this separately for each version of SQL. So I ran it once for SQL2K, once for SQL2K5, etc. That’s because, like I said, there are differences in the job scripts for the different versions. I could have done it all in a single script, but I really don’t like coding that logic in there, and it increases the size of the script because now I have to check the version of SQL and make a decision before I can create the job so I can point it to the right .sql script. That’s too much work when all I have to do is change the query file and the script file between runs. And this way I can run them concurrently for each version simply by saving the script under different names and kicking them off at the same time.

 3. This one is my favorite because it’s what makes this so flexible. I chose to put the sql logic itself into a file and just call that file. The reason I did this is because not only is the sql logic too long for this type of script, but if I’ve got anything else I need to run against all my boxes, all I have to do is change the sql script I’m calling. I don’t have to make any changes at all to the rest of the script. So if I want to create or delete any specific users off of a group of boxes, all I have to do is put that code in a script and save it, and then point this script to it. Or anything else, right? This is really the bread and butter for powershell and multi-server mgmt.

So guys, take this boiler plate script and methodology and apply it to your own stuff. I’ve taken a lot of stuff out of the script for this blog because my scripts are part of a much larger solution I’ve developed and it’s not relevant to what you’ll be doing. But the script works as-is and you can add whatever you want to it. And now you’ve got a generic way to push T-SQL out to your boxes, and a generic way to populate the server list it goes against. What else do you need really?

Powershell wins again

It may seem a little redundant, but I love stuff like this. I was asked in front of a group of devs to script out a group of SPs on the prod box and copy them over to the new test box. These SPs stretch across a couple schemas and are named differently from the other ones in those schemas. As it turns out, there are something like 300 of them total. I don’t have a final count.

So when the guy asked me I said sure, that’ll take me like 60secs. And one of the other devs said, there’s no way. You have to check all of those boxes individually and make sure you don’t miss anything. I said, of course I can. I’m a powershell guy (yes, I actually said that). He was like, even if you could script something like that out, there’s no way to easily get all the ones you need. You’ll be much faster in the wizard.

I told him, I accept your challenge. And for the first time, I gave a dev rights in prod and we had a face-off right there. We sat side by side and both of us started working feverishly to get our SPs scripted. Him in the wizard and me in powershell. Very quickly a crowd gathered. We prob had like 15-20 people gather. These were the PMs, other devs, report writers, etc. They all wanted to see if the bigshot DBA MVP could be taken down by a lowly dev.

Unfortunately for him, like 90secs later I completed my script and was building my file with my scripted SPs. He was still slugging his way through the wizard and wasn’t even close to having all his little boxes checked. When I finished, I just stood up and walked out as everyone clapped. When I left to come back upstairs he was still working at the wizard, determined to at least finish.

At least that’s how my powershell hero stories always play out in my mind. I really did get that assignment today, but it was through email and I just did it without any pomp and circumstance. Oh well, a guy can dream, can’t he?

Here’s the code I wrote today to pull SPs by schema and matching a pattern. I used the regex in powershell to make this happen. Enjoy.

PS SQLSERVER:\SQL\Server1\DEFAULT\Databases\ProdDB\StoredProcedures> dir | ?{$_.schema -eq "Bay" -or $_.schema -match "EBM"} | ?{$_.Name -match "Bay?_PR"} | %{$_.Script() | out-file C:\SPs.txt -append; "GO" | out-file C:\SPs.txt -append}

Maybe someday one of you will actually save the day with it.

Powershell to Alter Database File Properties

Man, I still just get so jazzed when I work with powershell. Every time I need to do something long and hard (that was for you, Buck), I look in powershell and it lets me just lay back and enjoy the long and hard.

So this time we’ve got like 85 DB files for our main DB. I know that’s a lot but it’s been spread across so many LUNs for so long and it’s just grown and grown as it’s filled up the LUNs. Now we finally got a couple really large LUNs and we’re trying to consolidate the files. And now that we’ve moved most of them to a single LUN, we want to stop the autogrowth on most of them so they don’t grow anymore before we empty them and then drop them. Because what’s the point really in putting them all on a single LUN if you’re not going to get rid of some of them. So we definitely don’t want them to grow anymore before we can empty them.

The task at hand was to set all of these files (most of them really) to not autogrow. The choices were do it in the GUI, or write a T-SQL cursor to go through all the files and create an alter database command in dynamic SQL for each one. Neither of those is very attractive considering that there are so many files. Of the 2 of them though, the cursor is the more attractive solution. Then I thought of powershell. I hadn’t specifically done anything like this in powershell before, but I figured I’d be able to get some love. I’m not going to teach you powershell here, or explain the script too much. I’m just gonna give you the code and if you have any questions you’re free to write, but it’s pretty straightforward. I’ll be making a video on this tonight for those of you who like to see things in action.

PS SQLSERVER:\SQL\MYSQLBox\DEFAULT\Databases\MyDB\filegroups\data\files> gci | where{$_.Name -ne "MyDB_Data"} | %{$_.set_GrowthType("None"); $_.Alter();}

OK, so a quick glance at the important parts.

1. Make sure you’re in the right node. Notice I’m in data\files.
2. Here I’m filtering out the one file I don’t want to be affected. So the main data file I want to still be able to grow. I’ll talk about that more in a min.
3. Don’t forget the $_.Alter() at the end. Usually whenever an object has an Alter method it wants you to use it. If you don’t, it’ll change it in cache in your powershell session but you won’t see the change persisted to the DB. So you’re not really making the change to the object in SQL, just in your powershell pipeline. You can tell if something has an Alter method by doing this at the cmdline: ‘gci | gm’. If there’s an Alter method there, then use it.

OK, you don’t have to filter. If you find it too much trouble or if your files are named in such a way that it would be too hard to snipe a single file for some reason, then you can change them all and then just go back to the GUI to change the last one back to what it was.
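Or if you’d rather not touch the GUI at all, the straight T-SQL for a single file is short. The DB and logical file names here are just placeholders; FILEGROWTH = 0 turns autogrowth off, and any real size turns it back on:

-- turn autogrowth off for one file
ALTER DATABASE MyDB MODIFY FILE (NAME = N'MyDB_07', FILEGROWTH = 0)
-- or turn it back on for the one file you still want to grow
ALTER DATABASE MyDB MODIFY FILE (NAME = N'MyDB_Data', FILEGROWTH = 512MB)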

So there you go. We have a nice 1-line script to alter as many files as you need. And you can clearly tack as many file changes on there as you need. And the best part about it is that the code doesn’t really change if you need to do another file operation. All you do is change the method call. Ahhh, the power of powershell.

SSIS Email Chain

Today’s blog is actually an email chain between me and a user. It’s only a single question and reply, but I think it’s good info.

Q:
Hey, would you agree that a monthly load process is better served as an SSIS package – even if you have to push/pull from text files for now – than as a series of SPs or DLLs?

A:
if you’re staying on the same box for the load then SPs can be an attractive offer because they’re very fast and the memory stays in sql and can be managed quite well… if you move that process to ssis, and ssis is on the same box, then you have to allocate memory away from sql to run the pkg and house the data while in the buffer…

if ssis is on another box, but the data is still being moved to different dbs on the same box… so if the data is being moved from server1.db1 to server1.db2 and ssis is on server2, then you don’t have to fight sql for memory, but now you incur the network cost of moving the data from the box, and then back to it…

if you’re moving between boxes, then y, ssis is a better choice because in SPs you have to manage linked servers or openrowset to make that happen and that’s not cricket…

however, what makes ssis attractive in the single box scenario is that it handles errors easier and alerting is much richer and easier to come by… you can also more easily fix problems in the data itself and it’s easier to extend… so if your requirements change and you need to switch it to another box, or if you need to send a copy somewhere else, etc then that’s much easier in ssis… ssis also gives you parallelism that you can’t do in sps… you can load several tables at once in ssis where they have to be serialized in sps…

a good compromise in the single box scenario is to keep things moving like they are if possible, and where not stupid… so if they’ve already got an sp that inserts data into the 2nd db, then just call that sp from ssis… this way you get the insert and select process in sql where it belongs, and the workflow and error handling of ssis, and everything else that goes with it… if a single sp inserts several tables serially though, i’d create several sps and call them individually in ssis tasks… even if you have to keep them serialized (like for PK/FK issues)… because now you can even add things like checkpoints and individual error flows to your pkg steps and have a richer experience…

these things are meant to allow your process to grow and change much easier than is possible with an sp… so if one of the DBs gets too big and has to move to another box, it’s a simple matter in ssis… but in an sp, that means you have to create and manage a linked server, which is just one more thing and they come with their own problems…

as for the DLLs… these processes NEVER belong in a DLL… not only do you have to steal memory from sql, but it’s not optimized for data flow tasks… it’s just a plain DB connection… you can have parallelism though by calling the windows threading model, but they have to manage those threads manually and mistakes can be made… and that takes extra memory, etc… not only that, but the code isn’t available for admins to fix problems, and making changes to the compiled code can be dangerous, right… so no, that’s not a good idea…

does this answer… i realize you prob wanted more of a yes, do this, but it’s not that simple…

No More Select *

Ok, not ‘no more’, but you guys should seriously limit your usage of select * for everything.  I realize that it’s easier to support when you change the data requirement, but it pushes a lot more data than you need and that could really impact network and server performance. 

Let’s say you’re running a web app and you need 3 cols on your page.  And you pull all 37 because it’s just easier to type a * than each column.  That’s fine from your end, but you could seriously impact the server and the network, because if one of those cols is really wide, say varchar(200) (or even 400, right…), then you’re taking up that much extra bandwidth and server memory.  Sure, it probably won’t affect your session that much now, but when you’ve got 500 people on your site at the same time you’ll start to feel the pain for sure.
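Here’s the kind of thing I’m talking about.  The table and column names are made up, but the pattern is the same everywhere:

-- lazy: drags all 37 cols across the wire for every hit
SELECT * FROM dbo.Customers WHERE CustomerID = @CustomerID

-- better: only the 3 cols the page actually uses
SELECT FirstName, LastName, Email FROM dbo.Customers WHERE CustomerID = @CustomerID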

So just code for performance and stop being so lazy about having to type a few chars.  And if you really don’t like typing that much then get yourself a nice code completer like the one from Red-Gate and you won’t have to type nearly as much.  But I’m getting sick of laziness being an excuse for bad coding.

There are some exceptions though.  It is ok to use select * for some things but you have to choose those individually and judiciously.  Let’s say that you’ve got an SP that pulls 65 cols from a complicated set of logic that you don’t wanna have to re-create or maintain separately.  And let’s also say that you only need to query it every now and then, or maybe just 2-3 times a day.  In a case like that, it’s probably ok to go ahead and use the SP even though you’re only using a handful of the resultset.  However, in that same scenario, if you were using that data several times a minute, or even a second, then you’re really better off from a performance perspective to go ahead and create your own SP that returns less data.

Another excuse that gets used a lot is that select * makes it easier to make changes to the app.  And that is logical to a degree.  But people protect themselves all the time from issues that aren’t issues.  For instance, I had this just a while back where someone gave me that excuse, and when I probed, the app had been up for 2yrs and had only ever had one minor change.  So what are you protecting yourself from then?  If the app is fairly static, then grow up and do the right thing.

OK, that’s actually stepping on the toes of another post so I’ll stop here.

Watch my free SQL Server Tutorials at:
http://MidnightDBA.ITBookworm.com

Read my book reviews at:
www.ITBookworm.com

Blog Author of:
Database Underground – http://www.infoworld.com/blogs/sean-mccown

Follow my Twitter:

http://twitter.com/MidnightDBA

 
