Tag Archives: troubleshooting

A Nice SQL Virus

I’m being forced off of my powershell kick here for a few mins to talk about viruses.

Why do companies refuse to allow scanning exceptions for DB files?  Do you really think anyone’s written a SQL virus?  I really can’t remember ever having seen an actual virus that attacks SQL files.  Actually, I take that back.  There is one virus that attacks DB files and drags performance so far down you can barely use the system.  In fact, in my performance realm I classify this virus as an official DOS attack against my DB.  It’s called virus scan.  Virus scans are the only thing I’ve seen that specifically targets DB servers to bring their performance down.

So if any of you non-SQL admins out there are running AV on your servers w/o exception policies for your DB files, turn yourselves in because you’re officially launching a DOS attack against your own servers.

But I promise I’m not bitter.

Why you have to be on top of your game

I just heard from a DBA at my last gig who was keeping me in the loop about things I had done while I was there.
He said that they had to take away the two 1TB LUNs I had them put on our DB server because they were causing performance problems. The SAN guys had warned us that LUNs that big would hurt us, and I guess they were right. Well, not really. This was the report I got back from the DBA, so these aren’t my words. When I grilled him about it, he admitted that instead of giving us several drives in an array, they had backed each of those LUNs with a single 1TB spindle.
So let me get this straight… I asked for 1TB LUNs, and instead of doing it right you gave me a single drive each and then used that as justification for showing me you know more about disks than I do? I find that incredible.

So guys, yeah… you really have to be on top of your game and know your stuff so that when they do stuff like this you can call them on it. And I must admit that I’m not sure it would have crossed my mind to even check that it was a single spindle. Why would that even cross my mind? These are enterprise SAN guys… they know their way around and should definitely know better. What I didn’t realize was that they were into playing games. It just goes to show that you try so hard to be effective in some places and they just won’t let you.

Data Explosion

Hey everybody… it’s been a while since I’ve blogged, but that’s what starting a new gig will do to you. I’ve had so many new things to rant about here and I’m sure I’ll be getting to those in the future, but right now I want to share a funny story with you.

We had a group of devs who were trying to run a fairly simple insert and it kept filling up the disk. And when I say it was filling up the disk, I mean 50K rows were filling up something like 200GB. So they came to me to see if I could fix it. This is where it gets fun.

It really didn’t take me too long in my investigation to find that they had the data file set to autogrow by 131,000%. That’s right, I said it! 131,000 percent. Once I found the problem I was able to set it to something more reasonable, shrink the file, and let them get on with their insert.

So it started me thinking about what other DBs in the place had similar issues because I’ve been hearing about disk space problems here and there. So I wrote a powershell to go out and check the file growth rates for every SQL box and I found a lot of the same type of stuff. There were a lot of them set for several thousand percent growth, several of them set for 1MB growth, and everything in between. In general, as a good generic setting to start from, I like to grow my files 1GB at a time. It’s a good round number that works in a lot of cases and then you can snipe the ones that need something else. And then I altered the powershell to go out and change the growth rates of all the DB files out there to 1GB. Life is good again and we have the beginnings of a happy shiny environment.
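I’m not posting the powershell itself here, but the T-SQL that kind of script ends up issuing isn’t anything fancy. Here’s the general idea; the database and file names below are made up for the example:

-- See how every file on the instance is set to grow
-- (growth is in 8KB pages when is_percent_growth = 0)
select db_name(database_id) as DBName, name as FileName, is_percent_growth, growth
from sys.master_files
order by DBName;

-- Then set the offenders to grow 1GB at a time
alter database MyDB
modify file (name = N'MyDB_Data', filegrowth = 1GB);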

Oh yeah, and in the same discovery I also found 110 DBs set to autoshrink, and I took care of those too.
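If you want to sniff those out in your own shop, it’s one line to find them and one line apiece to fix them (again, the database name is just an example):

-- Find every DB with autoshrink turned on
select name from sys.databases where is_auto_shrink_on = 1;

-- And turn it off
alter database MyDB set auto_shrink off;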

Powershell 2.0 Get-Member Enhancements

The other day I blogged on a cool way to alter database file properties.  And I promised I would film the solution for you.  Well, I got home that night and started filming but when I pulled up sqlps I was stopped dead in my tracks by something I couldn’t explain.  The methods to get and set were missing.  I checked my other box and they were there, so what happened?  The only difference was that the box I wrote the blog from was XP and the box I was filming from was Win7.

So I pinged a couple of really smart guys at MS (and I cc’d Buck).  After a couple of days we have an answer.  There’s a new feature in powershell 2.0 that hides the getter and setter methods from you by default in get-member.  They’re officially calling it an enhancement.

There are 2 new parameters that get-member supports that allow you to control what you see.  You can now use -View and -Force.

-View takes the values Extended, Adapted, Base, and All.

-Force is a switch, so it doesn’t take a value of its own.

However, if you’re looking specifically for getter and setter methods, you’ll have to use -Force because -View doesn’t display them.  Here’s what that’ll look like:

dir | gm -Force

Now you can see everything.  What’s interesting though is that by default, PS loads all the methods; it just doesn’t show them to you.

Here’s the MSDN blog that explains this and so much more.

And I’d like to give a special thanks to Robert Hutchinson at MS for chasing this down for me while Buck and I sat on the sidelines helpless.

The Zombie Standard

As I promised yesterday I’m gonna talk about process and coding standards today and I want you to really think about what I’m saying here because it’s what separates the men from the boys.

This topic won’t ramble on quite as much as the past 2 days have though, so don’t worry.  Basically what I want to say is that you should definitely be creating coding standards and following them, but don’t rest there and don’t allow any of your standards to rest either.  What I mean by that is just because you come up with a standard, say using ‘insert into’ instead of ‘select into’ during your ETL loads, don’t always blindly follow that standard.  Every process is unique and every process deserves consideration.  You can adversely affect your servers if you do the same thing across the board without ever questioning whether it’s the right tool for this particular job.  So where a ‘select into’ may be the perfect solution for one really huge snapshot load, an ‘insert into’ may be the right solution for other lesser loads.
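Just so we’re all picturing the same thing, here’s the difference between the two in its simplest form (the table and column names are invented for the example):

-- 'select into' creates the target table on the fly every time it runs
select CustomerID, OrderDate, Amount
into dbo.DailyOrders
from dbo.OrdersStage;

-- 'insert into' loads a permanent table you've already created and control
insert into dbo.DailyOrders (CustomerID, OrderDate, Amount)
select CustomerID, OrderDate, Amount
from dbo.OrdersStage;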

One argument I hear a lot when preaching this bit of DB wisdom is that you want to have a solid way of doing things and you want to standardize.  And frankly, that’s just ridiculous.  That argument doesn’t hold up to even the most basic scrutiny.  Ok, maybe ridiculous isn’t the word… let’s go with asinine instead.  So you’re telling me that for this big snapshot load you should use the exact same process as you do for your smaller ones because standardization is the key?  Then why aren’t all of your loads full snapshots?  Shouldn’t you be standardizing on a single load process?  And what about your indexes?  Do you have the exact same fill factor for every single index in your place?  If so you’re not paying attention to your index stats.  And are you using #table instead of @table absolutely everywhere?  Or are there cases where one is better than the other? 

So you see, with just a few examples I’ve shown you that you don’t do EVERYTHING the same way in your DBs.  So there’s no reason for you to scream standardization with something as important as a data load.  Take each load on an individual basis and decide what the best course of action is for each one.  And yeah, as the data grows or changes in nature you may end up revisiting your process and updating it to reflect the new nature of your objects.  And that’s OK.  So many people think that having to rewrite a process is a huge sin and that you’ve failed somehow.  It really isn’t, you know.  Not only are these types of things a learning process, but things also change in ways that require you to change tactics.

I really hope you take what I’m saying here and apply it.  And don’t be one of those idiots who tells me that he runs everything serially because he wrote the process on a single-CPU box and therefore he needs to standardize on that platform instead of taking advantage of a multi-CPU machine.

Watch my free SQL Server Tutorials at:
http://MidnightDBA.ITBookworm.com

Read my book reviews at:
www.ITBookworm.com

Blog Author of:
Database Underground – http://www.infoworld.com/blogs/sean-mccown

Follow my Twitter:
http://twitter.com/MidnightDBA

The Performance Dilemma

OK, so yesterday I was talking about how servers can take a long time to finally be brought to their knees with poor processes, and that you want to get to the point where your server is on a strict diet and you control every morsel that goes into it.

Today I want to talk about how that can play into a real performance strategy because unfortunately you have more to consider than just raw performance.  I know, I know, it sounded yesterday like I was advocating that performance was the most important thing.  And it is… but at the same time it isn’t.  So let me explain a little.

Performance is extremely important and some might even say it’s the most important thing on the server, but you also have to consider manageability and everything that goes along with it.

See, quite often the code that squeezes out the absolute best performance on the server isn’t as easy to manage as other solutions.  Something like this isn’t very concrete without an example, so I’ll give you one.

Let’s look at what I call moving schema.  The guys in my shop are sick to death of hearing that term, but it’s an important one.  Moving schema is where you drop and recreate objects all the time that should be permanent.  A good example of this is doing ‘select into’ instead of ‘insert into’.  Sure, the ‘select into’ performs better than the ‘insert into’, but it’s also harder to manage from a few different perspectives.

1.  Even during your load you can’t specify a filegroup for the object, so you’re stuck with the default.  For performance or growth reasons you may find it better to put it on a different set of disks, but you can’t because ‘select into’ doesn’t allow it.  So from a space management aspect ‘select into’ traps you in a corner.

2.  If you have specific permissions granted on the table, you have to keep track of them somewhere and reapply them every day when you recreate the table.  And you also have to remember to update that permission repository every time you change the perms on the table during the course of your other admin functions.

3.  Most performance stats are gathered by objectID, and if you’re recreating the object every day then today’s objectIDs are invalid tomorrow.  So you either have to keep track of what the objectID is every day so you can track performance and other issues (like fragmentation) from day to day, or you have to store the objectName in your repository as well, which leads to a larger, less efficient schema for tracking production issues.

4.  It also makes the timing of other processes more critical, because with ‘insert into’ they’ll be blocked until the current load finishes, but with ‘select into’ they’ll fail because the object doesn’t even exist.
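To put a quick picture to all that, here’s what the stable-schema version of a load like this looks like.  The table, filegroup, role, and column names are invented for the example, but notice that the table, its filegroup placement, its perms, and its objectID only ever get created once:

-- One-time setup: you pick the filegroup and the perms stick
create table dbo.DailyOrders
(
    CustomerID int,
    OrderDate  datetime,
    Amount     money
) on ReportingFG;

grant select on dbo.DailyOrders to ReportingRole;

-- Nightly load: the object never goes away, so nothing downstream breaks
truncate table dbo.DailyOrders;

insert into dbo.DailyOrders (CustomerID, OrderDate, Amount)
select CustomerID, OrderDate, Amount
from dbo.OrdersStage;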

So ok, I’ve shown you a few reasons why something as simple as moving schema can be a problem.  And like I said, it’s really one of my diatribes mainly because of those issues, and a couple outlying ones.  And the guys at my shop have been hearing this battle cry for about 3yrs now and I’m sure they’re quite tired of it.  But when it comes right down to it, it’s good DB practice.

There are some instances where the moving schema rule can be broken though.  Let’s look at a really large data load for instance.  If for some reason you can’t do any kind of incremental load and you have to do a full snapshot every night (those instances are becoming fewer and fewer with Katmai) on a HUGE table (several hundred million rows), then you may be better off doing a ‘select into’ because you don’t incur the logging and it can decrease your load time significantly.  It can of course also help control how big your log grows.  So this is a decision you have to make at the time, right?

However, I will say that quite often, if you take the size of the log out of the picture, the ‘select into’ doesn’t out-perform the ‘insert into’ by all that much.  I recently tested the 2 methods against 30mill rows and the ‘insert into’ only took about 40secs longer.  That’s really not enough to worry about in the course of a full ETL load.  And for all the extra benefits you get from having a stable object, that 40secs is more than worth it.  So this would be a case where choosing the absolute best performing process wouldn’t be the best way to go.  In the case above where I talked about loading several hundred million rows, the ‘insert into’ may increase the load time by as much as 30mins, and that’s hard to ignore.  So depending on your requirements you may decide that dropping the object every day is worth it.  But in the instance where the difference is only a handful of minutes, you’d typically choose to keep your schema intact and gain the other benefits.

So OK, you’ve got a process and you’ve decided to drop the schema every day to do your load.  And your process performs as well as it possibly could because of the lack of logging.  The problem is that people tend to think of performance in terms of the single piece they’re writing and nothing else.  Performance includes more than just the load.  Performance also includes up-time, and number of errors, and manageability, etc.  The load is just one part of the process for any given object.  Examples of different parts of an object’s process would be the load process, the reporting process, the troubleshooting process, the triage process, the security process, the space management process, etc.  I’m sure there are others, but you guys get the point.  All of these are part of the process of managing this object, and if they don’t all play well then the process for that object doesn’t work well.  So be careful about throwing all the other processes away and making them far more complicated just to appease a single piece.

So I’ve been in several shops and it’s always different everywhere.  Some shops are really big into management and others are really big into performance.  I’ve been in shops where the management of the objects was just a pain because the systems really were so busy that every last CPU cycle was counted and every last I/O had to be justified.  So they were willing to put up with the management headache of that kind of performance cost because they needed it to stay alive.  And again you have to decide what’s best for your place and your processes.  But whatever you decide, don’t rest on that decision.  Keep making it again and again.  I’ll talk about that tomorrow.

I guess that’s all I’ve got to say for now.


The Silent DBA

Be careful when you’re complaining about how much advice you get from your DBA because one day he might stop giving it.  And that’s not really a scenario anyone wants because it means he’s stopped caring about the systems and you won’t get any real help.  Because I don’t care what you devs out there think, you need your DBAs to keep your code playing nicely with others.

So when your DBA stops putting in his 2-cents your prayers haven’t been answered… in fact, just the opposite.  You’ve taken your DB support staff and turned him into a wallflower.

So now my question to you is what did you do to shut him up?  Because I guarantee you it was your fault.  Are you constantly going out of your way to circumvent procedures he puts in place?  Are you constantly disregarding his advice?  Do you refuse to set up things in a way that will make it easier for everyone involved to troubleshoot?  Do you refuse to follow any coding guidelines he gives you to keep things going smoothly?

OK, so I realize that even though your code runs in double the time it could and it always deadlocks, you’re the smartest dev out there and you don’t need advice from any of those dumb DBAs, but try shutting up and listening sometimes.


Decimal(18,2)?!?

I just reviewed a script that had 3 major problems.

1.  It created a #table and then immediately truncated it.

2.  It passed a var into the sp and then hardcoded the value a few lines later.  The passed-in value will never be read.

3.  It did the following operation.

Declare @var decimal(18,2)

Set @var = (select count(*) from [table])

Really?  Come on, people, let’s think about what we’re doing.
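For the record, if all you need is a row count in a variable, this is all it takes (same placeholder table name as above):

Declare @var int

Select @var = count(*) from [table]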

 


Murphy’s Law of Vacation

We’ve got this situation at work where we’ve just run across this very well-known law. 

The law states that that code you just put into production will always break right before you go on vacation.

The law also states that if the only resource for an application goes on vacation, no matter how long the app has been running without issue, it will break as soon as he leaves, and you’ll have nobody there with the knowledge to fix it.

Help us all.



The Runaway Instance

I recently installed the latest Katmai CTP on my workstation at home. I installed the entire package. I connected to it just fine and worked with it for several days. I then went to hit it with an external client so I could load some data, but it wouldn’t connect. I checked and it was set for remote connections, and it was listening on 1433, so that wasn’t the problem. I figured there might be some kind of glitch, so I burped the service from the Katmai Config Mgr. The service failed to restart. The message I got was that 1433 was already in use. OK, at least I have a message to troubleshoot now. Well, I’ve got Pinnacle running on that box and it uses SQL Express. So since I installed Katmai as a default instance as well, it was clear that Pinnacle was stepping on it… even though Yukon was on the box before and it never had issues with Pinnacle.

Anyway though… I stopped the express service and I still couldn’t connect. So to make a long story shorter, because I spent a very long time on this, it turns out that the Katmai Config Mgr was falsely reporting the express instance as stopped. When I went into the Windows SCM it was still running, so I stopped it from there and all’s well.

At least I got some good troubleshooting practice in, huh?