Category Archives: Admin

Is teamwork really that rare?

Well, despite the saga today that you can read about in my other 2 posts (Why can’t voodoo be real? and The stupid have fallen), the day ended fairly well in my last meeting.  I’ll give you just a snippet of the backstory before getting into the meeting.

We do a lot of server builds.  And quite a few of them are clusters.  So the problem is that we’re expecting to be given a server in a certain state and we don’t always get it.  So sometimes we have to troubleshoot something ourselves to get the server to look like we’re expecting.  And of course other times it’s just what it should be.  So we had a meeting today with the DBAs, server guys, and SAN dudes and all of our directors to try to sort this out.

That said, I’ll just give you the end result so I can get on to the rest of the post.  What we decided to do is next time we have a cluster build we’re all going to sit in the same room and do our thing and we’re going to negotiate what each hand-off is going to look like.  So the server team is going to build a checklist based off of what we agree on and every server they provide us will look exactly like that.  And we’ll know exactly what’s expected of us, and so will the SAN dudes.  Personally I can’t wait, because that’s such an excellently low-tech way to solve a problem.

Now, this isn’t exactly a rant, but kinda.  Is that level of teamwork really that rare?  I can tell you that in my experience is certainly is.  In almost every company I’ve been in the different groups have been at such odds they could barely communicate.  They honestly forget that they’re all on the same team working towards a common goal.  My last job was the worst about that.  The ETL team manager made sure his team was at odds with everyone and they kept everything to themselves and never even discussed issues in a friendly manner.  It was very acrimonious and tense whenever the different groups would get together for meetings.  And I’ve been in several shops that distrusted each other like that. 

In fact, I actually sat in that meeting today and said out loud what a strange feeling it was to actualy work openly with another team like that.  And it really is.  Everything my team does is an open book.  I’ve gone out of my way to make sure my team doesn’t hide anything from anybody because we’ve got nothing to hide and we’ll make more friends if they know why we’re doing certain things.  So our reasons for doing something and what we’ve done are always an open book… even our mistakes.  We admit to them and tell the customer what we’re doing to fix it and how long it will take.  And I’m not saying that I’m the reason these other teams are playing nicely, but I’m certainly fitting into this portion of the environment for sure.  And I can’t say that all teams play this well together, but I know a lot of them do because I’ve witnessed it. 

I’ve actually been saying it for years… Let’s pretend we’re all in the same company!

The stupid have fallen

This is a followup to my previous post here.

Wow, what a day.  Ok, so I just got off the phone with the tech lead for that company.  It was enlightening to say the least.  It was really a longer call than I expected or that I even wanted, but my blood pressure’s gone down a bit now.  I obviously can’t recount the entire conversation for you, but here are the highlights:

V = Vendor

M = Me

V: Hi Sean, this is Josh(not his real name) with X and I was told you wanted to talk to me.

M: No, the last thing in the world I wanna do is talk to you, but I feel in order to get our issue resolved I have no choice.

V: Yeah, Cornhole(not his real name) told me you got upset with him when he was just trying to explain to you what your issue was and then you insisted on speaking with me.

M: This is just incredible.  No I didn’t get upset with him for explaining the issue to me.  I got upset with him when I told him his explanation was flatout wrong and then he called me an idiot.  He didn’t even hint at it.  He called me an idiot outright.

V: Really?  Cause that’s not the story I got at all.  I heard that he was calmly explaining that your backups were the issue and you just exploded on him.

M: No, that’s NOT what happened at all.  That little turd actually called me stupid for not buying his assinine explanation for our issue and then I only exploded after he did that a couple times.

V:  Well what don’t you believe about it then?  Because we’ve seen that many times where backups will cause proceses to block.

M: No you haven’t.  Listen, I realize that everyone can’t spend their days studying SQL, but one simple fact is that backups don’t block processes.  Period.  I’ve been doing nothing but DBs in SQL Server for over 15yrs against TBs of data and nowhere have I ever seen a backup block a process.  The problem is that you’ve got like 3 tables with TABLOCK in that sp and that’s what causing your blocking.  Don’t you think it’s funny that it only clears up AFTER you kill that process?

V: Where did you get the code?  Those sps are encrypted.

M: Encryption only stops amateurs.  And while we’re at it, what’s this script I hear about that you’ve got to unconfuse SQL to give you the real spid for the process?

V: Where did you hear that backups don’t block processes?  It’s well-known here that it’s one of our biggest problems.

M: No, your biggest problem is your TABLOCKS.  And I heard it from the truth. 

V: What is a tablock?

M: A table-level lock.  You’re locking the entire table just to read data.

V: I don’t know about that, I’ve never seen the code because it’s encrypted.

M: Well I do and that’s your problem.  Now what about this script I kept hearing about from Cornhole?

M: Did you get my email?

V: Yeah.

M: And you can see the place in the code where you’re locking the whole table?

V: Yeah.

M: So how is it that you lock the entire table and you claim that it’s the backup blocking everything?  Haven’t you noticed that the backup only decides to block stuff when this process is running?

V: That does sound a little suspicious when you put it that way.  What can we do?

M: Let me take those TABLOCKs out of there so I can query these tables without locking everything up.

V: I just saw your email signature and I’ve never heard of an MVP before, but I just looked it up.  Are you really one of those guys?

M: Yeah.

V: Wow, so I guess you really know what you’re doing then.

M: I’ve hung a few DBs in my day, sure.

V: Do you mind if I write you again if I have any questions?

M: I suppose that depends on whether you’ll let me change that stupid sp or not.

V: Yeah, go ahead.  I’ll have to have a talk with our devs tomorrow and try to convince them to put this in our next build.  Honestly, I’m only a tech lead here because I’ve been here a few months longer than anyone else on the support team.  We get all this stuff from the devs and they won’t tell us anything.

M: So what about this script I keep hearing about.  The one that tells you how to unconfuse SQL and give you the right spid for a process?

V: That’s a script the devs have and they won’t show it to me either.  I just tell my guys something else so they won’t know I haven’t seen it.

M: Wow, that sucks man.  You do know though that a script like that doesn’t exist, right?  It’s completely bullshit.

V: Yeah, I hate this place.  I’ve been suspecting for a while now they they were lying to me, but what can I do?  I need this job.  I’m just lucky to be working.  I have no idea what I’m doing here.

M: I don’t know what to say to that.  You’re right, you are lucky to be working these days.  But pick up a couple books on your own and learn SQL.  Don’t rely on them.  There are websites, blogs, etc.  I myself run a site.

So in short, I really started to feel for this guy.  He ended the call apologizing 100x for the grief they’ve caused me and he’s going to go out right away and start watching my vids and trying to learn more.  He asked if he could write me with questions, and I said, of course dude, anytime.  That’s what I do.

And y, he sounded about 22 as well.  Turns out I was right.  I asked and he was 21 and this was only his 2nd job and they brought him in from nothing and “trained” him.  I’d say he’s got some work to do.  But I personally consider this exchange a success.

Why can’t voodoo be real?

I had a very interesting talk with a vendor this morning.  Ok, it wasn’t really interesting as much as maddening.  I swear, it’s guys like this that are going to be the death of me.  I know I shouldn’t let them get to me but sometimes they do.  You know how it goes, you try to explain something and it just escalates until you’re upset.  Anyway…

OK, so they were having blocking issues in the DB.  The vendor was trying to convince the app team to turn off the SQL backups because they were interfering with their processes.  That’s when they called me.  Frankly, I can’t believe they called because the vendor was very adamant and this is the kind of thing our app groups do on their own without consulting us at all, so I’m more shocked than anything.  But I knew this was going to be a bad call the second my app guy told me what was going on.

Here’s the conversation as it took place just a couple hrs ago.  It took me this long to regain my strength.

V = vendor

M = me

M: Ok, I’m here now.  So explain to me what’s going on.

V: Well, we’re having severe blocking issues in the DB and after looking at it we noticed it’s the SQL backups causing it.  So what we need to do is stop the SQL backups.

M: Backups don’t hold locks in the DB so there’s no way for it to be causing blocking.

V: Of course backups hold locks.  Everything holds locks.

M: No, they don’t hold locks.  There’s never been a case where a backup has blocked a process.

V: We see it all the time.  And you’ll just have to trust me because I do this for a living.  I’m very familiar with this app and this is how we always fix this issue.

M: You always fix your poor coding issues by having people not backup their data?

V: You can backup your data, but you have to schedule downtime every week to do it… to keep it from blocking.

M: I really don’t want to finish this conversation, but I’m developing this morbid curiosity about where this is leading.

V: What does that mean?

M: Well, you clearly know nothing about SQL and you’re doing your customers such a huge disservice by making them turn off their backups when all you have to do is fix your query.

V: What do you mean I know nothing about SQL, you’re the one who thinks backups don’t cause blocks when I see that all the time.

M: No, what you see all the time is your process causing blocks and people running backups.  Because I’m on your server now and I can see the blocking you’re referring to and it has nothing to do with the log backup, it has to do with your process on spid 201.

V: I thought you were a DBA.  When there’s severe blocking in a system SQL Server sometimes reports the wrong spid.  Our tech lead assures us that this is coming from the backup.

M: Well, if you’re just going to insult me we can end this call right now because I really don’t have time for this.  I can’t teach every moron who writes an app for SQL everything they need to know to support it.

V: Now who’s being insulting?

M: Oh, that would still be you.  Because you know nothing about SQL and you hop on the phone and start calling me an idiot because you don’t know how to troubleshoot a simple blocked process.  Since we’ve been on the phone, I’ve looked at the sp that’s causing this and the problem is simple; you’ve got a hardcoded TABLOCK in 2 of your queries.  That’s what causing the blocking.  Have you ever bothered looking at this process to see what it’s doing?

V:  No, because it’s the backup causing it.  There’s nothing wrong with the code.  We run this at several customer sites and nobody ever complains.

M: Ok, let me try something different here… if SQL’s reporting a different spid for the blocking process, how does your tech lead know that this incorrect spid points back to the backup?  Why couldn’t it be pointing to one of the many other processes running right now?

V: He’s got a script that can tie them together.

M: I would love to see that script.  Maybe I could learn something, huh?  But if SQL itself gets confused about the spid and doesn’t report it right in sysprocesses, then how can he tie it to the right process in his script?  Wouldn’t SQL tie it to the right spid and present it to us the right way to begin with?

V: He won’t show anyone the script because he says it’s too advanced and we don’t need to know how to do those types of things. 

M: Wow, that’s convenient isn’t it?  Let me try something different here.  Because now this is like a train wreck I just can’t turn away from.  When you kill the backups at other client sites, does the process clear up right away?

V: No, unfortunately, the database can’t recover from something that’s been blocked by a backup so we have to kill our process to and then things work fine.

M: My, you’ve just built your own little world in there haven’t you?  You’ve got excuses for everything.

V:  They’re not excuses.  It’s how databases work.  Where did they find you anyway?  Have you ever been to any SQL Server classes? Maybe they should send you to some.

M: No, I’ve never been to any SQL classes.  I’ve never needed them.  And if you call me an idiot again you’re gonna find out what else I can do.  In fact, I’m a hair away from calling the CEO of your company and letting him know what kind of stupid, snide little ass he’s got working for him.  So you’d better watch your tone with me boy and remember I”M the customer.  So you don’t tell me what to do and you don’t get on the phone with me and call me an idiot.  And here’s something else for you.  Before I even got on this call I ripped your access from the box because I don’t want you touching anything on our server.  Nobody’s turning off our backups, and nobody’s ever touching our server.  And I’m actually done with you.  I’m not going to troubleshoot this with you one more second.  Here’s what I want.  I want you to escalate this to your team lead and I’ll fight this out with him.  I want a call with him at the next available opportunity.  You are out of this issue.

V: Look, I’m sorry if you thought I was being

S: Shut up.  I’m done with you.  The next call I get had better be from your team lead.  Good-bye.

I’m telling you guys, if voodoo were real I’d be sticking pins in this guy’s ass for the next month.  And the sad thing is I doubt it’s his fault.  He’s got some jackass yanking his chain and this is probably just some kid who they’ve taught to be this way.  He sounded like he was probably in his early 20’s.  I still haven’t heard from his tech lead, btw.

DAS isn’t a curse word

I deal with this from time to time where users (and admins too unfortunately) will not only shy away from DAS, but actually act like it’s some kind of computer virus.  Now that SAN is here and is so ubiquitous nobody thinks DAS is worth anything anymore.  I’m fully aware of all the arguments, but remember DAS gave us some really nice control over our disk layouts and we didn’t have to keep going to the SAN guys to ask them for dedicated spindles, which is something they hate to give.  And that’s really something to be said for a DAS isn’t it?  We don’t have to worry about our DAS spindles being shared with 10 other servers. 

So whenever someone talks about DAS vs SAN, I always try to insert some reason into the debate.  SAN is good for what it’s good at, but so is DAS.  And there are often times I miss the DAS days.  I never had to fight with a SAN guy about whether the SAN can handle all of the disks being on the same physical partition.  I never had to try to explain to them that a DB isn’t a file system and you can’t just put everything on the same chunk of disks with 10 other servers.  I never had to get special permission to have a different RAID level put on the server.  But don’t get me wrong, I’m not anti SAN.  I just don’t think people give DAS its due credit anymore.  For a lot of systems in your shop, a couple DAS cabinets on each server can ease some of your I/O trouble.  And of course, that’s at the cost of management for the disk guys, but everything comes at a cost.

Dealing with superstition

To be a DBA for any real length of time takes a certain type of personality.  For starters it helps if you’re fairly strong-willed and confident.  In order to deal with a lot of the customers you find yourself coming in contact with it really helps if you can stand your ground.

This is the exact type of guy that we’ve had as the past 3 DBAs before I got here.  The problem is that they were apparently pretty dumb so they spread so much misinformation about SQL and about DBs in general it’s a constant battle for me to undo it all.  Like I said, these guys were dumb, not malicious so I really don’t think they meant to  spread that much bad information… I think they just didn’t know how to pick up a book.

So this situation causes a special problem for me because I’m trying really hard to do things right here but a lot of my customers and vendors just won’t let me.  “Well, the last guy told us this is how it works”.  Yeah, I bet he did, but that’s not right.  And I can tell them it’s not right 100x, but they’ve heard the same tired story for years now so they’re not likely to believe me even after I prove it to them.  And we all know that the guy who says it the loudest is typically believed.  And I’ve heard that our previous DBAs were all quite passionate that things be done a certain way and they really went to bat for what they wanted.  What’s funny is they went to bat for ridiculous things, and then let a lot of really big things slide… but that’s another post isn’t it?

The only thing I can hope for at this point is to stick around longer than the other DBAs did and eventually my rhetoric will start to sink its way into their heads.  I’m thinking one way to do it is to just flatout lie.  Just tell them, yeah, that’s the way it used to be but on this new version they’ve fixed all that.  Something tells me that’ll work out much better than just telling them the last guys were idiots.  Because if I keep doing that, then I might as well be trying to talk the Vatican out of religion.

Are you kidding me?

This one has me going so much I had to blog for a 3rd time today.  I keep finding boxes that have failing backups and other processes.  These jobs have been stopped for weeks in some cases.  And why you ask?  Hah, that’s a good question.  The reason why they keep failing my most learned internet peeps is because the previous DBA was running them under his own personal acct.  He had scads of jobs and other processes running under his acct… as did the DBA before him and the DBA before him. 

Are you people kidding me with this crap?  Ok, for those of you who don’t know enough to be sure why I’m upset, you never run anything under your own acct… or anyone else’s acct for that matter.  Even if you don’t leave this gig, you’ll still be forced to change your password at some point and there’s no way you can keep up with every job, service, ssis pkg, script, website, credential, etc that you’ve got running under your acct.  And then you’ve gotta get everyone else involved to help you find it because you keep getting locked out every 5mins.

When IT was young you could excuse these things because we were still figuring things out then, but there’s just no excuse for this level of idiocy anymore.  Come on guys, straighten up.

Why won’t the log shrink?

I know that everybody always says you shouldn’t shrink your log files, but lets face it, sometimes it really is necessary.  Maybe you had a big op that was out of the norm and it grew your log bigger than it usually runs, or maybe (and more common) your backups were failing for a few days before you noticed and the log is now huge and you need to get it back down to a more manageable size.  For whatever reason, you need to shrink your log.  Now, I’m gonna show you the T-SQL way because it’s what I’ve got handy and I’ve used this script for years.

So here’s the script and then we’ll discuss a couple things… because the script isn’t the meat of this post.  But this one will do all DBs on the server, so just use the logic if you don’t like the cursor.  And it has no effect on DBs that can’t have their logs shrunk so don’t worry about causing damage.

Declare @curDBName sysname
OK, so that’s fairly standard admin stuff.  Now, If you wanna check that you’ve had the effect you expected, you can run this command both before and after to make sure you’ve shrunk it.

Declare DBName Cursor For

Select Name from sysDatabases

Open DBName

Fetch Next From DBName into @curDBName

while @@Fetch_Status = 0

Begin

DBCC ShrinkDataBase (@curDBName , TRUNCATEONLY)

Fetch Next From DBName into @curDBName

END

Close DBName

DeAllocate DBName

DBCC SQLPERF(LOGSPACE)

And what you’re going to get from that is a result set that gives you the DBName, the log file size, and the % used.  What you want to look at is the log file size and the %Used.  If the size of the file is really big and the %Used is really small, that means your log file is a lot bigger than it needs to be.  So an example would be you’ve got a log file that’s 52000MB and 3% used.  That’s a lot of empty space.  So you’ve definitely got to use the above cmd to shrink it. 

But what happens if you shrink it with that cmd and it doesn’t shrink?  Or maybe it just shrinks a little bit?  Well, now you’ve got to troubleshoot why it’s not shrinking.  This is one of my standard interview questions and you’d be surprised how many people get it wrong.  As a matter of fact, I’ve never had anybody get it right.  So here’s what you do… pick up a book and become a DBA.  Ok, seriously though, it’s extremely easy.

Run this query in master to see what your log may be waiting on to truncate.

SELECT Name, log_reuse_wait_desc FROM sys.databases

What you’ll get here is a result that tells you what the log is waiting for before it can reuse the VLFs, and in this case, kill them so you can shrink the file.  There are any number of things that can show up here as the reason.  You could be waiting on a long transaction, or on a log backup, or on replication, etc.  So next time you have a problem getting a log to shrink, come here first and find out what it’s waiting for before you go guessing what the problem might be.

That’s all I’ve got.

[EDIT]
Ok, I got a comment below that says BOL states that TRUNCATEONLY only works for data files. I checked and he’s right, BOL does say that. However, I’ve been using this method for years and I just ran a test on it individually using the code I pasted out of this post so BOL is wrong on this one. I ran my test on SQL2K8 but I don’t see any reason why it would change significantly with a lower version.

Audit service accounts with Powershell

I thought I’d kick off the week with a useful little PS script I wrote last week.  It happens from time to time where you need to find out what AD accts your SQL boxes are running on.  It could be that you simply want to see how diverse they are or maybe you want to change the password of an acct and you want to see what the impact will be.  All the same, it can be useful to audit the startup accts for your SQL services. 

So here’s a little script you can use to get that done.  This again is part of a much larger process I have, but I’ve pulled it out and simplified it to its base componets for you.  I’ve also added the DB side of the process for completion. 

Here’s the PS piece:
$Server = ‘localhost’
$StatsDB = ‘DBStats’                  
 $SourceSQLScript1 = “\\Server1\F$\SQLServerDBA\Scripts\Collector\SQLQueries\AllSQLBoxes.txt”; 

$SqlCmd1 = Invoke-Sqlcmd -ServerInstance $Server -Database $StatsDB -inputfile $SourceSQLScript1
$SqlCmd1 | %  {
        $_.ServerName; ####Print ServerName
        [System.Int32]$ID = $_.InstanceID;
        [System.String]$ServerName = $_.ServerName;
        $ExecutionDateTime = Get-Date -Format “%M/%d/%y %H:m:ss”; 

$a = Get-wmiobject win32_service -ComputerName $ServerName -filter “DisplayName like ‘%sql%'” -ErrorVariable err -ErrorAction SilentlyContinue 

$a | % { $DisplayName = $_.DisplayName;
   $StartName = $_.StartName; 

 Invoke-Sqlcmd -ServerInstance $Server -Database $StatsDB -Query “Collector.spServiceAcctInsert ‘$ExecutionDateTime’,’$ID’,’$DisplayName’,’$StartName'”
}

Most of this is the same as the other scripts I’ve posted so I’m not really gonna go into every line like I have in the past.  I am going to explain the params for the invoke-sqlcmd line though.  In this line I’m just calling an SP that I’ve put in place to log these details to a table and you can use as much or as little of this as you wish.  The $ID param is the ID of the server.  In my complete solution I have a server table and I log things to other tables using the ServerID that comes from that table.  You’re free however to just capture the ServerName here instead.  But you notice that in the query, I get the ID and the ServerName and I use those in the script.  I’m just presenting you with how I do it and you can change it. 

Here’s the SP that I use in the DB that gets called by this script.  It’s very simple. 

CREATE procedure [Collector].[spServiceAcctInsert]

@ExecutionDateTime datetime,

@InstanceID int,

@ServiceName varchar(50),

@StartName varchar(50)

 AS

 Insert dbo.ServiceAcct

Select

@ExecutionDateTime,

@InstanceID,

@ServiceName,

@StartName  

And now here’s the table that you’re logging to:

CREATE TABLE [dbo].[ServiceAcct](

      [ID] [int] IDENTITY(1,1) NOT NULL,

      [ExecutionDateTime] [datetime] NULL,

      [InstanceID] [int] NULL,

      [ServiceName] [varchar](50) NULL,

      [StartName] [varchar](50) NULL

And that’s all there is to it. From here, now that you’ve got all of your service accts and the AD accts associated with them, you can easily query this table to see the impact of changing a password or just see how bad your security is. This is useful info to have. You don’t have to run this kind of thing often though. In fact, it’s not a query that I have scheduled. I typically just run it when I want to know something specific, but you may have a reason to schedule it.

OK guys, that’s all I’ve got on that.  Enjoy powershelling.

Today’s Powershell Fun

I’m having big fun with PS today because I’m installing several new R2 boxes on Windows Server R2.  Where does PS come into play you ask?  Well, good question… I only wish I had a good answer.  Ok, I do.

In order to install SQL Server R2 on Windows 2008 R2 you have to first install the .NET Framework.  However, the SQL installer doesn’t do it for you.  It just tells you that you have to install it.  So the normal way is to go into the server manager and then to ‘roles’ and then ‘add a role’.  Once you’re in the add role wizard choose ‘Application Server’.  Here are some brief screens of the process.

As you can see I’ve already installed it on this server, but it’s pretty easy to figure out from here.  Just install the .NET Framework (it’ll actually already be checked for you) and you’re golden.  Now, that takes too much time for me, especially when I’m installing several boxes at once.  So what I like to do is call on powershell for this task.  I’m going to take some extra time here to show you a couple things, but the process once you get it down is much faster than the wizard.

First of all, you have to load the windows modules into your PS session.  To do it manually, you can open PS and type the following command:

> Import-Module ServerManager

Here’s a screen:

However, now you’ll be able to run the next command.

> Get-WindowsFeature

This will show you all the features available and which ones are already installed.  Here’s a screen of the output.

And of course, you’ll notice on mine, it’s already installed, but it’ll be unchecked on yours.  OK, now you’re ready to do the install itself.

And for that, you need yet another command, but don’t worry, it’s easy. 

> Add-WindowsFeature AS-NET-Framework

Notice I put an arror by the name of the feature above that you’ll feed to the Add-WindowsFeature cmdlet.  So in case you didn’t catch it, the syntax is simply the cmdlet and the name of the feature you want to install. 

Here’s a screen of the line as it’ll look when you type it.

And here’s what it looks like while it’s installing:

And that’s it guys.  The install is actually much quicker from the cmdline and you don’t have to worry about making a mistake through the GUI.  And of course you don’t have to go through all these steps every time.  Why not write a short script to do it for you?  It would look like this:

Import-Module ServerManager
Add-WindowsFeature AS-NET-Framework

That’s it.  That’s all it takes to install it from a script.  Now you can save that off to the network somewhere and just use it whenever you like. 

OK, that’s all I’ve got.

Oh y, one more thing… you don’t have to import the servermanager module every time you run this.  You can use the PS shell that loads all the windows modules for you.  You can find it here:

Ok, THAT’S all I’ve got.