Tag Archives: blunders

An Upgrade Disaster

I got an email from a user at another company today and he told me about the SQL 2012 upgrade they just finished. And apparently it was a disaster. Here’s the general gist of what happened.

They have a huge app that runs their entire business and the vendor talked them into upgrading to 2012. Originally they were slated to do tons of testing and upgrade probably sometime in november. But they decided to not listen to their DBA and instead allowed themselves to be lead by the vendor who told him that SQL upgrade was easy and nothing to worry about. So they did some perfunctory testing and pushed the upgrade to this past week. I know, smart right?

So this vendor did their upgrade for them and it completed ok from what I know about it. The problems came after the upgrade. Now, I don’t have any real specifics, but I do know that it caused a 10hr downtime. One of the directors asked about the backout plan and he was politely told to mind his own business. Everyone is calling the upgrade a disaster. They didn’t have any way to restore in case the upgrade failed in a really bad way… and that means no final backup, no scripted objects, and no mirrored system. This was an in-place all or nothing upgrade.

Just so we’re clear on this, that’s not the way you run an upgrade. Upgrades take plenty of testing from the DB side, and the app side. You should never upgrade anything without extensive testing. And you should always have a backout plan. Now, what does a backout plan really mean? Because I find that often times the backout plan gets overlooked and I think it’s mainly because they have a hard time defining it.

To me a backout plan means a few different things depending on what you’re after. Let’s take this upgrade as an example. No matter how good you think SQL upgrade is, there’s always something that can go wrong. So at the very least, you need to take a final backup of ALL the system and user DBs right before the upgrade. Make sure you kick everyone out of the DB first because it’s not a final backup if there are still going to be modifications afterwards. That’s a good start for sure, and what I’d consider to be a minimum effort. Here’s a quick list of the steps I like to take for an important upgrade such as this:

1. Copy all system DBs to another location. This just makes restore much easier because with DBs as small as most system DBs, you can just drop them back in their original location and you’re good to go.

2. Script all logins with SIDs.

3. Script all jobs.

4. Make sure I have all SSIS pkg projects at the ready so I can redeploy all pkgs if I need to.

5. Do a test restore of the final backup before starting the upgrade.

6. Script any system-level settings like sp_configure.

7. Script any repl, log shipping, mirroring scenarios.

8. Make sure I have pwords to any linked servers. While I try to keep everyone off of linked servers I have to admit they’re a part of life sometimes. And you don’t want your app to break because you don’t know the pword to the linked server. It’s not the end of the world if this doesn’t happen, but it’ll make life easier.

So basically, the more important the DB, the more of these steps you’ll follow. You need to prepare for a total meltdown and make sure you can recover in as timely manner as possible. As I sit here and write this I feel stupid because it seems so basic, but there are clearly those out there who still need this kind of advice, so here it ia.

And if you have a good test box handy, make sure you test as many of these procedures as possible. Script out your logins, etc and restore them to the test box and see if things work as they should. Deploy your SSIS pkgs to your test box and make sure they run, etc. Don’t just rely on what you think *should* work. Actually make sure it works. This is why some upgrade projects take months to complete. It’s not the upgrade itself, it’s all the planning around it. And while this isn’t a full list of everything you could do to protect yourself, it’s a damn good start.
Happy upgrading.

What an idiot!

As DBAs we quite often run into others who aren’t as smart as us.  The dev is an idiot.  The .net guy is an idiot.  The users are idiots.  The manager is an idiot.  The VP, well don’t even get me started.  And other DBAs are really idiots.  At least that’s how it is in our heads anyway.  We fall into this cycle of calling everyone idiots for every little thing they do wrong.  The dev uses a wrong data type and it makes a few queries a lot slower, what an idiot, he should’ve known better.  A .net guy uses EF instead of putting it in an SP and it causes tons of blocking, what an idiot.  Another DBA tries to fix a DB that’s down and he does something that ends up making it worse… what an idiot.

It’s pretty easy to say everyone’s an idiot when we have the luxury of hindsight isn’t it?  Sure, I could have told you that every single one of those decisions was wrong and why.  But could I have told you before you did it and it went south?  Maybe, maybe not.  I’ve made plenty of mistakes in my career (and am dedicated to continuing) that in hindsight weren’t the best option, but was I an actual idiot for doing it?  Again maybe, maybe not.

I just think we jump on the idiot bandwagon too early and too often.  And I know I’m a big offender.  It doesn’t take much for me to start branding people left and right, but I also try to temper it with some reason.  Just because someone doesn’t have the same experiences I do doesn’t make them an actual idiot.  A dev chooses the wrong data type for a column.  Is he an idiot, or does he just not have the same experience with data types that I do?  I’d have to say it depends on what the mistake was.  Did he choose a varchar(25) for Address, or did he choose datetime?  Because one makes him less experienced with addresses and the other one makes him pretty close to an idiot. Well, what if he chose the bit data type for a SalaryAmount column? Well, I can only hope that he’s writing the table his own salary will be stored in.

I’ve seen plenty of things that seemed to be basic that I didn’t know. And that’s becuase there’s just so much to know it’s hard to quantify. That’s why I make sure I interview everyone I see for at least an hour before making a decision. I honestly believe you can’t judge the sum of someone’s experience in just a handful of questions. In fact, I’ve found plenty of guys who got the first 20 questions wrong and then we suddenly got to their area of expertise and they started blowing the questions out of the water.

So anyway, just give some of these guys a break and realize that they may not be complete idiots just because they don’t know something you don’t. That’s not to say there aren’t any real idiots out there. You guys know that I’ve definitely run into my fair share of them. But I’m trying harder to lighten up on them.

Part of the problem is the learning process we go through, which is next to none. Computers are hard. SQL is hard. .NET is hard. They’re all hard. And yet training is so poor. I’ve seen so much IT training I can’t even count, but the number of courses I’ve been in that actually taught the topic is very few. sure, the high level stuff gets taught, but the hows and whys of doing things is rarely covered. There are some guys out there who really take the time to break it down for you, but try to find one of them. One of the biggest reasons I never got into BI is because all the BI guys teach beginning BI like you’re already a BI expert. They explain BI terms with other BI terms and everyone just nods and smiles. But I guarantee you that most of them walk away without a good understanding of what was just said. .Net guys are big offenders in that area too. They explain .Net to you like you’ve been a coder for years and you’re just supposed to know what all this stuff is. So it’s no wonder that so few people really know their jobs well. They’re never taught what they need to know. So are they really idiots for not knowing something they weren’t taught? There are so many things that can go wrong with a system at any given time how can they be sure that the issue is being caused by a bad data type, or by one particular piece of code? There are of course ways to find out, but so many companies are in such a hurry to move on to the next project they never get a chance to dig into these issues. And again, were they really taught how?

So here we are in the middle of the learning revolution and there’s so little quality training to be had. You can go almost anywhere and learn how to perform the steps for a task, but where do you go to learn what you actually need to know? How do you learn that one thing is stupid over another thing, and that other thing exists for a reason, so when is it supposed to be used? I was talking to someone about this very topic just this morning.

So this whole thing was prompted by a training session I had with someone not long ago. Someone did something they shouldn’t have and when I corrected them they asked why. And when I gave my reason he said oh y, I never thought of that. And I could clearly see that he wasn’t an idiot, he just didn’t have the experience he needed. And since then he’s done it right and even did it the other way a couple times because the situation was different. See, I gave him the reasoning so now he can reason out for himself when to use one method over another. And that’s training that’s worthwhile.

To me, a true idiot is someone who gets shown the way to do things right and still refuses to employ them. He is also someone who has been in his current career for many years and doesn’t even know the basics. I have very little patience for say a SQL dev who’s been doing it for 10yrs and doesn’t even know the basics of the data types. Because you can’t tell me that it’s never come up. I also don’t like DBAs with 10yrs behind them who can’t write a simple backup statement. Again, that’s a basic that you should know cold.

Healthcare isn’t ready

I just left the healthcare industry for the 2nd time and it’s sad the level of ignornace and superstition that exists around computers… and SQL especially.  The entire industry treats computers like big electronic pieces of paper.  They print things they can easily email, they manually enter in things they could easily write a form for, and they perform repetative manual tasks they could easily script.  It’s pathetic how far behind the industry as a whole is and the people who work in it are so close-minded I don’t see how they ever get anything done.

Part of the problem is the doctors.  Doctors think that because they’re doctors that they know everything.  Several times I’ve had one doctor or another tell me specifically how they wanted me to do something in SQL.  They didn’t know the first thing about it, but they heard a few terms here and there so they decided to run the show.  And here they are in meetings insisting that I follow their HA architecture that was just ridiculous.  I got a reputation in my company for being difficult to work with because I always called them on it and told them to let me do my job.  Then they would complain and my boss would be at my desk the next day.  It’s just incredible ego to think that you’re a expert in all fields because you’re an expert in your own.

However, doctors aren’t the only problem.  Vendors are also a huge problem because they’re very slow to adapt to new technologies.  And by slow, I mean 15-20yrs too slow.  We’ve had so many vendors who only code against SQL2K.  Their support personnel is pathetic to say the least as well.  These vendors know nothing.  And they’re guiding hospitals in their implementations.  And of course now you’ve got the blind leading the blind because while there’s nobody at the vendor who knows what he’s talking about, there certainly isn’t anyone at the hospitals to call them on it.  And when they do get someone in there who knows what they’re talking about they can’t keep them because what really good IT person wants to work with an entire floor of people who don’t know the first thing about IT?

The biggest issue we had with staffing was that everyone who does the hiring thinks that you have to have hospital experience to be able to work in IT at a hospital.  So they end up hiring ex nurses, or other clinical people and give them jobs as programmers, system admins, etc.  These people don’t know the first thing about being in IT or about C# yet they’re given positions based off of their hospital tenure.  So someone who wanted a career change could come in as a Sr. Programmer yet they’ve never even had a simple online coding course.  So now they’re in there trying to figure this stuff out.  They’re architecting solutions that they could barely qualify as end users for.  And anyone in IT who knows what they’re doing has to put up with this idiocy.  And make no mistake… it is idiocy.

The industry itself has too many older managers in it and they need to bring in some fresh blood that actually knows something about IT and how to actually get things done.  As it stands they’re just too scared of the change, too scared of the data, too scared of being sued, too scared of pissing off the doctors, and too scared of technology in general.  Oh sure, they’ll bring in iPads for the doctors to carry around, but big deal.  They’re not doing anything cool with them, and everything they put out there costs tons of money in support because they weren’t put together correctly.  Want a perfect example of how far behind they are?  Whenever you go to a new doctor you still have to fill out all that damn paperwork by hand don’t you?  You have to put your name, address, SSN, DOB, etc on like 9 forms.  Doesn’t that sound like something they should be able to get past by now?  And there’s more to that specific story than just being afraid of computers.  That particular one is caused by the system itself.  I won’t go into specifics though.  I’ve also seen plenty of people print online forms, fill them out, and then scan them back in and store that into the DB in a text column.  Seriously dudes?

So what can they do to change?  How can healthcare move into the 80’s?  For starters they can hire some younger more hip managers who understand how IT works and the benefits it brings, and give them the power to do what they need to do.  Next they can stop hiring from hospitals.  C# coders, or SQL guys don’t have to know crap about your business.  They have to know their business, which is IT.  And they’ll have to pony-up the money for some real IT folks.  IT folks aren’t going to work for peanuts… not when they can go somewhere else and get 20-30K more.  Oh yeah, and you’re also going to have to start treating them like they’re professionals.  IT guys don’t want to hear how much the doctors know about IT.  They want you to let them do their jobs.  So seriously, stop treating them like they’re nothing compared to the doctors.  Doctors are essential to hospitals, but your IT staff is too.  It’s getting so that hospitals are crippled without IT.  So why do you still insist that all IT guys are the same?  Hell, even all janitors aren’t the same.  I can easily tell the difference between one who cares about what he does and one who doesn’t.

Here’s a scoop for you.  Healthcare is going to need to get their act together or else.  The government is mandating that everyone have their health records in a meaningful use format by 2015 so the time of getting by on the idiots you’ve got is over.  You’re going to have to get some real talent and do what it takes to keep them.  If that means paying them a good salary, and listening to them, then all I can say is ‘you poor baby’.  Hospitals jump through hoops all the time to attract some new doctor because of what he brings to the network.  If anyone in healthcare is reading this then you’d better start planning now.  Start gathering some talented IT guys and let them do their jobs.  And NO, before you ask, you don’t know what IT talent looks like.  Get someone to help you find that talent.  And I’m not talking about recruiters either.  Go to the Microsoft MVP site and google someone in the field you’re looking for and start emailing them.  Ask them to help you interview a few guys.  I’m sure they’ll charge you a little, but it’ll be more than worth it.  Then once you get these guys on staff don’t treat them like 2nd-class citizens to the doctors.  You’ve got no choice anymore.  You have to do something.  You can’t keep this up.

My guess is that it’ll probably take about another decade before this starts really turning around though.

SQL Server 2012 Launch is a joke (#SQL2012)

Ordinarily I don’t jump on these bandwagons and slam something publically like this, but I have no choice.  I’m amped up ready for a launch event and I’m in the office so I don’t have access to porn, so I’ll blog-off this energy. 

Jen’s already written an excellent blog on the issues with the event so far, but I’ll add another perspective.

I signed up to be a moderator in a couple of the expert pods and not only have I had an hour worth of trouble getting logged in, but now the chat window in the pod won’t come up.  Apparently, the site doesn’t work in IE.  REALLY!?!?!?!?!?!? I mean REALLY?!?!?!?!?  Now, I’ve had a couple people tell me to just install chrome and it’ll work just fine, but I’m not going to install a new browser just to watch an MS webcast.  If anyone should be IE friendly it should be the SQL Server Launch Event.  This is just ridiculous.

However, I will say that this is par for the course with vendors.  This is the type of lazy, poorly-executed vendor crap that’s pushed on us all the time in the DBA world, so we’re actually used to it.  I don’t know who this vendor is, but if I did I would do my part to run them out of the business.  So aside from all the issues Jen outlined in her blog, which I’m fully behind, basic compatibility for the customer’s main web platform isn’t even there.

And on top of everything else, I’m hearing from those who were able to get in that they’re not classified as the MVP expert in the chat session, so nobody knows who they even are.  This vendor is a joke and they’ve ruined the launch event.

2 Addictions worth fighting

A recent email chain with a friend has spawned me to write this post.  What I’m trying to do here is help the 2 extremes of people you’ll find in your company:  the idiot, and the DBA.  Ok, I know all DBAs aren’t super smart, but I had to call the non-idiot something.  So this email chain I had was with a DBA who was stressing-out about something stupid his company is doing even though he warned them.  So what you’ve got to understand is that both sides of this equation are an addiction.

The Addiction of Stupidity:

It’s easy to get caught-up in the fact that our users are morons, or that the other IT guy is an idiot, but you’ve got to have a little compassion.  You have to understand that stupidity is an addiction.  It may have started through a combination of them not knowing anything at all about SQL, and them not having any qualified DBAs to help them learn.  How it started though doesn’t really matter in the end.  What does matter is that now you’ve got to help them through it.  You need to advise them on the proper way to do things.  At first they won’t listen.  Keep it up though and you’ll start to prove yourself a little at a time.  Then they’ll start to trust you.  Even when they trust you though, they’ll still fall back into being stupid all the time.  You have to urge them to fight it.  It’s hard though, right?  They really want to fall back into their old habits but you can’t let them.  It’s all they know and a little knowledge is empowering and dangerous.  So you have to guide them and not let them slip back into their bad habits of making stupid decisions.  Remind them that trusting you in the past has paid off and it will again. 

 

The Addiction of Caring:

The other side of this is the DBA himself.  See, as DBAs we tend to own the DBs under our control.  We want everything to work together like a well-rehearsed ballet.  And when it goes that way it can be a beautiful thing.  But it almost never goes that well, does it?  The company doesn’t see things the way we do.  The addicts in the above section quite often have the power to block our intelligent moves and even put things into play without us even knowing.  People in power are quite often worse than anyone else.  So how do you deal with this?  How do you get them to do the right thing?  Well, you can discuss it with them and get them to see reason, but if they don’t then you have to let it go. 

You’re not being paid to do a good job.  You’re being paid to make your boss happy.  I’ve said this many times before, and it’s still true.  Turn yourself into an internal fulltime consultant.  Advise them of the right thing to do, but if they make the wrong decision then it’s their business and their decision.  You can’t make them run their business the right way.  You can only advise them.  Remember that.  It’s not your business.  So stop insisting that things be done a certain way.  Tell them what should be done and let them make the decision.  And if they make the wrong decision just make sure you’re covered with an email.  You want it on record that you advised them correctly and they made the wrong decision.

But caring to the point where you’re pulling your hair out is just as much of an addiction as being stupid.  You have to constant remind yourself that you’re just a consultant and if they want to pay you a bloated salary to ignore you and have you spend your days cleaning their messes up instead of providing real value then that’s their business.  Again, you’re being paid to make them happy, not to do a good job.  Plenty of people disagree with me on this, but they’re wrong.  It just so happens that some bosses want you to do a good job and want you to guide the company, but that’s just a case where their priorities align with yours.  If their priorities were to change then I bet you’d find that you’re all of a sudden butting heads.  I’ve been in several companies where everyone in the world has sa and they’ve even got lots of apps connecting with sa and everyone knows the password.  I’ve advised them how unsafe this is, but they insisted that it’s too engrained and can’t be changed.  Fine, I put my advice in an email and went on about my business.  Things kept happening to the servers and they wanted it to stop, but I said I can’t stop it because everyone has sa.  They said it’s impossible.  So ok then.  Then they started failing audits.  They lost clients because they couldn’t produce  the report of the passed audit.  Now they came to me again and said how can we pass this audit.  I said you have to get everyone out of sa and do about 10 other things.  They said but the effort would be too great.  I said ok.  Then they lost another client.  Then they asked me what it would take to pass the audit.  I repeated the same thing.  They said it would be too hard.  I said ok.  They said can you give me another way?  I said no, but have you looked into what it would take?  What’s the actual effort?  They said no, we haven’t looked into it, but I know it would be really hard.  I said, then why not ask the question to the right people?  As it turns out it was a single VP who was blocking it because she had NO IT knowledge and just thought this kinda thing was really hard.  All they had to do was change a connection string in a .ini file and all was well.  Then they just told everyone to get over it and they weren’t getting sa anymore.  They would have to stop doing things that a DBA needs to do.  All of a sudden the environment became much more stable, they passed their next audit, and they started getting those contracts they were losing before.  That’s a true story BTW.

So find a way to divorce yourself from the outcome of these things.  I like to look at it like this… if you see a guy on the street getting into his car and he looks like he’s about to run the red light, and you call into his window and warn him there’s a cop sitting there, and he does it anyway you just look at him and say dumbass.  Chances are you probably won’t get upset and give yourself an ulcer over it because you’re just not that invested in the outcome of your advice.  You warned the guy but he didn’t listen… so what!  And that’s the attitude you have to take with your own company.  You can’t invest yourself in the outcome of your advice.  It’s not your company.

So anyway, I’m just trying to help you keep a little perspective.  If you’re truly in a job where they hang on your every word then consider yourself lucky.  The vast majority of us don’t have that luxury so we have to find ways to cope with the fact that everyone doesn’t listen to everything we say. 

And remember, both of these are addictions.  As much as we have to help our users fight against being stupid, we have to constantly fight against becoming too invested in the outcome of our advice.  And I struggle with this with different degrees of success.

 

Good practice saves the day

Ok, a funny thing happened at work today. I had to write a script to move a DB’s files to another drive. Sounds easy enough, huh? Well in theory it was easy, but leave it to me to mess it up. Now, this was just a dry-run so I don’t feel too bad about it, and it gives me a chance to test my process. So, here’s what happened.

My process was this:

1. Run move stmts for all files.
2. Take DB offline.
3. Run PS script that moves the files.

The PS script selects from sys.master_files to get a list of the files to move. Here’s the problem… I’ve already run the move statements so SQL thinks they’re already in the new location. Therefore, the PS script can’t move them from their current location. However, good practice saves the day.

As a habit I always save a copy of sys.master_files to another table in case something happens. Now it’s a simple matter of changing my PS script from selecting from sys.master_files to selecting from sys.master_filesSEAN. Now I’m back in business. That’s the thing I find about good practices; they help you in ways you can’t even predict most of the time.

Here are some other general steps I take to protect against one thing or another:
1. I always try to group like objects in the same schema. It’s just a good idea.
2. I always copy the system DBs when upgrading or patching SQL. Sure I can restore from backup, and I have those too, but if something big happens and I have to completely re-install from scratch it’s much easier to just drop the old system DB files out there than it is to restore from backup.
3. Whenever I’m done working on an important prod system and it’s time to move on to something else, I always close all prod windows and even disconnect from it in object explorer. I don’t want any chance there’s a way to open a new connection to it and do something dumb.

I’d love to give you guys a huge list, and I know there are more, I just can’t think of any right this second.

Over-tuning Backups


Fatal error: Uncaught Error: Call to undefined function eregi() in /home5/midnigk3/public_html/DBARant/wp-content/plugins/wp-codebox/main.php:136 Stack trace: #0 /home5/midnigk3/public_html/DBARant/wp-content/plugins/wp-codebox/main.php(75): wp_codebox_is_windowsie() #1 /home5/midnigk3/public_html/DBARant/wp-content/plugins/wp-codebox/main.php(50): wp_codebox_highlight_geshi(Array) #2 [internal function]: wp_codebox_highlight(Array) #3 /home5/midnigk3/public_html/DBARant/wp-content/plugins/wp-codebox/main.php(130): preg_replace_callback('/<p>\\s*2280cd96...', 'wp_codebox_high...', '\n\t\t\t\t<div class...') #4 /home5/midnigk3/public_html/DBARant/wp-includes/plugin.php(235): wp_codebox_after_filter('\n\t\t\t\t<div class...') #5 /home5/midnigk3/public_html/DBARant/wp-includes/post-template.php(240): apply_filters('the_content', '\n\t\t\t\t<div class...') #6 /home5/midnigk3/public_html/DBARant/wp-content/themes/twentyfourteen/content.php(57): the_content('Continue readin...') #7 /home5/midnigk3/public_html/DBARant/wp-include in /home5/midnigk3/public_html/DBARant/wp-content/plugins/wp-codebox/main.php on line 136