The mini heart attack

Tell me if this scenario isn’t all too familiar.  You get a call in the middle of the night saying that something’s wrong with a SQL box.  It won’t respond at all.  Now, you’re the first one on the scene because we all know that if there’s a SQL box anywhere within 100 miles of the app, they’ll call the DBA first because it has to be the DB.  So anyway, you have plenty of time to think while you’re on the way to the scene. 

That first step is the hardest.  You throw the covers off and expose your warm body to the cold night air. 

You swing your feet onto the floor and try to find something to wear that’s not gonna be too hot or too cold, because you have no idea how long you’ll be up.  At this point you’re still grumbling about how wordy the helpdesk guy was when he called.  Why can’t these guys ever realize that we’re asleep and all we need is the essential info?

Now you’re on your way down the hall and your thoughts turn to the problem at hand.  What could it be?  Did someone turn off the service?  Did CHECKDB take too long and is it blocking everyone? 

You just hit your desk and start logging in.  While you’re waiting for the desktop to come up it hits you… what’s the state of the backups?  Am I sure that I can recover this box should something happen?  What if this server’s down for good?  Oh CRAP, is this the box that I got the failed log backup alerts about yesterday and just ignored?  I’ll be lucky if the log just filled up and I can do something about it, but if the box is down and I lost all that data because I didn’t feel like messing with it, I’m in trouble. 

So you login and take a look at the server.  You’re able to TS in without any trouble and you breathe a small sigh of relief.  Well, at least the box itself isn’t down.  I can fix almost anything else that may be wrong.  You instantly look at your backup jobs to see if something has failed.  You see nothing out of the ordinary.
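As an aside, that backup check is worth scripting.  Here’s a rough sketch of a query against the standard msdb history tables that shows the most recent backup of each type for every database… nothing fancy, adjust it to your own boxes:

```sql
-- Most recent backup of each type per database.
-- Queries the standard msdb backup history tables.
SELECT  d.name                      AS database_name,
        bs.type                     AS backup_type,   -- D = full, I = differential, L = log
        MAX(bs.backup_finish_date)  AS last_backup_finish
FROM    sys.databases d
LEFT JOIN msdb.dbo.backupset bs
        ON bs.database_name = d.name
WHERE   d.name <> 'tempdb'          -- tempdb can't be backed up
GROUP BY d.name, bs.type
ORDER BY d.name, bs.type;
```

A NULL in last_backup_finish is the scary one… that means a database with no backup history at all.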

Next you look to see if anything’s blocking.  Nope, you see nothing of the kind.
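That check’s scriptable too.  Something like this against the DMVs will show you anything that’s being blocked right now (a quick sketch; works on 2005 and up):

```sql
-- Any request with a nonzero blocking_session_id is being blocked.
SELECT  r.session_id,
        r.blocking_session_id,
        r.wait_type,
        r.wait_time,               -- milliseconds spent waiting so far
        t.text AS running_sql
FROM    sys.dm_exec_requests r
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE   r.blocking_session_id <> 0;
```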

You can query the DB and it’s online, so now your heart attack is completely gone and you turn your thoughts to what could actually be wrong.  So you call the person who submitted the ticket to begin with.  She reports that they couldn’t get into the DB for a few mins but that all seems fine now.  You tie up some loose ends on the phone and end the call.

You then close the lid on your box and as you walk back to bed you make a promise to yourself that you’ll never ignore another log backup alert ever again. 

Don’t let this happen to you.  Laziness is no excuse for not having a backup, and answering the alert is much easier than answering your boss’s call about why you lost half a day of data.  Don’t you have alerts set up?  Then why did this happen?  That’s a tough question to answer. 
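And if the answer is that you don’t have alerts, here’s roughly what one looks like.  This sketch wires up a SQL Agent alert for error 9002 (transaction log full) and mails an operator… the ‘DBA Team’ operator is just a placeholder, swap in your own:

```sql
-- Fire a SQL Agent alert whenever a transaction log fills up (error 9002).
EXEC msdb.dbo.sp_add_alert
     @name        = N'Transaction log full',
     @message_id  = 9002,
     @severity    = 0,                       -- must be 0 when @message_id is set
     @include_event_description_in = 1;      -- 1 = include the error text in the e-mail

-- Notify an operator by e-mail.  'DBA Team' is a placeholder operator name.
EXEC msdb.dbo.sp_add_notification
     @alert_name          = N'Transaction log full',
     @operator_name       = N'DBA Team',
     @notification_method = 1;               -- 1 = e-mail
```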

The point of all this is that things can go wrong on a moment’s notice.  Needing to restore your DB is like a car wreck: it can happen at any time and you won’t have any notice.  So if you’re not sure… and I mean SURE about the state of your backups, then get that way.  Do whatever it takes to make sure you’re covered and you know it.  And that includes doing the actual restore.  Do you know exactly what it’ll take to restore your server?  The last thing you want is to have to figure out what has to be done when you’re under the gun.  So unless you know exactly what it’ll take to restore something, you don’t have a solid plan.  So avoid the bread lines and these mini heart attacks and do your due diligence.  The life you save may be your own.
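Since I’m preaching anyway, here’s a rough sketch of what “doing the actual restore” can look like.  RESTORE VERIFYONLY is a cheap first check that the file is complete and readable, and a practice restore to a throwaway name proves the whole chain… every path, file, and logical name below is a placeholder:

```sql
-- Cheap sanity check: is the backup file complete and readable?
-- (It doesn't prove the data is good; only a real test restore does that.)
RESTORE VERIFYONLY
FROM DISK = N'X:\Backups\MyDB_Full.bak';

-- A practice restore to a throwaway name.  The logical names ('MyDB',
-- 'MyDB_log') come from RESTORE FILELISTONLY on your own backup.
RESTORE DATABASE MyDB_RestoreTest
FROM DISK = N'X:\Backups\MyDB_Full.bak'
WITH MOVE 'MyDB'     TO N'X:\RestoreTest\MyDB.mdf',
     MOVE 'MyDB_log' TO N'X:\RestoreTest\MyDB_log.ldf',
     NORECOVERY;

-- Roll forward the log backups, recovering on the last one.
RESTORE LOG MyDB_RestoreTest
FROM DISK = N'X:\Backups\MyDB_Log.trn'
WITH RECOVERY;
```

If you can run through that without looking anything up, you have a plan.  If you can’t, you have a guess.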
