Even MVPs make mistakes

We’re in the middle of our last mock go-live before this weekend’s prod change-over.  We’re using an SRDF process to move the DB files from the current prod to the new prod.  This is our big SQL R2 cluster and what usually happens is that they present the drives to the cluster and then I run my script to bring the DBs back online.

Well what happened this time is that the drives weren’t added as SQL dependencies and therefore SQL couldn’t see them in order to bring the drives back online.  Well, I didn’t think to check that.  Instead what I did was I just deleted the DBs thinking that just starting from scratch would be the best way to go.  What ended up happening though is that SQL deleted the newly presented data files.  So the moral of this story is that even though SQL can’t see the drives to be able to attach the files, it can apparently see them well enough to delete them behind your back.  Thanks clustering team!

And so this isn’t entirely just a complaining session, here’s the query you can use to see which drives your clustered instance of SQL can see.

SELECT * FROM sys.dm_io_cluster_shared_drives

Now I’ve you’ve got drives in the cluster that SQL can’t see, all you have to do is add them as a dependency to the SQL service and you’ll be fine.  And in Windows 2008 you can do that while the service is online, but in 2003 and below you have to take SQL offline to add them.

Oh, and just for completion, you can tell which nodes are in your cluster by using this:

SELECT * FROM sys.dm_os_cluster_nodes

Here’s a quick update:  The errors I caused and fixed this morning had some residual problems.  Because once you make a mistake and try to fix it, you’ve already got mistakes on your mind and you’re not thinking clearly.  This is why I’m always advocating having restore scripts ready to go in a manner that you don’t have to think about it in a crisis.  You always want things to be as easy as possible.  And this is also why I’m always saying that you should have standardized troubleshooting scripts and everyone should know how to use them.  You don’t want individual DBAs inventing new methods on the fly.  You should all know what scripts are being run to troubleshoot if at all possible.