PASS Day 2 Keynote Live Blog

Today’s keynote with Quentin Clark starts at 8:15am EDT. Click here to watch today’s keynote live. For information on today’s keynote (“a demo-packed adventure across SQL Server solutions that span traditional boxed software, appliances and cloud”) see the SQLPASS.org page on Keynotes.

Ok, here we are on day 2 of the PASS Summit 2011. If you want to know about the first part of the keynote, you can see that here.

Now, I know I said in my past post that they were starting on appliances and I wasn't interested in writing about them, but they're up on stage now pimping the hardware itself, and the machines are very impressive-looking. It's good to be Dell right now, to be able to get your hardware on stage like that. I didn't notice before, but they're also showing some HP boxes, so it's not gonna be just a big Dell-fest.

Talk about a big box though… the new HP has 2TB of RAM, 196 cores, and enough spare drives that "you won't have to worry if one of your drives fails". It's their new internal cloud fast track, and that's another blog entirely, but basically the internal cloud offering is this HP box that comes configured with Hyper-V, System Center, and some other cloud management software. It comes basically ready to plug in, so once you bring it in the door you're ready to start building cloud machines. Now, this is really just a big VM server that you spin up VM guests in, so don't get too excited. It comes with some specialized management tooling I believe, but overall you could do all of this yourself if you prefer.

Now, the really cool part of this for me is the new ODBC driver for Linux and CDC for Oracle. What this means is that now you'll be able to pull data out of the Oracle logs themselves and put it right into your SQL Server DB. So you can literally use CDC with Oracle now.

Now Michael Rys is on stage talking about semantic search. I don't really know enough about it to put into words what he's doing, but I'll just say that it's really cool. I know that's a lame description, but it's almost the best I can do. I will describe one thing though. He ran a semantic search against a folder of PDF docs he had, and the search was smart enough to know the general subjects of the books and only showed him the titles that met his criteria. There's a lot that goes on behind the scenes there that I can't speak to, but I do know enough to get a sense of what a leap that is for search.
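If you want a taste of it yourself, here's a minimal sketch using the new SEMANTICKEYPHRASETABLE function in SQL Server 2012. This assumes a dbo.Docs table with a full-text index created WITH STATISTICAL_SEMANTICS on its DocContent column; all the names here are made up, and this is not the demo's actual code:

invoke-sqlcmd -ServerInstance localhost -Database DocStore -Query "
SELECT TOP (10) keyphrase, score
FROM SEMANTICKEYPHRASETABLE(dbo.Docs, DocContent)
ORDER BY score DESC;"

That gives you the key phrases the engine pulled out of the documents, scored by relevance, which is the kind of thing a folder demo like that is built on.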

Now we’re talking about SmartTVs.  I’m not sure why yet though.  Oh, I see, they’re on Azure to sync the TV data.  Ok, they’re talking a lot about clouds so that’s it for today.

PASS Day 1 Keynote Live Blog

Today’s keynote with Ted Kummert starts at 8:15am EDT. Click here to watch today’s keynote live. For information on today’s keynote, see the SQLPASS.org page on Keynotes.

I’ve already blogged about some Summit topics here, but I wanted to continue the discussion on this blog.

Ok, during this opening keynote, Ted just announced the official name for Denali: it's going to be SQL Server 2012, and its release date is in the first half of 2012. We're already in CTP 3, so MS has some work to do between now and then.

One of the big themes in Denali is the cloud. MS is investing heavily in the cloud, and until now the message has been very militant. For the longest time now we've been getting the message that we're all going to be out of a job if we don't transfer our skills to the cloud within the next 2-5yrs. Now they've backed off quite a bit and are saying that the cloud is just going to enhance your current skills and give you another avenue when it comes to deciding where your data's going to go. Frankly I'm glad to see them calm down finally, because it was making a lot of people very angry. Now, just because MS says it doesn't make it so, so why was everyone getting so angry? Well, in a word, because of the offering. They've been telling us that we're all in trouble if we don't move our skills to the cloud, but their cloud offering is anything but complete. When there are such big holes in the offering, people get upset when you tell them their jobs are going away, because it makes the scenario impossible. So anyway, calm down guys, your job isn't going anywhere.

Ted also just announced that SQL Server will be supporting Hadoop for ultra-large data sets. They will also be making submissions to Hadoop to make it run properly with SQL Server. Now, I know what you're thinking… you're a SQL Server guy, not a Hadoop guy. Well, so am I, but it looks like we may have to get up to speed at least to a perfunctory level, because we're going to start seeing it in our DBs. So to get you up and going on Hadoop, have a look at some of these links:

Hadoop Tutorial

Microsoft is getting into BIG DATA. One of their new goals going forward is to be able to work with super-large datasets from many different sources. A lot of this data is going to be unstructured, and that can be the hardest data to get at, so bringing on products like Hadoop will help with that. Because let's face it… as DBAs we really want everything to have a nice queryable structure, but it's not always stored like that.

Data Explorer is an interesting topic too. Microsoft has a new Azure Marketplace, and apparently they'll not only offer an app marketplace like we're all used to seeing, but they're also going to offer it as an internal service. What it sounds like this means is that they'll allow companies to have their own app marketplace that they can use to give apps to their internal people. Now, this is a new market for sure, but how useful is it? From a technology standpoint I get where they're coming from, but I just don't know how much use there is for something like that. If it does get picked up, it'll mean a big paradigm shift for pretty much every company in the world. But this is definitely something to watch, because it's an interesting concept. I remember when MS came out with web services. They didn't really know what to do with it, but they stood in front of their customers and said here's this new technology, now we'd like for you to go see what you can do with it. I'm waiting to see if this'll be like that again, and I'm also interested to see if this has any real teeth.


Comparing DB objects in Powershell

I got a request today to compare 2 DBs and see if they've got all the same tables. It seemed easy enough since I've got the Redgate Schema Compare tool, but it actually isn't. See, for some reason Schema Compare doesn't work very well on my box, and it's big overkill for what I needed. All I really needed was to see whether the same tables existed; I wasn't interested in comparing their actual definitions. So I thought I'd give Powershell a try.

I started by picking my cmdlet, which was going to be Compare-Object. I use its alias ‘Diff’. Now, this cmdlet isn’t typically used for comparing DB objects, but there’s nothing that says it can’t be. After giving it a little thought I decided to write the following code to get my comparison. Oh, and I’m just using sample DBs for my examples.

# dir the Tables node of each DB in the SQL provider
$a = dir sqlserver:\sql\localhost\default\databases\AdventureWorks\tables;
$b = dir sqlserver:\sql\localhost\default\databases\AdventureWorksLT\tables;

Diff -reference ($a) -difference ($b) -property name

OK, that's the base-level syntax you need to do a comparison on table names. That code will give you results like the following.
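Against my copies of the sample DBs, the output looks something like this (your table names will vary with your AdventureWorks version):

name                           SideIndicator
----                           -------------
CustomerAddress                =>
WorkOrder                      <=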

The left arrows indicate tables that are only found in the reference object ($a), while the right arrows indicate objects that are only in the difference object ($b). And that's really it. The only thing you need to know is that while the differences are listed, objects that appear in both sets aren't. So your best-case scenario is an empty result set, because that would mean that all of the objects exist in both places.

Now, keep in mind also that I only matched on the name property. So there are opportunities for this to go wrong because I’m not taking schema into account. If I wanted to take schema into account though, I would just add it to the property list like this:

$a = dir sqlserver:\sql\localhost\default\databases\AdventureWorks\tables;
$b = dir sqlserver:\sql\localhost\default\databases\AdventureWorksLT\tables;

Diff -reference ($a) -difference ($b) -property schema, name

This time you'll notice that the Address table shows up twice, in different schemas. And in the first result set the Address table wasn't there at all, which means it was considered a match in both DBs because we weren't taking schema into account.

However, there's one final piece that you may find useful. If you want to see all the objects, even the ones that match, you can throw in the -IncludeEqual flag. I'll run the original Diff to show you that now the Address table shows back up and is indeed considered equal when you compare it just by name.

$a = dir sqlserver:\sql\localhost\default\databases\AdventureWorks\tables;
$b = dir sqlserver:\sql\localhost\default\databases\AdventureWorksLT\tables;

Diff -reference ($a) -difference ($b) -property name -IncludeEqual
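Run that and the matching rows come back with the == indicator, something like this (illustrative output):

name                           SideIndicator
----                           -------------
Address                        ==
CustomerAddress                =>
WorkOrder                      <=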

Now that we've got all the plumbing worked out, you can apply this to SPs, functions, or whatever you like.
It won't give you an actual schema compare against these objects, but in quite a few instances it's really helpful to be able to just see which objects exist.

That’s it for this time.

Finding SQL VMs: Part 2

Ok, I've got a couple of other methods, so I thought I'd pass them along to you guys. These were passed to me by users.

The first one is: xp_readerrorlog 0, 1, 'Manufacturer'
I tried this on a number of my boxes with very mixed results. It didn't show anything that specifically identified the box as virtual, and it didn't work on any of my VMware boxes. It appeared to work well enough with Hyper-V though.
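If you want to try it across several boxes without connecting to each one by hand, here's a quick conduit-style sketch. The server names are invented, and newer builds may want the N prefix on the search string:

"Server1", "Server2" | %{invoke-sqlcmd -ServerInstance $_ -Query "EXEC xp_readerrorlog 0, 1, N'Manufacturer'"}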

The 2nd is: Win32_BaseBoard
This WMI class worked just fine for finding Hyper-V instances, but didn't prove fruitful at all on VMware. And again, there wasn't really any way to tell whether the box was physical or just not being reported correctly.
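Here's all it takes to try this one yourself (server name invented):

gwmi Win32_BaseBoard -ComputerName "Server1" | select Manufacturer, Product, SerialNumber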

So there you have it. We've got 2 more methods that I haven't had much luck with, but maybe you'll have more luck if you've got the right environment. So if you really don't like the one I gave you yesterday, pick one of these.

Is your SQL Server in a VM?

It seems that one of the key pieces of info you need when looking into performance problems is whether your SQL box is sitting in a VM or not, because it can greatly expand the scope of your troubleshooting. The problem, of course, is that Windows has no real idea it's in a VM, so there's nothing obvious to query inside Windows itself to get this info. Oh, if you're lucky you can run through the installed software and look for the VM tools, but that's really hit or miss, and not as easy as I'd like.

I'll show you the way I do it though, and you can either adopt it or not. Frankly, I'll be really interested to see if anyone leaves a comment with a better way, 'cause I just stumbled on this about a yr ago, so I don't know how it's usually done.

Anyway, we’re going to turn to Powershell for the answer… there’s a shock, huh?
Here’s the simple code it takes to query for a VM instance:

gwmi Win32_BIOS -ComputerName "Server1"

That’s pretty easy, right?
Now for the results. I’ve only got 2 VM packages to test with so if you’ve got something other than VMWare or Hyper-V then you’ll have to devise your own test.

For VMware, the results will look like this:
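These are representative values; yours will differ, but the shape holds:

SMBIOSBIOSVersion : 6.00
Manufacturer      : Phoenix Technologies LTD
Name              : PhoenixBIOS 4.0 Release 6.0
SerialNumber      : VMware-56 4d 63 b2 31 ...
Version           : INTEL  - 6040000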

Notice the SerialNumber has “VMware” in it? That’s what you’re looking for.
Now, to filter out the boxes in your LAN that are VMs, you have but to filter based on that criteria. It would look something like this:

$a = "Server1", "Server2", "Server3"

$a | %{$bios = gwmi Win32_BIOS -ComputerName $_; If($bios.SerialNumber -match "VMware") {"$_ is Virtual"}}

Ok, so it’s not a tremendously useful example, but it shows you how the simple filter would work.

Now let's move on to Hyper-V. That output would look like this:
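Again, representative values for a Hyper-V guest (illustrative, not from a real box):

SMBIOSBIOSVersion : 090004
Manufacturer      : American Megatrends Inc.
Name              : BIOS Date: 03/19/09 22:51:32 Ver: 09.00.04
SerialNumber      : 8329-1776-4906-7723-2155-0154-90
Version           : VRTUAL - 3000919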

This time notice that it doesn't say "Hyper-V", and it doesn't even say anything special in the SerialNumber. This time you have to look in the Version col, which has "VRTUAL" in it. It would be nice if it said Hyper-V, but it doesn't. So now you can apply the same kind of filter, just against Version instead, to find your VMs if you're in a Hyper-V shop.

$a = "Server1", "Server2", "Server3"

$a | %{$bios = gwmi Win32_BIOS -ComputerName $_; If($bios.Version -match "VRTUAL") {"$_ is Virtual"}}

Now, let's put them together. If you're in a mixed shop like I am, you'll want to know what kind of VM your SQL box is in so you know which VM admin to work with when you have problems. So this last script will tell you which VM platform your boxes are on. Now you can query all of your servers and put that info into a table so you always have it. And don't forget to run it periodically to make sure nothing's changed, because we're always moving boxes into VMs, and even taking a few back out. So keeping up with the latest can be useful. I run mine once a week.

$a = "Server1", "Server2", "Server3"

$a | %{
	$bios = gwmi Win32_BIOS -ComputerName $_;

	# Hyper-V tags the BIOS Version col, while VMware tags the SerialNumber
	if($bios.Version -match "VRTUAL") {"$_ is Hyper-V"}
	if($bios.SerialNumber -match "VMware") {"$_ is VMware"}
	if($bios.Version -notmatch "VRTUAL" -and $bios.SerialNumber -notmatch "VMware") {"$_ is Physical"}
}
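And since I mentioned putting that info into a table, here's a hedged sketch of persisting the results with the same kind of invoke-sqlcmd conduit I use elsewhere. The ServerInventory.dbo.VMStatus table is something you'd create yourself; all the names here are invented:

$a | %{
	$bios = gwmi Win32_BIOS -ComputerName $_;
	$type = "Physical";
	if($bios.Version -match "VRTUAL") {$type = "Hyper-V"}
	if($bios.SerialNumber -match "VMware") {$type = "VMware"}
	invoke-sqlcmd -ServerInstance localhost -Query "INSERT ServerInventory.dbo.VMStatus (ServerName, VMType, CheckDate) VALUES ('$_', '$type', GETDATE())"
}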

I would have loved to use the SWITCH statement here, but since I'm having to query 2 separate cols, it wouldn't have been the easiest way to do it. However, just in case you get a chance to use it, here's a quick lesson on SWITCH. It's like the CASE statement you see in pretty much every language out there. I won't bother explaining the details of the syntax because I think it's fairly self-explanatory. If you do need help though, ping me and I'll give you a hand, or you could do the responsible thing and bing it on google first.
But here’s a quick example of how you’d use SWITCH.

$a = "hello"

switch ($a)
{
"hello" {"You typed the right word."}
default {"You typed something other than 'hello'."}
}	
}

Again, it’s not a particularly useful example, but it does show you how SWITCH works. It’s really just a CASE statement.

Change your process

This is an excellent example of how you need to be flexible with your processes, even when you’re in the middle of a project.

We started a project to move a DB to a new set of disks. Since the files are large, we probably weren't going to be able to fit it into a single downtime, so we were going to move one file at a time over the next few weeks until they were all done. Well, due to circumstances out of our control, now they all have to be done at the same time. The problem is that now the file copies are going to take in excess of 6hrs, which is way longer than any downtime they would give us. I know, right? Don't worry, I've got big problems with them forcing us to do the operation in one large chunk like this and then saying we can't have the time to do it. So we were doing our test last week, and it did indeed take about 6hrs to copy all the files. And I don't know why it didn't hit me before, but why not change the process? The copy process was there because we were going to piecemeal the operation over several weeks, but since that's gone, maybe it's time to come up with a new strategy.

So that's what I did. My new strategy is an even simpler backup/restore op. All I have to do when I restore is map the files to their new locations and I'm golden. So far it's 6hrs to copy, and I know I can back up in 20-25mins. So my restore, I'm guessing, will be about 30mins (give or take).
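To show what I mean by mapping the files, here's a minimal sketch of the restore side. The DB name, logical file names, and paths are all invented for illustration:

invoke-sqlcmd -ServerInstance localhost -QueryTimeout 0 -Query "
RESTORE DATABASE MyDB
FROM DISK = 'H:\Backups\MyDB.bak'
WITH MOVE 'MyDB_Data' TO 'E:\NewDisks\MyDB_Data.mdf',
     MOVE 'MyDB_Log' TO 'F:\NewDisks\MyDB_Log.ldf',
     REPLACE, STATS = 5;"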

Of course, the backup/restore won't perform that well on its own. You have to tune it so it uses the resources to its advantage. This is where knowing how to tune your backups comes into play. And while I often say that tuning backups is frustrating because you've still gotta leave room for user processing on the box, this is one of those times you can crank it all the way up. Everyone's going to be offline for this op, so the box is completely mine. I can use every ounce of RAM and CPU on the server if I want. And that's what I'm going to do. If you're interested in how to go about tuning your backups, you can look at my recent SQLSAT session on the topic. I did it for SQLSaturday #90 in OKC. You can find the vid page for it here: http://midnightdba.itbookworm.com/Events.aspx
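To give you a taste, here's a hedged sketch of what cranking one up might look like when the box is all yours. The DB name, paths, and numbers are illustrative starting points, not magic values, and COMPRESSION depends on your version and edition:

invoke-sqlcmd -ServerInstance localhost -QueryTimeout 0 -Query "
BACKUP DATABASE MyDB
TO DISK = 'H:\Backups\MyDB_1.bak', DISK = 'H:\Backups\MyDB_2.bak',
   DISK = 'H:\Backups\MyDB_3.bak', DISK = 'H:\Backups\MyDB_4.bak'
WITH BUFFERCOUNT = 500, MAXTRANSFERSIZE = 4194304, COMPRESSION;"

Striping to multiple files plus more buffers and a bigger transfer size is usually where the big wins come from.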

So anyway, the point is that just because you've come up with a way to do something, don't set it in stone. If the scope changes in such a way that you can now do it a better way, don't be afraid to say "Stop, I've got a better way". You may get some pushback from your peers because the project plan is already done and we need to just move forward with what we've got, but that's when you need to push back and say no, this process was developed for a different circumstance and now it's a completely different scenario. So this is no longer a viable method.

Good practice saves the day

Ok, a funny thing happened at work today. I had to write a script to move a DB’s files to another drive. Sounds easy enough, huh? Well in theory it was easy, but leave it to me to mess it up. Now, this was just a dry-run so I don’t feel too bad about it, and it gives me a chance to test my process. So, here’s what happened.

My process was this:

1. Run move stmts for all files (see the sketch after this list).
2. Take DB offline.
3. Run PS script that moves the files.
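For step 1, the move stmts are just ALTER DATABASE MODIFY FILE calls; here's a minimal sketch, with the DB name, logical names, and paths invented:

invoke-sqlcmd -ServerInstance localhost -Query "
ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Data, FILENAME = 'E:\NewDrive\MyDB_Data.mdf');
ALTER DATABASE MyDB MODIFY FILE (NAME = MyDB_Log, FILENAME = 'F:\NewDrive\MyDB_Log.ldf');"

These only change the metadata; the files don't actually move until you move them yourself while the DB is offline.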

The PS script selects from sys.master_files to get a list of the files to move. Here's the problem… I'd already run the move statements, so SQL thought the files were already in the new location. Therefore, the PS script couldn't move them from their current location. However, good practice saves the day.

As a habit I always save a copy of sys.master_files to another table in case something happens. Now it’s a simple matter of changing my PS script from selecting from sys.master_files to selecting from sys.master_filesSEAN. Now I’m back in business. That’s the thing I find about good practices; they help you in ways you can’t even predict most of the time.
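Here's a hedged sketch of both halves of that habit. I'm creating the copy in dbo since you can't put your own tables in the sys schema, and the DB name and path are invented:

# save the copy up front
invoke-sqlcmd -ServerInstance localhost -Database master -Query "SELECT * INTO dbo.master_filesSEAN FROM sys.master_files"

# later, drive the moves from the saved copy instead of the live view
$files = invoke-sqlcmd -ServerInstance localhost -Database master -Query "SELECT physical_name FROM dbo.master_filesSEAN WHERE database_id = DB_ID('MyDB')"
$files | %{Move-Item -Path $_.physical_name -Destination "E:\NewDrive"}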

Here are some other general steps I take to protect against one thing or another:
1. I always try to group like objects in the same schema. It’s just a good idea.
2. I always copy the system DBs when upgrading or patching SQL. Sure I can restore from backup, and I have those too, but if something big happens and I have to completely re-install from scratch it’s much easier to just drop the old system DB files out there than it is to restore from backup.
3. Whenever I’m done working on an important prod system and it’s time to move on to something else, I always close all prod windows and even disconnect from it in object explorer. I don’t want any chance there’s a way to open a new connection to it and do something dumb.

I’d love to give you guys a huge list, and I know there are more, I just can’t think of any right this second.

An interesting log truncation case

Here's a good one for you guys. I got a call from one of my DBAs today saying they were having trouble with the logs on one server not truncating. The logs had brought the server's space to a critical level, and it needed to be fixed. The problem was that the DBA had looked at the log backups and they were fine. The log backup job was running just fine, and there were no active transactions.

I asked if there were any DBs being replicated… no.
OK, I’ll look into it.

So I connect and start my investigation where I always do for log truncation problems.
SELECT name, log_reuse_wait_desc from sys.databases

This tells me exactly what the log is waiting for before it can truncate. It's really a matter of what your goal is. Some people love the troubleshooting process so much that they want to drag it out as long as they can. So they hunt around looking at different things to try to find the problem. They gather evidence and try different things until it's solved. I'm not that way though. I want to find the issue ASAP and get on to other things. So if you really enjoy the hunt, keep doing what you're doing. But if you want to solve the problem quickly, use the above query and get on with your life.
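For reference, the documented values you'll see most in that column are NOTHING, CHECKPOINT, LOG_BACKUP, ACTIVE_TRANSACTION, ACTIVE_BACKUP_OR_RESTORE, DATABASE_MIRRORING, and REPLICATION. So a server in this kind of trouble looks something like this (illustrative output):

name                 log_reuse_wait_desc
----                 -------------------
master               NOTHING
MyAppDB              LOG_BACKUP
ReportDB             LOG_BACKUP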

So anyway, the log was waiting on a log backup. I then checked the log backup job and it had been running just fine all day. So ok, the log is being backed up, but SQL still thinks that it’s waiting on a log backup before it can truncate the log.

This is where a knowledge of SQL comes in. At this point there were only 2 things that could create this situation, because all the DBs on the server were in the same boat with their logs filling up. Either the backup isn't actually running, or it's running with the copy_only flag, and copy-only log backups don't truncate the log.
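You can actually check both possibilities in one shot from the backup history in msdb; backupset has an is_copy_only column, and type = 'L' filters to log backups:

invoke-sqlcmd -ServerInstance localhost -Query "
SELECT TOP (20) database_name, backup_finish_date, is_copy_only
FROM msdb.dbo.backupset
WHERE type = 'L'
ORDER BY backup_finish_date DESC;"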

A quick look at the job properties told me exactly what I needed to know. Looking at the execution times of the different runs, the job was finishing in about 10secs. That seems a little fast to me on a server that has like 200 DBs on it. And looking back at the history, 2 days ago the job was taking 9mins.

At this point, I knew exactly what the problem was. I looked in the SP that runs the backup, and the line that actually runs the backup had been commented out. Someone was trying to make a simple change and commented out the wrong line in the SP.

Look, I’m not smarter than your average DBA, and I’m not necessarily better educated. What I do however, is follow a predictable pattern of troubleshooting, and actually pay attention to the evidence. Sometimes the evidence isn’t clear and you have to make guesses, but most of the time it is.
I understand that the thrill of the hunt keeps you going, but some things are easy enough that you should just get them done. Why a log won’t truncate is such an easy thing to diagnose it should just be commonplace for you. Search somewhere else for your troubleshooting fix.

Partitioning and the free lunch

Quite often there’s no benefit to a feature if you’re not using it, but this time that’s not the case. Even if you’re not using partitioning you can still take advantage of it in a really cool way.

Let’s say you’ve got a table that you load every night, and you want to move the current data to a new table and truncate the current one. Here’s how you’d do something like that today.

1. Select * into T2 from T1
2. Insert into T2 select * from T1
3. Rename T1 to T2. Then recreate T1.

But let's face it, none of those choices is really ideal, as they all take time. Of those, I would choose 1, then 3, then 2, in that order. I don't like renaming objects because the new object gets a new ID, and it's hard to track objects over time if they keep changing IDs. So those are your choices if you're not taking advantage of partitioning.

If you ARE taking advantage of partitioning, however, you've got a different choice. Now you can switch the data out of the current table and into the 2nd one without any data actually moving at all. And the best part is that you don't even have to have partitioned your tables. That's right… every table is automatically a single-partition table, so you're using partitioning whether you know it or not. So in order to truncate T1 and move its data to T2, you only have to do this (and since T1 isn't actually partitioned, you leave the PARTITION clause off):

alter table T1
switch to T2

Now there are some caveats, but as with most things it’s not too bad. The biggest 3 are:
1. The table structures have to be identical.
2. They have to be on the same filegroup. So the partition you’re switching out and the one you’re switching it into have to be on the same filegroup.
3. The target table/partition has to be empty.

There are some others that are advised, but those 3 are the only showstoppers. And once you've got the data into the new table, you can move it to a new filegroup, or do whatever you like with it.
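If you want to see it end to end, here's a self-contained sketch you can run in a scratch DB. The table names are invented, and both tables land on the default filegroup, so they satisfy the caveats above:

invoke-sqlcmd -ServerInstance localhost -Database Sandbox -Query "
CREATE TABLE dbo.T1 (ID int NOT NULL, Payload varchar(50));
CREATE TABLE dbo.T2 (ID int NOT NULL, Payload varchar(50));

INSERT dbo.T1 VALUES (1, 'row one'), (2, 'row two');

ALTER TABLE dbo.T1 SWITCH TO dbo.T2;  -- metadata-only: no data moves

SELECT COUNT(*) AS T1Rows FROM dbo.T1;  -- 0
SELECT COUNT(*) AS T2Rows FROM dbo.T2;  -- 2
"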

So anyway, it’s a wicked cool method for getting rid of your current data even though you’re not officially partitioning your table.

Learning Powershell in Steps

One of the places PS shines is in its ability to let you work the way you want. So often I find someone stagnating in PS because they can't figure out the "proper" way to do the task. So they stare at the screen for a long time, try a few things, and finally give up. The problem with that is that all of a sudden PS is this new thing they don't understand, and they don't have time to learn its nuances well enough to make it useful. So they get discouraged and don't even consider it when coming up with solutions to issues. And this is coupled with the fact that sooooooo many of the PS examples out there are by guys who just don't get it. They post these really long scripts written in native SMO when a simple 1-liner will do. Why would you do that to someone? I've said many times that if those scripts had been my first intro to PS, I wouldn't have given it a second glance either. To me, these guys can definitely code, but they really don't *get* what PS is all about.

So here I'm going to help you with what I'm calling the conduit method of learning PS. Because even now, when I can't figure out how to do something in "proper" PS, I still fall back on what I know. I've said many times in my SQLSaturday sessions that PS will do pretty much whatever you tell it, and that it's perfectly compatible with regular DOS commands. And the same thing goes for T-SQL.

Yesterday I needed to take a user account and add it to every DB on a big server. This server had hundreds of DBs on it. Well, of course I went to PS. My goal was to perform the task as fast as I could and go on about my day, so I chose the conduit method instead of the native PS method. Sure, I could call the regular SMO methods, and maybe even figure out how that's done, and figure out the exact params the methods take, etc., but SMO is so poorly documented it's hard to ferret some of that stuff out. And you know what? Sometimes it's more important to just get the job done than it is to make sure it's done in "proper" or native PS.

So what is this conduit method I'm talking about? Well, it's marrying what you know in T-SQL and using PS only as a conduit for it. This usually consists of using Invoke-SqlCmd inside a loop. This way you get the best of both worlds: you get the power of PS and the simplicity of what you already know, and you don't have to rewrite it every time. So again, the conduit method lets you code in T-SQL and just delivers it in PS. And it's a very powerful technique.

OK, so adding a user account to all my DBs. Here’s what the script looked like:

# run this from the Databases node: SQLSERVER:\SQL\localhost\DEFAULT\Databases
dir | %{invoke-sqlcmd -ServerInstance localhost -database $_.Name -query "CREATE USER [MyUser] FOR LOGIN [MyUser]"}

So I start at the DB tree in PS. If you run a 'dir' command there, you'll get the list of DBs. Now you loop through them, and for each one you call invoke-sqlcmd, connecting to the DB by its Name property (that's why the code above uses $_.Name). Then the T-SQL inside the -query param runs against that DB. It's that easy. Now you've got a repeatable method for running whatever is in the -query param against every DB on any server.

And if you’ve got code that’s more complicated, you can put it in a file and run it that way. It would look like this:

dir | %{invoke-sqlcmd -ServerInstance localhost -database $_.Name -InputFile c:\MySQLFile.sql}

With this you can deploy SPs, views, tables, or whatever you like to every DB. And you can easily alter it to loop across servers too. So even if you're a beginner you can use this method, because it never changes, and you still get the full power out of PS. And that's the point, isn't it?
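And here's roughly what the cross-server version might look like. The server list is invented, and the DEFAULT in the path assumes default instances:

$servers = "Server1", "Server2", "Server3"
foreach($s in $servers) {
	dir SQLSERVER:\SQL\$s\DEFAULT\Databases | %{
		invoke-sqlcmd -ServerInstance $s -Database $_.Name -InputFile c:\MySQLFile.sql
	}
}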

So many big PS guys completely miss the point that PS is supposed to make our lives easier. They insist that things have to be done a certain way, and frankly I'm getting tired of all the little pissing contests that take place. Oh, you have to do it this way because it performs better. Or you have to do it this way because it's more elegant. And while I agree with those to a certain degree, there's nothing more elegant than getting the job done and not having to spend all day doing it. I've had arguments with guys about scripting DB objects, and they insist that native SMO is faster than the way I do it, even though mine is a lot less code. And I've tested it, and native SMO is faster. On one of my DBs that had several thousand objects, the SMO came in like 15secs sooner than the shorter way. Wow, 15secs, really? What will I do with all that extra time? Maybe I can use it to finally write that book I've been planning, huh? So for most cases, these differences are just pissing contests that I have no use for. I don't care if scripting an entire DB takes a few extra secs. I'll spend far more time than that writing the SMO it takes to save that time.

And since a lot of our work is ad hoc in nature, it helps us to be able to write short, tight code. And by ad hoc I mean requests that we couldn't have foreseen, so we have to write something from scratch. I don't know about you, but I don't want my hands in raw SMO 4x a day. I'd rather write a few 1-liners and go to lunch.

So don’t let anyone tell you that the conduit method is worthless or even inferior. It’s just as valid a method as writing 2 pages of SMO to do the same thing, and it performs plenty fast enough. Don’t be afraid of PS. Use the conduit method until you can learn more PS and in the meantime you’ll be more productive than you’ve ever been.

And we’ve got lots of PS vids on MidnightDBA.com, so go check them out and we’ll explain all kinds of things to you.

Instead of working, I blog.