Category Archives: Applied SQL

Create a clustered index and a primary key on existing tables


Your data: unique little snowflakes

It’s the simple things in life that can trip people up. I find that creating primary keys and clustered indexes on existing tables is one that newish DBAs tend to trip up on, and so (like I do), I shall blog about it so there’s a nice, clear, explicit reference available.

Yes, Virginia, you most certainly CAN add a clustered index to a heap without explicitly dropping and recreating the table. Same thing for primary key constraints.

IMPORTANT: Creating a clustered index is a big operation, for large tables. This isn’t something you want to do without planning, on a production server. For more information, see


Requirements for creating a clustered index on an existing table:

  • The table can’t have an existing clustered index.

That’s it. There’s no requirements for uniqueness or NULLability, or NULLs. Note, however, that after a column is part of a clustered index, you can’t alter the properties of that column (so, for example, you couldn’t then make the column NOT NULL).

Requirements for creating a primary key constraint on an existing table:

  • The table can’t have an existing PK constraint.
  • The column can’t have any NULL values.
  • The column must be defined as NOT NULL.

Demo time! Clustered Primary Key

So, let’s take a simple example, where we want a single-column, clustered primary key (on column [id]):

CREATE TABLE t1 ( id INT, val INT );
INSERT INTO t1 VALUES (1, 1), (2,2), (3,3);

The table meets 2 of the 3 requirements for a PK constraint, but [id] is a NULLable field. Let’s fix that:


After that, adding a primary key constraint to an existing table is easy. While we could run a CREATE CLUSTERED INDEX statement first and then the PK, why don’t we just run a single statement and do both at once:


That’s it!

Demo time! Nonclustered Primary Key

Not all primary keys are clustered. In this example, [id] is our primary key, and [val] is our clustered index:

CREATE TABLE t2 ( id INT, val INT );
INSERT INTO t2 VALUES (1, 1), (2,2), (3,3);

Again, we have to make sure the PK column is NOT NULLable:


Now we can create the primary key constraint and the clustered index:




Remember kids, a clustered index is not the same thing as a primary key.

Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.

The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.



A table typically has a column or combination of columns that contain values that uniquely identify each row in the table. This column, or columns, is called the primary key (PK) of the table and enforces the entity integrity of the table. Because primary key constraints guarantee unique data, they are frequently defined on an identity column.

When you specify a primary key constraint for a table, the Database Engine enforces data uniqueness by automatically creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. If a primary key constraint is defined on more than one column, values may be duplicated within one column, but each combination of values from all the columns in the primary key constraint definition must be unique.


Happy days,
Jen McCown

Have you checked out Minion Reindex yet? It’s our free, easy, comprehensive index maintenance solution. You’ll love it, seriously.

Applied SQL: Orphan a user, fix orphaned users (sp_FixOrphanedUsers)

olivertwistEver heard of orphaned users? It’s a database user account, disconnected from its associated login. You typically get this when you take a database backup, and restore it to a different server.

Microsoft provides us with the ever helpful sp_change_users_login to find and repair orphaned users, but even so, it’s kind of a PITA* manual operation. To simplify and provide for automation, I’ve put together the brand-spanking-new sp_FixOrphanedUsers.

First, let’s orphan some users

If you’d like to test this, here is a repro scenario that creates an orphaned user on a single instance. It orphans a user by doing the following:

  1. Create a login “OliverTwist”**.
  2. Create a database.
  3. Create a user in that database from the “OliverTwist” login.
  4. Back up the database, then drop it.
  5. Drop the login.
  6. Restore the database.

I will say that steps 4 and 6 aren’t strictly necessary; just dropping the login will orphan the user. But I’d like to drill the idea of orphans being associated with restores, because that’s the most common scenario in real life.

Important: ALWAYS review code before you run it, even on a test instance. 

Now let’s resolve a single orphaned user

When you have an orphaned user, it’s really easy to detect and fix them. In our case, we’d find the orphan by running

EXEC OrphanedUserDemo..sp_change_users_login 'Report';

And then, we could fix that orphan by first creating the missing login, and then running

EXEC OrphanedUserDemo..sp_change_users_login 'AUTO_FIX', 'OliverTwist';

(Or, if we want the missing login to be created automatically, then “EXEC OrphanedUserDemo..sp_change_users_login ‘AUTO_FIX’, ‘OliverTwist’, NULL, ‘NewPasswordOMG!!!';“)

But this is a manual process, and therefore slow and tedious and not entirely worth our time.

Let’s resolve many orphaned users

Create the stored procedure sp_FixOrphanedUsers in master, and run it as [DBName]..sp_FixOrphanedUsers, and it fixes all orphaned users in that database, if it can do so automatically.

Better still, you could run sp_FixOrphanedUsersAllDB (included) to run this process against all databases on the SQL Server instance.

If you have a regular refresh process – like restoring a set of databases to UAT – you could create a second step in the job to run sp_FixOrphanedUsersAllDB, and greatly simplify your life.

Limitations and caveats

Of COURSE there are limitations, silly reader!

  • Don’t trust code you download from the web. ALWAYS review code before you run it, even on a test instance.
  • This is the first iteration of the procedure(s). Doubtless, you and other readers will have helpful suggestions on how to improve the thing.
  • In this iteration, the SP doesn’t have the ability to specify a password to auto-create missing logins. I think this is a really bad practice. (“Oh, this server doesn’t have a login for ‘GodlyAdminGuy’? Let’s create on with ‘P@ssw0rd!’ for the default!” Yeah, no.)
  • And other things I haven’t yet thought of, but will doubtless be pointed out to me enthusiastically and repeatedly. Feel free, but be cool.


Happy days,
Jen McCown

*PITA = “Pain In The Analytics”, so far as you know.

** GET IT??

Applied SQL: LIKE ‘whatever’

SSMS_practiceToday let’s play around with the LIKE operator. Like, totally.

Like, Introduction

Like “determines whether a specific character string matches a specified pattern. A pattern can include regular characters and wildcard characters.” That makes it pretty darn useful, say, when we’re searching for all product reviews with the word “worst” in it:

SELECT ID, Comments
FROM Review
WHERE Comments LIKE ‘%worst%';

Also, when we’re searching for all product names that begin with the word “Hex”:

FROM Product
WHERE [Name] LIKE ‘Hex%';

We can even use multiple % wildcards to find strings where something is followed by something else. This is particularly useful, I find, when searching sys.sql_modules for code within stored procedures:

SELECT object_id, definition
FROM sys.sql_modules
WHERE definition like ‘%DELETE%Customer%';

You get the idea. You can also, of course, find strings that don’t match a certain pattern, using NOT LIKE. For example, let’s say we want all departments that aren’t production related:

FROM Department
WHERE [Name] NOT LIKE ‘%Production%';

Like, Really Cool

But wait, there’s more!

Go back and look over that TechNet article on LIKE. Check out the table in the “pattern” section. We’re familiar with the % wildcard – “Any string of zero or more characters”. But pay attention to the next one: the underscore (_), which represents a single character.

Now we can do things like, oh, getting all strings that begin with “t” and are exactly five characters long:

SELECT petName
FROM myPet
WHERE petName LIKE ‘t____';

You can, of course, use WHERE LEN(petName) = 5 AND petName LIKE ‘t%’ instead. Nothing’s stopping you. But never overlook the beauty of being able to do things in more than one way.

Edit: Someone over on Reddit pointed out that this comment deserves a little more explanation. He’s right: you should note that LEN(petName)=5  makes the predicate non-sargable, meaning that that bit won’t be able to take advantage of a useful index. However, petName LIKE ‘_____’ is ALSO non-sargable, though it does seem to provide a better estimated number of rows, which would potentially result in a better plan.

In our scenario, this is all slightly moot, because that leading t IS a sargable search term: it will use an appropriate index, if available, to seek out data. Even so, I’d probably go with LIKE ‘t____’, knowing what I know now about the estimated rows accuracy.

We also have the ability to search for character ranges:

SELECT petName
FROM myPet
WHERE petName LIKE ‘[D-R]ex';

This will return any instances of Dex, Hex, Lex, Mex, Rex, and so on…but not Bex or Wex.

Go thou, and and play around with, like, whatever.

Happy days,
Jen McCown