Category Archives: Applied SQL

“Thing IS NOT NULL” isn’t the same as “Thing != NULL”

Nonsensical questions: What color do you want that DB?
Nonsensical questions: What color do you want that DB?

I’ve talked about NULLs before. (Ref: Here’s a bit on ISNULL(), and here’s one about understanding NULL at a basic level.)  And yet, it’s worthwhile to bring these things up from time to time.

“[Thing] != NULL” is nonsense. It won’t work the way you want it to work, because NULL literally means “unknown”. Nothing in the world can be equal to, or not equal to, an unknown.

The TechNet article on IS [NOT] NULL says:

To determine whether an expression is NULL, use IS NULL or IS NOT NULL instead of comparison operators (such as = or !=). Comparison operators return UNKNOWN when either or both arguments are NULL.

Example time – what does != NULL do?

Let’s walk through an example.

This will create a temporary table and add three rows, one of which has a NULL value in the column “mychar”:

CREATE TABLE #tmp (id INT IDENTITY(1,1), mychar CHAR(1) NULL);
INSERT INTO #tmp (mychar) VALUES ('a'), ('b'), (NULL);
SELECT id, mychar FROM #tmp;

So, the data in our table is:

id mychar
1 a
2 b
3 NULL

Now it’s very easy to demonstrate how the IS NULL/IS NOT NULL predicates will compare against the =/!= comparison operators.

IS NULL vs =

First, let’s take a look at IS NULL versus = NULL.

select id, mychar FROM #tmp WHERE mychar IS NULL;
select id, mychar FROM #tmp WHERE mychar = NULL;

The IS NULL query returns one row, as it should:

id mychar
3 NULL

But the “= NULL” query returns no rows. Why? Because SQL has NO IDEA what you’re talking about. “Does mychar equal an unknown? Um. I don’t know? Whatever, here’s your empty result set.

IS NOT NULL vs !=

And now, IS NOT NULL versus != NULL.

select id, mychar FROM #tmp WHERE mychar IS NOT NULL;
select id, mychar FROM #tmp WHERE mychar != NULL;

The IS NULL query returns two rows, as it should:

id mychar
1 a
2 b

But again, the query with the comparison operator “!= NULL” returns no rows.  “Does mychar not equal an unknown?” is, as far as SQL Server is concerned, not a valid question. It’s exactly like asking a coworker, “What sized year did you eat for lunch?” or “What color do you want that database?

A little discussion…

As to whether it’s faster, as someone asked, to use = NULL and != NULL…I haven’t run the benchmarks, as it’s nonsense. But I’d assume these would be faster than IS NULL/IS NOT NULL, since SQL doesn’t bother trying to work out “is this equal to unknown”. You’re just not going to get any rows back.

If you’d like to speed up performance in this way, it’d be much simpler and clearer to use WHERE 1=2; or to not write the query at all.

Happy Days,
Jen
http://www.MidnightDBA.com/Jen

Have you checked out Minion Reindex yet? It’s our free, easy, comprehensive index maintenance solution. You’ll love it, seriously.

Create a clustered index and a primary key on existing tables

Snow_Flakes
Your data: unique little snowflakes

It’s the simple things in life that can trip people up. I find that creating primary keys and clustered indexes on existing tables is one that newish DBAs tend to trip up on, and so (like I do), I shall blog about it so there’s a nice, clear, explicit reference available.

Yes, Virginia, you most certainly CAN add a clustered index to a heap without explicitly dropping and recreating the table. Same thing for primary key constraints.

IMPORTANT: Creating a clustered index is a big operation, for large tables. This isn’t something you want to do without planning, on a production server. For more information, see http://sqlmag.com/blog/what-happens-if-i-drop-clustered-index

Requirements

Requirements for creating a clustered index on an existing table:

  • The table can’t have an existing clustered index.

That’s it. There’s no requirements for uniqueness or NULLability, or NULLs. Note, however, that after a column is part of a clustered index, you can’t alter the properties of that column (so, for example, you couldn’t then make the column NOT NULL).

Requirements for creating a primary key constraint on an existing table:

  • The table can’t have an existing PK constraint.
  • The column can’t have any NULL values.
  • The column must be defined as NOT NULL.

Demo time! Clustered Primary Key

So, let’s take a simple example, where we want a single-column, clustered primary key (on column [id]):

CREATE TABLE t1 ( id INT, val INT );
INSERT INTO t1 VALUES (1, 1), (2,2), (3,3);

The table meets 2 of the 3 requirements for a PK constraint, but [id] is a NULLable field. Let’s fix that:

ALTER TABLE t1 ALTER COLUMN id INT NOT NULL;

After that, adding a primary key constraint to an existing table is easy. While we could run a CREATE CLUSTERED INDEX statement first and then the PK, why don’t we just run a single statement and do both at once:

ALTER TABLE t1 ADD CONSTRAINT pk_t1
 PRIMARY KEY CLUSTERED (id);

That’s it!

Demo time! Nonclustered Primary Key

Not all primary keys are clustered. In this example, [id] is our primary key, and [val] is our clustered index:

CREATE TABLE t2 ( id INT, val INT );
INSERT INTO t2 VALUES (1, 1), (2,2), (3,3);

Again, we have to make sure the PK column is NOT NULLable:

ALTER TABLE t2 ALTER COLUMN id INT NOT NULL;

Now we can create the primary key constraint and the clustered index:

ALTER TABLE t2 ADD CONSTRAINT pk_t1
PRIMARY KEY NONCLUSTERED (id);
CREATE CLUSTERED INDEX ix1 ON t2 (val);

Ta-dah!

Reference

Remember kids, a clustered index is not the same thing as a primary key.

Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.

The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.

-http://msdn.microsoft.com/en-us/library/ms190457.aspx

Whereas…

A table typically has a column or combination of columns that contain values that uniquely identify each row in the table. This column, or columns, is called the primary key (PK) of the table and enforces the entity integrity of the table. Because primary key constraints guarantee unique data, they are frequently defined on an identity column.

When you specify a primary key constraint for a table, the Database Engine enforces data uniqueness by automatically creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. If a primary key constraint is defined on more than one column, values may be duplicated within one column, but each combination of values from all the columns in the primary key constraint definition must be unique.

-http://msdn.microsoft.com/en-us/library/ms179610.aspx

Happy days,
Jen McCown
http://www.MidnightDBA.com/Jen

Have you checked out Minion Reindex yet? It’s our free, easy, comprehensive index maintenance solution. You’ll love it, seriously.

Applied SQL: Orphan a user, fix orphaned users (sp_FixOrphanedUsers)

olivertwistEver heard of orphaned users? It’s a database user account, disconnected from its associated login. You typically get this when you take a database backup, and restore it to a different server.

Microsoft provides us with the ever helpful sp_change_users_login to find and repair orphaned users, but even so, it’s kind of a PITA* manual operation. To simplify and provide for automation, I’ve put together the brand-spanking-new sp_FixOrphanedUsers.

First, let’s orphan some users

If you’d like to test this, here is a repro scenario that creates an orphaned user on a single instance. It orphans a user by doing the following:

  1. Create a login “OliverTwist”**.
  2. Create a database.
  3. Create a user in that database from the “OliverTwist” login.
  4. Back up the database, then drop it.
  5. Drop the login.
  6. Restore the database.

I will say that steps 4 and 6 aren’t strictly necessary; just dropping the login will orphan the user. But I’d like to drill the idea of orphans being associated with restores, because that’s the most common scenario in real life.

Important: ALWAYS review code before you run it, even on a test instance. 

Now let’s resolve a single orphaned user

When you have an orphaned user, it’s really easy to detect and fix them. In our case, we’d find the orphan by running

EXEC OrphanedUserDemo..sp_change_users_login 'Report';

And then, we could fix that orphan by first creating the missing login, and then running

EXEC OrphanedUserDemo..sp_change_users_login 'AUTO_FIX', 'OliverTwist';

(Or, if we want the missing login to be created automatically, then “EXEC OrphanedUserDemo..sp_change_users_login ‘AUTO_FIX’, ‘OliverTwist’, NULL, ‘NewPasswordOMG!!!';“)

But this is a manual process, and therefore slow and tedious and not entirely worth our time.

Let’s resolve many orphaned users

Create the stored procedure sp_FixOrphanedUsers in master, and run it as [DBName]..sp_FixOrphanedUsers, and it fixes all orphaned users in that database, if it can do so automatically.

Better still, you could run sp_FixOrphanedUsersAllDB (included) to run this process against all databases on the SQL Server instance.

If you have a regular refresh process – like restoring a set of databases to UAT – you could create a second step in the job to run sp_FixOrphanedUsersAllDB, and greatly simplify your life.

Limitations and caveats

Of COURSE there are limitations, silly reader!

  • Don’t trust code you download from the web. ALWAYS review code before you run it, even on a test instance.
  • This is the first iteration of the procedure(s). Doubtless, you and other readers will have helpful suggestions on how to improve the thing.
  • In this iteration, the SP doesn’t have the ability to specify a password to auto-create missing logins. I think this is a really bad practice. (“Oh, this server doesn’t have a login for ‘GodlyAdminGuy’? Let’s create on with ‘P@ssw0rd!’ for the default!” Yeah, no.)
  • And other things I haven’t yet thought of, but will doubtless be pointed out to me enthusiastically and repeatedly. Feel free, but be cool.

Downloads

Happy days,
Jen McCown
http://www.MidnightDBA.com/Jen

*PITA = “Pain In The Analytics”, so far as you know.

** GET IT??