Create a clustered index and a primary key on existing tables

Snow_Flakes
Your data: unique little snowflakes

It’s the simple things in life that can trip people up. I find that creating primary keys and clustered indexes on existing tables is one that newish DBAs tend to trip up on, and so (like I do), I shall blog about it so there’s a nice, clear, explicit reference available.

Yes, Virginia, you most certainly CAN add a clustered index to a heap without explicitly dropping and recreating the table. Same thing for primary key constraints.

IMPORTANT: Creating a clustered index is a big operation, for large tables. This isn’t something you want to do without planning, on a production server. For more information, see http://sqlmag.com/blog/what-happens-if-i-drop-clustered-index

Requirements

Requirements for creating a clustered index on an existing table:

  • The table can’t have an existing clustered index.

That’s it. There’s no requirements for uniqueness or NULLability, or NULLs. Note, however, that after a column is part of a clustered index, you can’t alter the properties of that column (so, for example, you couldn’t then make the column NOT NULL).

Requirements for creating a primary key constraint on an existing table:

  • The table can’t have an existing PK constraint.
  • The column can’t have any NULL values.
  • The column must be defined as NOT NULL.

Demo time! Clustered Primary Key

So, let’s take a simple example, where we want a single-column, clustered primary key (on column [id]):

CREATE TABLE t1 ( id INT, val INT );
INSERT INTO t1 VALUES (1, 1), (2,2), (3,3);

The table meets 2 of the 3 requirements for a PK constraint, but [id] is a NULLable field. Let’s fix that:

ALTER TABLE t1 ALTER COLUMN id INT NOT NULL;

After that, adding a primary key constraint to an existing table is easy. While we could run a CREATE CLUSTERED INDEX statement first and then the PK, why don’t we just run a single statement and do both at once:

ALTER TABLE t1 ADD CONSTRAINT pk_t1
 PRIMARY KEY CLUSTERED (id);

That’s it!

Demo time! Nonclustered Primary Key

Not all primary keys are clustered. In this example, [id] is our primary key, and [val] is our clustered index:

CREATE TABLE t2 ( id INT, val INT );
INSERT INTO t2 VALUES (1, 1), (2,2), (3,3);

Again, we have to make sure the PK column is NOT NULLable:

ALTER TABLE t2 ALTER COLUMN id INT NOT NULL;

Now we can create the primary key constraint and the clustered index:

ALTER TABLE t2 ADD CONSTRAINT pk_t1
PRIMARY KEY NONCLUSTERED (id);
CREATE CLUSTERED INDEX ix1 ON t2 (val);

Ta-dah!

Reference

Remember kids, a clustered index is not the same thing as a primary key.

Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.

The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.

http://msdn.microsoft.com/en-us/library/ms190457.aspx

Whereas…

A table typically has a column or combination of columns that contain values that uniquely identify each row in the table. This column, or columns, is called the primary key (PK) of the table and enforces the entity integrity of the table. Because primary key constraints guarantee unique data, they are frequently defined on an identity column.

When you specify a primary key constraint for a table, the Database Engine enforces data uniqueness by automatically creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. If a primary key constraint is defined on more than one column, values may be duplicated within one column, but each combination of values from all the columns in the primary key constraint definition must be unique.

http://msdn.microsoft.com/en-us/library/ms179610.aspx

Happy days,
Jen McCown
http://www.MidnightDBA.com/Jen

Have you checked out Minion Reindex yet? It’s our free, easy, comprehensive index maintenance solution. You’ll love it, seriously.

5 thoughts on “Create a clustered index and a primary key on existing tables

  1. Rajesh

    ALTER TABLE t2 ADD CONSTRAINT pk_t1
    PRIMARY KEY NONCLUSTERED (id);

    will give error (There is already an object named ‘pk_t1’ in the database.)

  2. Michael Rybicki

    I think the title of your post is completely wrong:
    CREATE A CLUSTERED INDEX AND A PRIMARY KEY “ON EXISTING TABLES”

    You just created a new table. This is not EXISTING. The challenge many people face is changing you mind on how you want the table clustered after you’ve already created a PRIMARY KEY CLUSTERED like most tables are created at the beginning.

  3. Kris Condiff

    I am a beginner php programmer and want to develop a mlm web application in php & mysql,but I have no idea about it.My mlm webapplication’s requirement is-1.It work on binary tree in which every node only have left and right child.2.In this we can add node as our wish which be add in left or right.So how can implement it?

Comments are closed.