It’s the simple things in life that can trip people up. I find that creating primary keys and clustered indexes on existing tables is one that newish DBAs tend to trip up on, and so (like I do), I shall blog about it so there’s a nice, clear, explicit reference available.
Yes, Virginia, you most certainly CAN add a clustered index to a heap without explicitly dropping and recreating the table. Same thing for primary key constraints.
IMPORTANT: Creating a clustered index is a big operation, for large tables. This isn’t something you want to do without planning, on a production server. For more information, see http://sqlmag.com/blog/what-happens-if-i-drop-clustered-index
Requirements for creating a clustered index on an existing table:
- The table can’t have an existing clustered index.
That’s it. There’s no requirements for uniqueness or NULLability, or NULLs. Note, however, that after a column is part of a clustered index, you can’t alter the properties of that column (so, for example, you couldn’t then make the column NOT NULL).
Requirements for creating a primary key constraint on an existing table:
- The table can’t have an existing PK constraint.
- The column can’t have any NULL values.
- The column must be defined as NOT NULL.
Demo time! Clustered Primary Key
So, let’s take a simple example, where we want a single-column, clustered primary key (on column [id]):
CREATE TABLE t1 ( id INT, val INT ); INSERT INTO t1 VALUES (1, 1), (2,2), (3,3);
The table meets 2 of the 3 requirements for a PK constraint, but [id] is a NULLable field. Let’s fix that:
ALTER TABLE t1 ALTER COLUMN id INT NOT NULL;
After that, adding a primary key constraint to an existing table is easy. While we could run a CREATE CLUSTERED INDEX statement first and then the PK, why don’t we just run a single statement and do both at once:
ALTER TABLE t1 ADD CONSTRAINT pk_t1 PRIMARY KEY CLUSTERED (id);
Demo time! Nonclustered Primary Key
Not all primary keys are clustered. In this example, [id] is our primary key, and [val] is our clustered index:
CREATE TABLE t2 ( id INT, val INT ); INSERT INTO t2 VALUES (1, 1), (2,2), (3,3);
Again, we have to make sure the PK column is NOT NULLable:
ALTER TABLE t2 ALTER COLUMN id INT NOT NULL;
Now we can create the primary key constraint and the clustered index:
ALTER TABLE t2 ADD CONSTRAINT pk_t1 PRIMARY KEY NONCLUSTERED (id); CREATE CLUSTERED INDEX ix1 ON t2 (val);
Remember kids, a clustered index is not the same thing as a primary key.
Clustered indexes sort and store the data rows in the table or view based on their key values. These are the columns included in the index definition. There can be only one clustered index per table, because the data rows themselves can be sorted in only one order.
The only time the data rows in a table are stored in sorted order is when the table contains a clustered index. When a table has a clustered index, the table is called a clustered table. If a table has no clustered index, its data rows are stored in an unordered structure called a heap.
A table typically has a column or combination of columns that contain values that uniquely identify each row in the table. This column, or columns, is called the primary key (PK) of the table and enforces the entity integrity of the table. Because primary key constraints guarantee unique data, they are frequently defined on an identity column.
When you specify a primary key constraint for a table, the Database Engine enforces data uniqueness by automatically creating a unique index for the primary key columns. This index also permits fast access to data when the primary key is used in queries. If a primary key constraint is defined on more than one column, values may be duplicated within one column, but each combination of values from all the columns in the primary key constraint definition must be unique.
Have you checked out Minion Reindex yet? It’s our free, easy, comprehensive index maintenance solution. You’ll love it, seriously.