A Glance at Table Design

There exist tutorials and large sections of many books devoted to all the principles that go into good table design. Today let's rattle off a quick summary of important points:

Modeling – A table models something in the real world…either something tangible (like socks or beer) or not (like services or phone calls).

Names – Table names should be descriptive, but not burdensome. E.g., Vendor instead of BeerVendorsWeUseAsPartOfOurBusiness. I personally don't use prefixes like tbl_, and I use the singular of what the table models ("Beer" instead of "Beers"). I'm ambivalent about using underscores, but I think you should pick a method and stick with it.

Columns – Just like in the real world, a thing modeled by a table has attributes, like name, age, color. Each attribute is data of a specific type. Be sure to use appropriate data types to take advantage of data type-specific features (like DateAdd, or LEN), save storage space on disk, and make database operations as efficient as possible

NULLs – Depending on our business model, some data will be REQUIRED for the object in question, and some will be optional (or optional part of the time). Explicitly define NULLable columns for data that can be "unknown", and NOT NULL columns for data that must be known. Remember to use IS NULL in your stored procedure code as appropriate, to deal with the possibility of unknown data.

PKs – Pick your side – either use natural keys (the smallest number of columns that define a row uniquely) or use my preferred method, surrogate keys (like INT IDENTITY(1,1)) plus a unique constraint on the natural keys. Define the PK explicitly, and stick with your side. Your PK wont' necessarily also be your clustered index, but if it is, define the PK constraint and the clustered index separately for supportability (more on that in a future blog). For more on selecting a clustered index, see SQL Server 2008 BOL "Clustered Index Design Guidelines".

No repeating columns – The object we're modeling has a DISTINCT set of attributes; if we're modeling customers, for example, we know what data we want to capture about one customer. If you find yourself creating a table with extra columns for an unknown number of attributes, you're violating good table design. For example, we want to track a customer's name, birthdate, and pets. THIS IS WRONG:

Name varchar(50) NOT NULL,
birthDate smalldatetime NULL,
dog1 varchar(30),
dog2 varchar(30),
dog3 varchar(30),
dog4 varchar(30),
cat1 varchar(30),
cat2 varchar(30),
cat3 varchar(30),
…

Those dog and cat columns should be removed to a separate table that links to the Customer table with a foreign key constraint. That way, each customer can have as many (or as few) pets as they want – our database is scalable! – and we save space on disk because we're not carrying tons of empty columns.

FKs – When one table refers to another – for example, our CustomerPet table has a CustomerID column that refers to the Customer table – define an explicit foreign key constraint on that relationship to prevent orphaned records.

————–

That's a good start for today. Remember to read up further on good table design in SQL Server Books Online, and in your favorite blogs and books everywhere.

Happy days,

Jen McCown
http://www.midnightdba.com