A Lovely Statistics Query

Most of you know that you can set SQL to auto update stats.  And I’m sure that most of you know that it invalidates stats at about 20%.  What that means is that when about 20% of the data in the table has changed, the stats are invalidated.  It’s actually 500 + 20% if you wanna be specific.  And notice I say they’re invalidated and not updated.  Stats aren’t updated when that 20% mark is reached, they’re only invalidated.  They’re not rebuilt until they’re needed.  Well, for really large tables that can result in stats being oudated longer than they should because the amount of changes needed to invalidate them is much higher.  The same issue can occur with smaller tables where the changes are varied enough to throw the distribution out of whack.  Some tables are just more sensitive to change than others and that 20% may be too much.  You may find that you need to update stats more often… say at 10% or in extreme cases even 5%.

Here’s a sexy little query I put together that will give you the percentage of change an index has had.  You can use it to not only run a job to update stats on a different percentage, but you can also just query it and see what percentage your tables are at and how fast they change.  It’s an excellent tool for troubleshooting possible stats problems.  All the same, it’s pretty handy. 

–Get stats percentage of change for tables.

SELECT
so.SCHEMA_ID,
ss.NAME AS [Schema],
OBJECT_NAME(id) AS TableName, si.NAME AS IndexName,
CAST((CAST(rowmodctr AS float)/CAST(rowcnt AS float))*100 AS int)  AS Pct,
rowcnt,
STATS_DATE([id], indid) AS [StatsDate]
FROM sys.sysindexes si
INNER JOIN sys.objects so
ON si.id = so.object_id
INNER JOIN sys.schemas ss
ON so.schema_id = ss.schema_id
WHERE rowcnt >= 1000000--500
AND rowmodctr > 0
--AND OBJECT_NAME(id) = 'TableName'
ORDER BY Pct DESC, rowcnt DESC, rowmodctr

Here are a couple notes on the query:

1.  Notice there’s a way to query for a specific table.

2.  The StatsDate col shows you the last time the stats were updated.

3.  I’m also limiting the rowcount to indexes with more than 1mill rows.  Feel free to lower that if you like.

4.  rowmodctr > 0 is something I threw in there to keep the divide by zero error out of there.  This also filters out the system-created stats.

OK, I hope you guys like this one.  I personally love it.  It’s allowed me to build a process to update stats on some of my tables more aggressively than 20% and keep my server running at its peak.

5 thoughts on “A Lovely Statistics Query”

  1. Thanks for this useful info!

    So where I see greater than 100%, this means the statistics content has rolled completely over since the statistics where taken?

    When the IndexName is null, what non-index object is the row referring to?
    Do you mean to filter these from the query, or do you remedy this situation somehow?

  2. That, and does the StatsDate refer to last time it re-indexed, re-organized, or both?

  3. Hey everybody, sorry, I’m horrible about coming on and approving comments. One of you broke down and emailed me and I answered, so I’ll just paste the Q/A in here for you. It answers a lot of the questions you’ve been asking.

    · Where I see greater than 100% (eg. “450%”), does this mean the original statistics content has rolled completely out of existance since the statistics where last taken?

    SM: The way stats work is like this… at about 20% of change, the stats are invalidated. And many people think that they’re recomputed at that point, but they’re not. They’re only invalidated at 20%. They’re not recalculated until they’re needed again. So when you see a value like 450%, that means that the data in the table has changed a lot, but the stats haven’t been used so they’ve never been triggered for update. A really good way to test this is to pick an index that these stats percentages are really high for… then force a select that uses that index… so if the index is on col1 then ‘select col1 from table where col1 = 1234’… that’ll force you to use the index… the query will run, and the stats will be run in the background (if you’ve got it set for asynch stats building)… if you don’t and you have it for synch stats building then it’ll make the query wait while it builds new stats… all the same, you should be able to run the stats query again and see that the index has fresh stats. Keep in mind that depending on how big the table is, it could take a few mins to build the stats on the index.

    · When the IndexName is null in the resultset, what “non-index object” is the row referring to?

    SM: I typically lookup individual indexes with this anymore. I wrote this quite some time ago… but after some investigation it’s un-indexed tables so you can filter those out.

    · Do you mean to filter these “IndexName=null” rows from the query, or do you remedy this situation somehow?

    · Lastly, does the StatsDate refer to last time it re-indexed, re-organized, or both?

    SM: StatsDate is the last time the stats were recomputed. This could be because of a reindexing, an autostats run, or a manual stats run… it doesn’t matter… just the last time they were recomputed.

Comments are closed.