We got a question recently that basically boiled down to this: Nested views shouldn’t be such a performance suck. Why doesn’t Microsoft make the optimizer deal with nested views better?
We’ve looked into this a good deal – nested views are one of my pet peeves, and I’ve blogged about it at least a couple of times (see Detangling Nested Views and Proving Views Can Hurt Performance). What it boils down to is, detangling nested views is way harder than you would think, even for a computer.
“It’s not that hard…”
The straight set math gets exponentially more complicated with every additional level of nesting. You can see this pretty easily if you put together a nested view of 2 levels deep, vs 3 or 4 levels, and then try to pick apart the logic. Plus, each view references anywhere from 1, to some ludicrously high number of tables, often repeating joins to the same tables in different ways. So, it’s even more complex than it first appears to a casual observer.
Go ahead, take an afternoon or seven and try refactoring a nested view manually. You’ll start to feel a lot sorrier for the optimizer.
“Yeah but, the computer could do it, right?”
I’ve talked to the optimizer team about this sort of thing, because it really does seem mathematically possible to work out the best possible plan for a nested view…something that eliminates unnecessary joins, simplifies duplicate result sets, and so on. I’d be willing to bet that it is mathematically possible. But the query optimizer itself gives up trying to find a better plan after a certain amount of time*, and that’s by design. In most cases, it’s not worth it to spend 20 seconds (or 90 minutes) finding the perfect, best optimized plan, if you’re taking the execution time from 4 seconds down to 3 with that better plan.
(By they way, take a quick look at this blog post by Jason Strate…pay attention to the bit with “StatementOptmEarlyAbortReason”…what we’re talking about is either “TimeOut” or “GoodEnoughPlanFound”: http://www.jasonstrate.com/2010/12/can-you-dig-it-stmtsimple-element/ ).
So yeah, most likely the optimizer could certainly find some absolutely fabulous plan through the much of a deeply nested view, but it simply isn’t worth it in most cases. It quickly gets to the point where an optimization like that might (theoretically) take hours, if not days to work out the perfect plan. And then in the future, when the plan is pushed out of cache, the optimizer will have to do it all over again. And then, again. And on it goes.
“Well FINE then. So what do I do with these 9-layer views?”
It might be nice if MS gave us the option of turning on a “find best plan at all cost” settings, if only for troubleshooting and refactoring, but it certainly hasn’t happened yet. I myself have taken a couple of pokes at starting a tool that could do some of the refactoring for you, but only ever got so far as to list out the referenced elements of a nested view. In all, your best bet is to take the time to work out what it is that the system needs, and refactor your nested views yourself. Your queries, and your optimizer, will thank you for it.
* Technically this isn’t really a timeout, but a number of attempts to find the plan. The number of attempts the optimizer makes depends on query complexity. Once it runs through that number of attempts, it gives up…but they call it a timeout. Whatever works, I guess. Thanks to Grant Fritchey for that clarification!
Speaking of Grant: For an excellent article on how the SQL Server optimizer chooses when to give up on your
ridiculously overcomplex unique snowflake of a query, see Grant Fritchey’s post “Reason for Early Termination of Statement“.