Home » Intermediate »sqlserverpedia-syndication »SSC » Currently Reading:

Puzzling TVFunctions

July 17, 2012 Intermediate, sqlserverpedia-syndication, SSC 9 Comments

I had the opportunity to tune a highly-used function the other day, and was most puzzled by the difference between the old and the new function.  This was on SQL Server 2005, and I was able to repro the issue on-server, but not on my SQL Server 2012 instance.

The old function specified the return table; it actually inserted the data into a table variable, and returned that:

-- =============================================
-- The working, but slow, function:
-- =============================================
CREATE FUNCTION test.myfunction_tableVar ( @Data NVARCHAR(800) )
RETURNS @tableVar TABLE ( data VARCHAR(800) )

        INSERT  @tableVar
                ( data )
        VALUES  ( LTRIM(RTRIM(@data)) ) ;


Note that these have been greatly simplified from the original functions, but still showed the same odd behavior.

Anyway, when I created a new function for testing, I neglected to use a table variable; instead, I just specified a TABLE return:

-- =============================================
-- The new, faster, wrong-resultset function:
-- =============================================
CREATE FUNCTION test.myfunction_table ( @Data NVARCHAR(800) )
SELECT LTRIM(RTRIM(@data)) AS [data] ;

I tested these thoroughly; a given input produced a single row of output, and the output of one function was IDENTICAL to the output of the other in all cases. I even joined the two functions on a.data = b.data, proving the identical output. However, when I used these in a query against live data, I got different resultsets from the query:

-- This returned 915 rows, as it should:
SELECT  idnumber
FROM    test.myTable
        INNER JOIN Speak.test.myfunction_tableVar('nom') f ON CHARINDEX(f.data,
                                                              [notes]) > 0;
-- This returned 911 rows!
SELECT  idnumber
FROM    test.myTable
        INNER JOIN Speak.test.myfunction_table('nom') f ON CHARINDEX(f.data,
                                                              [notes]) > 0;

So, help me out here. How can two functions with the EXACT same resultset, and used in the exact same way, turn out different resultsets when used in a query? Is this a bug, or a feature?

Happy days,
Jen McCown

Currently there are "9 comments" on this Article:

  1. Kevin Boles says:

    You didn’t specify the length of the output field in the second function, thus it gets defined by default as something (varchar(50) IIRC) and you get some data truncations and misses on the joins?

  2. Kevin Boles says:

    Well, I missed that you joined the two functions together and got same output. Hmm, try doing a LEN() on the outputs? And can you isolate the 4 missing rows to see what is special about them?

  3. Kevin Boles says:

    Also just noticed that first function defines output field as VARCHAR(800), as opposed to Nvarchar(800) which is the input. Could be a simplification thing though…

  4. Scott Newman says:

    No idea. lurking…

  5. Chuck Rummel says:

    I see now. 1st function is multi-statement, but 2nd the optimizer may be able to inline, so the optimizer may be doing something additionally strange, you may want to look at the execution plans. But I also like where Kevin was thinking regarding the specification of the schema as a possible reason for the difference. Do they really need to be returning tables if they are returning a single value, or is that just the oversimplification of obfuscation? If you’re using the functions to parse a multi-value string into rows, I’ve been recently working to optimize several similar functions (for example to eliminate the use of while loops, ugh) and there are lots of posts with lots of examples for doing that. Beyond optimizing the function itself, you may consider inserting the function results to a separate #temp table then joining #temp to myTable, to let the optimizer get better stats/rowcount/histogram metadata to work with. Let me know what you find works best for you.

  6. Kev Riley says:

    I agree with Kevin, and would start by looking at the 4 ‘odd’ rows. It’s interesting that when you joined the functions, they showed no differences, there could be some implicit conversion going on here. The difference in the query is that you have now introduced a 3rd and 4th element, the table `test.myTable` and the Charindex function – the combination of which has got to be hiding the reason…..

  7. Scott Minar says:

    A quick one to rule out is throwing in a MAXDOP 1 hint. There’s a bug out there where parallel plans can produce inconsistent results.

  8. James says:

    Did you compare the result sets using the INTERSECT, EXCEPT, UNION, and UNION ALL operators?

  9. Brian Ezell says:

    I’m curious if it is related to the varchar(800) vs. nvarchar(800) data types. I created a scenario that appears to create the same issue you are seeing:

    CREATE TABLE TestSearch
    TestSearchID int IDENTITY(1,1) PRIMARY KEY,
    Note nvarchar(500)

    INSERT INTO TestSearch(Note)

    –update customer
    –set name = ‘Minyā’
    –where customerid = 10002

    select *
    from TestSearch
    INNER JOIN test.myfunction_tableVar(N’Minyā’) f
    ON CHARINDEX(f.data, note) > 0

    select *
    from TestSearch
    INNER JOIN test.myfunction_table(N’Minyā’) f
    ON CHARINDEX(f.data, note) > 0

    DROP TABLE TestSearch;

Comment on this Article:

Release Date: Minion by MidnightDBA

Just 2 days 12 hours 18 minutes 33 seconds until we release Minion Redindex: index maintenance done right!

MidnightSQL Consulting

Need help? Got an emergency? Write us at Support@MidnightDBA.com!

We can schedule time to help with your backup/restore issues, high availability and disaster recovery setup, performance problems, and a great deal more. Very often, we're even available on the moment for downtime issues and emergencies.

For more information about MidnightSQL consulting, email us or check out www.MidnightSQL.com. Happy days!

Where are We?

November 3-7: PASS Summit, Seattle, WA

PASS Summit: Jen is presenting How to Interview a DBA: A Panel Debate on Thursday 11/6 1:30pm, room 401 (along with Adam Machanic, Sean McCown, Bob Pusateri, and Michelle Ufford).

PASS Summit: Sean is presenting Performance Tuning Your Backups on Wednesday 11/5 3:00pm, room 602-604.

December 11: Presenting "Powershell Cmdlets.." at Alaska SQL Server User Group

January 30: "Become an Enterprise DBA" precon at Austin SQL Saturday

Blog Posts by Category


How to Eat Pop-tarts
Watch DBAs@Midnight live on Fridays,m 11pm Central time

The best database career advice you’ve never heard!


The DBA Roadmap Seminar is 7 MP3 tracks (over 5 hours!) of insider guidance on your database career. We'll teach you how and what to study as a DBA, weigh in on controversial resume debates, teach you to recognize a worthy recruiter, and discuss the new professionalism of interviews. Also some bonus materials, PDF companion guides, and really spiffy intro music!

Once your $99 PayPal payment is submitted, you'll get the download link in e-mail! (Download is a 370Mb ZIP file.)

Become a DBA. Become a BETTER DBA. Use the Roadmap.

Visit www.DBARoadmap.com for info, forums, and more!

Add to Cart View Cart

Cunningham’s Law

"The best way to get the right answer on the Internet is not to ask a question, it's to post the wrong answer."
Relevant: http://xkcd.com/386/