“Thing IS NOT NULL” isn’t the same as “Thing != NULL”

Nonsensical questions: What color do you want that DB?
Nonsensical questions: What color do you want that DB?

I’ve talked about NULLs before. (Ref: Here’s a bit on ISNULL(), and here’s one about understanding NULL at a basic level.)  And yet, it’s worthwhile to bring these things up from time to time.

“[Thing] != NULL” is nonsense. It won’t work the way you want it to work, because NULL literally means “unknown”. Nothing in the world can be equal to, or not equal to, an unknown.

The TechNet article on IS [NOT] NULL says:

To determine whether an expression is NULL, use IS NULL or IS NOT NULL instead of comparison operators (such as = or !=). Comparison operators return UNKNOWN when either or both arguments are NULL.

Example time – what does != NULL do?

Let’s walk through an example.

This will create a temporary table and add three rows, one of which has a NULL value in the column “mychar”:

CREATE TABLE #tmp (id INT IDENTITY(1,1), mychar CHAR(1) NULL);
INSERT INTO #tmp (mychar) VALUES ('a'), ('b'), (NULL);
SELECT id, mychar FROM #tmp;

So, the data in our table is:

id mychar
1 a
2 b
3 NULL

Now it’s very easy to demonstrate how the IS NULL/IS NOT NULL predicates will compare against the =/!= comparison operators.

IS NULL vs =

First, let’s take a look at IS NULL versus = NULL.

select id, mychar FROM #tmp WHERE mychar IS NULL;
select id, mychar FROM #tmp WHERE mychar = NULL;

The IS NULL query returns one row, as it should:

id mychar
3 NULL

But the “= NULL” query returns no rows. Why? Because SQL has NO IDEA what you’re talking about. “Does mychar equal an unknown? Um. I don’t know? Whatever, here’s your empty result set.

IS NOT NULL vs !=

And now, IS NOT NULL versus != NULL.

select id, mychar FROM #tmp WHERE mychar IS NOT NULL;
select id, mychar FROM #tmp WHERE mychar != NULL;

The IS NULL query returns two rows, as it should:

id mychar
1 a
2 b

But again, the query with the comparison operator “!= NULL” returns no rows.  “Does mychar not equal an unknown?” is, as far as SQL Server is concerned, not a valid question. It’s exactly like asking a coworker, “What sized year did you eat for lunch?” or “What color do you want that database?

A little discussion…

As to whether it’s faster, as someone asked, to use = NULL and != NULL…I haven’t run the benchmarks, as it’s nonsense. But I’d assume these would be faster than IS NULL/IS NOT NULL, since SQL doesn’t bother trying to work out “is this equal to unknown”. You’re just not going to get any rows back.

If you’d like to speed up performance in this way, it’d be much simpler and clearer to use WHERE 1=2; or to not write the query at all.

Happy Days,
Jen
http://www.MidnightDBA.com/Jen

Have you checked out Minion Reindex yet? It’s our free, easy, comprehensive index maintenance solution. You’ll love it, seriously.

15 thoughts on ““Thing IS NOT NULL” isn’t the same as “Thing != NULL”

  1. Pingback: “Thing IS NOT NULL” isn’t the same as “Thing != NULL” - SQL Server - SQL Server - Toad World

  2. Nic

    Well unless you are one of those kinds of people that turns ANSI NULLS off for queries so that you can do =/!= NULL.

    So glad that they are removing that vile piece of code in a future release.

  3. Pingback: Dew Drop – March 17, 2015 (#1976) | Morning Dew

  4. Luc Van der Veken

    Just being pedantic, but C-style syntax ‘!=’ is not ISO/ANSI SQL standard, the correct syntax is ”.
    I know ‘!=’ has leaked through into just about every SQL dialect, BUT with two notable exceptions: MS Access and IBM DB2.
    And of course Oracle had to go their own way and allow ‘^=’ as another alternative, so you can say ‘not equal’ in THREE different ways there 🙂

  5. Hugo Kornelis

    Hi Jen!

    Good article!
    Just one small correction: NULL does not ‘literally means “unknown”’. NULL literally means “missing”. The definition in the ANSI standard defines “null value” as “special value that is used to indicate the absence of any data value”.

    The effect on the database when doing comparisons is similar. If I don’t tell you my age, you will not be able to answer the question “is Hugo 25” (okay, you have seen me so you would, but you know what I mean). But it is still an important distinction. You should never assume that any NULL you see in a database means “unknown”. The NULL in the gender column for my customer Acme Corporation means that gender is not applicable for a company; the NULL in the enddate column of the price list means that the price is currently valid; neither means that we have gaps in our information.

    For a more detailed explanation, see these blog posts:
    http://sqlblog.com/blogs/hugo_kornelis/archive/2007/07/06/null-ndash-the-database-rsquo-s-black-hole.aspx
    http://sqlblog.com/blogs/hugo_kornelis/archive/2007/07/17/the-logic-of-three-valued-logic.aspx
    http://sqlblog.com/blogs/hugo_kornelis/archive/2007/09/22/dr-unknown-or-how-i-learned-to-stop-worrying-and-love-the-null.aspx
    http://sqlblog.com/blogs/hugo_kornelis/archive/2007/09/30/what-if-null-if-null-is-null-null-null-is-null.aspx

    Cheers,
    Hugo

    1. Michael Gorman

      Hugo:

      As I am sure you know, the ANSI SQL standard comes in a number of varieties: 86, 89, 92, 99, 2003, 2008, and 2011.

      Fundamentally, however, saying that Gender is not applicable to a Corporation is not at all addressed by the Null or NotNull “unknown” value. You would first have to know (sorry about that) that Gender is not applicable to a Corporation to then know what is the meaning of a Corporation’s Gender = “NULL” value.

      Regards,
      Mike Gorman, Secretary since 1978
      ANSI INCITS Technical Committee on Database Languages, DM32.2 .

    2. Peter Wone

      Neither of you has it quite right. I wrote an article on this about fifteen years ago but since I’m not Chris Date no-one listened. Nulls can have type. Zero is a null typed as a number. There are lots of things that aren’t on my desk right now. Elephants, bananas, users, pencils… there are zero of all those things. You can’t count a thing that isn’t there… until you type the null as a zero, and then whole horizons of math open up, even though it’s not quite a number (try dividing by it). There are five kinds of null. I have TLAs for four, and the last already has a special symbol: UNK, TBA, DNC, N/A and 0. That’s unknown, to-be-advised, do-not-care, not-applicable and zero. If systems supported these values things would be a lot more deterministic. The difference between UNK and TBA is profound: with the latter you record the fact that while you don’t know, you do know that you will know. With DNC you know that the user chose not to record a value, with N/A that in the user’s opinion the attribute doesn’t exist for the thing being described. UNK is most like nulls as we know and loathe them, except that by implication the datum is genuinely unavailable, not omitted by user decision, just as zero pencils leaves you certain that there aren’t any pencils on my desk, instead of wondering whether I just couldn’t be bothered supplying a value.

    3. Jen McCown Post author

      Hi Hugo,

      Thanks for the input. We can agree that NULL doesn’t universally signify “unknown”, though I have found it “close enough for government work” in most conversations and applications. SQL Server itself, in fact (and that is the context that we’re currently in) defins a NULL value in part thusly: “A value of NULL indicates that the value is unknown.”

      https://technet.microsoft.com/en-us/library/ms191504(v=sql.105).aspx

      So yes, the literal, absolute, and standard definition is not strictly that NULL is “unknown”, and the distinction is worth making. But I stand by my post; though a missing value may be missing for a number of reasons, it is absolutely unknown to the database.

  6. Michael Gorman

    For sure NULL is a “nasty” concept. I have a large database that stores the specifications of IT work products. One of the features of the Metabase System is “Reverse Engineering.” That enables me to import a “Physical Database” (via SQL DDL shredding into rows of tables BEFORE knowing anything about the “Logical Database” that would be its semantic parent. Thus, all the “parent” references would have to be “NULL” but that would of course foul up any Referential Integrity.

    In addition to this as a problem, different SQL engines have different approaches to NULL. Bad, I know, but life’s like that.

    Here was my solution. Since every SQL engine “knows” about the value 1 and the value 2, the Metabase System has engineered the pre-populating of its 275 tables with one or two default rows. For row with PkeyId = 1 all the “string” column values have the value, “Unknown.” If I need a 2nd rows for some tables (bills of materials Assembly tables), the second row’s PkeyId = 2 and the “string” values are “Unknown2.”

    While this would cause some NULL persons to roll over in their graves (hint, Ted Codd), it’s a solution to a NonNull problem.

  7. D

    “[Thing] != NULL” is nonsense. It won’t work the way you want it to work, because NULL literally means “unknown”.

    Thinking of NULL as “unknown” is nonsense. If I have a date field that allows NULL and I see records with NULL in that field, I don’t ask myself what date may be in those fields because, you know, it’s unknown.

    I know full well there are no dates in the fields at all. So a better way to think of NULL may be “No Data Present”.

    1. Jen McCown Post author

      No, it’s not nonsense at all. It’s a value that is theoretically possible; but it is unknown to the database. Further, it’s how Microsoft SQL Server itself defines “NULL values”; not that this should be taken as the universal standard and truth, but since our discussion (as this is my blog) is more or less limited to T-SQL, then it’s worth some weight.

      And finally: It’s a useful way of talking about a concept for people on the beginning edge of understanding. I’d be delighted to sit by and listen to the standards experts debate, but this particular post is targeted at people who are new to the concept; I still stand by “unknown” as a perfectly reasonable synonym for “missing value” at this level of conversation.

  8. Pingback: Tricky TSQL: NOT IN (NULL) | The MidnightDBA Star-Times

Comments are closed.