Thread: (null) != (null) ?

(null) != (null) ?

From
Todd Vierling
Date:
Below are two minor bug issues which I can't find as `known' (then again, I
can't seem to find an easy-to-identify `known issues' list for that matter 8-),
in pgsql 6.5.2.

Platform:  NetBSD/i386, 1.4.1 (a.out).

=====

(1) SELECT ... FROM table1 a,table2 b WHERE a.fieldname = b.fieldname;

Both "fieldname" definitions are identical (verified with char(2) and
varchar(100) in particular), and both tables contain a row with a "null" in
that field.  However, the results don't contain the row with the "null"
value.  A quick reproduction:

=> create temp table foo (fieldname char(2));
=> create temp table foo2 (fieldname char(2));
=> insert into foo values (null);
=> insert into foo2 values (null);
=> select foo.fieldname from foo,foo2 where foo.fieldname = foo2.fieldname;

fieldname
---------
(0 rows)

In the above, only the following expression seems to DTRT:

=> select foo.fieldname from foo,foo2 where foo.fieldname = foo2.fieldname
   or (foo.fieldname = null and foo2.fieldname = null);

fieldname
---------

(1 row)

=====

(2) NOT IN doesn't seem to work at all.  I always get 0 results--and very
    rapidly at that!--regardless of the situation.

--
-- Todd Vierling (tv@pobox.com)


Re: [BUGS] (null) != (null) ?

From
Tom Lane
Date:
Todd Vierling <tv@pobox.com> writes:
> (1) SELECT ... FROM table1 a,table2 b WHERE a.fieldname = b.fieldname;

> Both "fieldname" definitions are identical (verified with char(2) and
> varchar(100) in particular), and both tables contain a row with a "null" in
> that field.  However, the results don't contain the row with the "null"
> value.

NULL = NULL does not yield TRUE, it yields NULL.  For that matter,
NULL != NULL does not yield FALSE --- it yields NULL.  This is a
basic consequence of the semantics of NULL.  The easiest way to
think about NULL that I've heard of is: "NULL means I don't know
what the value should be".  So, for example, NULL = NULL is asking
whether two things are equal when you don't know exactly what
either of them is.  The answer cannot be "yes", it cannot be "no",
it has to be "I don't know" --- ie, NULL.

Nearly all Postgres operators yield NULL if any input is NULL.
This is perfectly sensible; for example, if you don't know what
x is, you don't know what x+1 is, either.  The main exceptions
are the special operators IS NULL and IS NOT NULL.  I think we also
put in a dirty hack to treat "x = NULL" (when NULL is written as a
literal constant) as "x IS NULL", because some clueless programmer
at Microsloth made MS SQL act that way, and now people expect it.
But it's bogus by any strict interpretation :-(

The WHERE clause treats a NULL test result as false (ie, the row
doesn't get selected), which accounts for the behavior you cite.
A really hard-line view of the semantics would be that WHERE NULL
should raise an error --- after all, if you don't know the result
of the test, how can you say if the row should be in or out?  But I
guess the SQL committee felt that that would be sacrificing too much
usability in the name of logical purity.  If it worked that way you
could hardly ever write a WHERE clause without explicit tests for
NULLs.

If you really want to match up nulls in your example, you can do
something like
    WHERE (a.fieldname = b.fieldname) OR
          (a.fieldname IS NULL AND b.fieldname IS NULL)
This is pretty grotty, of course, so my inclination would be to
use a special non-NULL value --- an empty string, for example ---
for rows that you wanted to match like this.

PS: The above WHERE does succeed where both fields are NULL.
Exercise for the student: explain why.  (Hint: OR is just a
little bit special.)

> (2) NOT IN doesn't seem to work at all.  I always get 0 results--and very
>     rapidly at that!--regardless of the situation.

I don't think it's quite *that* broken.  How about a concrete
example of what you're trying to do?

            regards, tom lane

Re: (null) != (null) ?

From
Thomas Pfau
Date:
Todd Vierling wrote:
> (1) SELECT ... FROM table1 a,table2 b WHERE a.fieldname = b.fieldname;
>
> Both "fieldname" definitions are identical (verified with char(2) and
> varchar(100) in particular), and both tables contain a row with a "null" in
> that field.  However, the results don't contain the row with the "null"
> value.  A quick reproduction:

This is standard SQL behavior.  NULL != NULL.  Essentially, NULL is an
undefined value.  Since it is undefined, it can't be know to be equal to
anything, even another undefined value.

Oracle has the NVL function which can be used to replace nulls with
known values allowing your statement above to work as you expect.  I
don't think Postgres has a similar function but you could probably write
your own.

--
Thomas Pfau
pfau@maherterminals.com
aka pfau@eclipse.net
http://www.eclipse.net/~pfau/

Re: [BUGS] (null) != (null) ?

From
Todd Vierling
Date:
On Tue, 26 Oct 1999, Tom Lane wrote:

: > Both "fieldname" definitions are identical (verified with char(2) and
: > varchar(100) in particular), and both tables contain a row with a "null" in
: > that field.  However, the results don't contain the row with the "null"
: > value.
:
: NULL = NULL does not yield TRUE, it yields NULL.  For that matter,
: NULL != NULL does not yield FALSE --- it yields NULL.  This is a
: basic consequence of the semantics of NULL.

!?

I have been using such constructs on commercial databases for ages.  Do you
have a link to a web-based SQL standard transcription that I could look this
up?  (I'll check up on exactly which database(s) I can use this type of
construct when I get back to work tomorrow....)

It seems _extremely_ counter-intuitive, especially in cases where both
fields are in fact the same type.

: Nearly all Postgres operators yield NULL if any input is NULL.

Interesting ... so see my clarification of (2) below.

: If you really want to match up nulls in your example, you can do
: something like
:     WHERE (a.fieldname = b.fieldname) OR
:           (a.fieldname IS NULL AND b.fieldname IS NULL)

Which I already described in my text, sigh.

: This is pretty grotty, of course, so my inclination would be to
: use a special non-NULL value --- an empty string, for example ---

Doesn't work for datetime, which is an important application in my case
which rather needs null to indicate "no datestamp at all".

: > (2) NOT IN doesn't seem to work at all.  I always get 0 results--and very
: >     rapidly at that!--regardless of the situation.
:
: I don't think it's quite *that* broken.  How about a concrete
: example of what you're trying to do?

Well, after reading your statement about "Nearly all Postgres ...", here's a
very simple example that I was able to create based on that assumption:

=> create temp table foo (name varchar(10));
=> create temp table foo2 (name varchar(10));
=> insert into foo values (null); // <<- here's the tripwire!
=> insert into foo values ('a');
=> insert into foo2 values ('a');
=> insert into foo2 values ('b');
=> select * from foo2 where field not in (select field from foo);

field
-----
(0 rows)

Now *that* is awfully disturbing.  :>

--
-- Todd Vierling (tv@pobox.com)


Re: [BUGS] (null) != (null) ?

From
Todd Vierling
Date:
On Tue, 26 Oct 1999, Todd Vierling wrote:

: : NULL = NULL does not yield TRUE, it yields NULL.  For that matter,
: : NULL != NULL does not yield FALSE --- it yields NULL.  This is a
: : basic consequence of the semantics of NULL.

: It seems _extremely_ counter-intuitive, especially in cases where both
: fields are in fact the same type.

Although I did find a SQL92 document on the web in the amount of time this
took to copy back to me, and I see the clause about NULL <comp op>
<anything> being unknown.  Which, I imagine, means "implementation
dependent".

--
-- Todd Vierling (tv@pobox.com)


Re: [BUGS] (null) != (null) ?

From
Tom Lane
Date:
Todd Vierling <tv@pobox.com> writes:
> : NULL = NULL does not yield TRUE, it yields NULL.  For that matter,
> : NULL != NULL does not yield FALSE --- it yields NULL.  This is a
> : basic consequence of the semantics of NULL.

> !?

> I have been using such constructs on commercial databases for ages.  Do you
> have a link to a web-based SQL standard transcription that I could look this
> up?

The SQL92 standard expresses this notion in a very wordy, laborious
fashion: every single place that they define the result of an
expression, they start out by saying "if the input(s) are null the
result is null, otherwise it's ...".  Two examples:

         1) If the value of any <numeric primary> simply contained in a
            <numeric value expression> is the null value, then the result of
            the <numeric value expression> is the null value.

            .... etc etc ....

         2) If <concatenation> is specified, then let S1 and S2 be the re-
            sult of the <character value expression> and <character factor>,
            respectively.

            Case:

            a) If either S1 or S2 is the null value, then the result of the
              <concatenation> is the null value.

            .... etc etc ....

And the particular case at hand is defined in 8.1  <predicate> and
8.2  <comparison predicate>, which say

         8.1  <predicate>

         Function

         Specify a condition that can be evaluated to give a truth value of
         true, false, or unknown.

         ...

         1) Let X and Y be any two corresponding <row value constructor
            element>s. Let XV and YV be the values represented by X and Y,
            respectively.

            Case:

            a) If XV or YV is the null value, then "X <comp op> Y" is un-
              known.

Finally, WHERE is defined as selecting those rows which yield a true
predicate result.

If your other databases don't get this right, then they're broken.


> Doesn't work for datetime, which is an important application in my case
> which rather needs null to indicate "no datestamp at all".

IIRC datetime has several special values such as "infinity"; you could
use one of those, perhaps.  But NULL doesn't act the way you are looking
for.


> : > (2) NOT IN doesn't seem to work at all.  I always get 0 results--and very
> : >     rapidly at that!--regardless of the situation.
> :
> : I don't think it's quite *that* broken.  How about a concrete
> : example of what you're trying to do?

> Well, after reading your statement about "Nearly all Postgres ...", here's a
> very simple example that I was able to create based on that assumption:

> => create temp table foo (name varchar(10));
> => create temp table foo2 (name varchar(10));
> => insert into foo values (null); // <<- here's the tripwire!
> => insert into foo values ('a');
> => insert into foo2 values ('a');
> => insert into foo2 values ('b');
> => select * from foo2 where field not in (select field from foo);

> field
> -----
> (0 rows)

> Now *that* is awfully disturbing.  :>

Well, it falls out of the semantics:  the NOT IN is true if foo2's
field is not equal to *all* of the entries returned by the subselect.
If one of those is NULL, then the result of the NOT IN can't be "true",
it has to be "unknown", because you don't know whether the foo2 value
ought to be considered equal to the null or not.  The NOT IN can return
a definite "false" when it finds a match to one of the non-null
subselect values, but never a definite "true".  Of course when you are
using it as a WHERE condition you won't see the difference between
"false" and "unknown".

I do see a related bug here, though: I'd expect

select * from foo2 where (name not in (select name from foo)) is null;

to produce hits, and it doesn't.  I suspect the subselect evaluator
is being a little careless about nulls ... will look into it.

            regards, tom lane

Re: [BUGS] (null) != (null) ?

From
Todd Vierling
Date:
On Tue, 26 Oct 1999, Tom Lane wrote:

: > => select * from foo2 where field not in (select field from foo);

: Well, it falls out of the semantics:  the NOT IN is true if foo2's
: field is not equal to *all* of the entries returned by the subselect.
: If one of those is NULL, then the result of the NOT IN can't be "true",
: it has to be "unknown",

In this case, I suppose I can produce another workaround:

select * from foo where field not in
  (select field from foo2 where field notnull);

which is more wordy, but seems to work as I want.

Thanks for the help!

--
-- Todd Vierling (tv@pobox.com)


Re: [BUGS] (null) != (null) ?

From
Tom Lane
Date:
Todd Vierling <tv@pobox.com> writes:
> Although I did find a SQL92 document on the web in the amount of time this
> took to copy back to me, and I see the clause about NULL <comp op>
> <anything> being unknown.  Which, I imagine, means "implementation
> dependent".

No.  It means unknown, ie, NULL.  SQL's predicates are three-valued.

            regards, tom lane

Re: [BUGS] (null) != (null) ?

From
Bruce Momjian
Date:
> On Tue, 26 Oct 1999, Tom Lane wrote:
>
> : > Both "fieldname" definitions are identical (verified with char(2) and
> : > varchar(100) in particular), and both tables contain a row with a "null" in
> : > that field.  However, the results don't contain the row with the "null"
> : > value.
> :
> : NULL = NULL does not yield TRUE, it yields NULL.  For that matter,
> : NULL != NULL does not yield FALSE --- it yields NULL.  This is a
> : basic consequence of the semantics of NULL.
>
> !?
>
> I have been using such constructs on commercial databases for ages.  Do you
> have a link to a web-based SQL standard transcription that I could look this
> up?  (I'll check up on exactly which database(s) I can use this type of
> construct when I get back to work tomorrow....)
>
> It seems _extremely_ counter-intuitive, especially in cases where both
> fields are in fact the same type.

But NULL is unknown.  How do you know they are equal if both values are
unknown?

--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: [BUGS] (null) != (null) ?

From
Bruce Momjian
Date:
I have just added a paragraph about this comparison in my book.  You can
see it on the documentation web page under "Published book".


> On Tue, 26 Oct 1999, Tom Lane wrote:
>
> : > Both "fieldname" definitions are identical (verified with char(2) and
> : > varchar(100) in particular), and both tables contain a row with a "null" in
> : > that field.  However, the results don't contain the row with the "null"
> : > value.
> :
> : NULL = NULL does not yield TRUE, it yields NULL.  For that matter,
> : NULL != NULL does not yield FALSE --- it yields NULL.  This is a
> : basic consequence of the semantics of NULL.
>
> !?
>
> I have been using such constructs on commercial databases for ages.  Do you
> have a link to a web-based SQL standard transcription that I could look this
> up?  (I'll check up on exactly which database(s) I can use this type of
> construct when I get back to work tomorrow....)
>
> It seems _extremely_ counter-intuitive, especially in cases where both
> fields are in fact the same type.
>
> : Nearly all Postgres operators yield NULL if any input is NULL.
>
> Interesting ... so see my clarification of (2) below.
>
> : If you really want to match up nulls in your example, you can do
> : something like
> :     WHERE (a.fieldname = b.fieldname) OR
> :           (a.fieldname IS NULL AND b.fieldname IS NULL)
>
> Which I already described in my text, sigh.
>
> : This is pretty grotty, of course, so my inclination would be to
> : use a special non-NULL value --- an empty string, for example ---
>
> Doesn't work for datetime, which is an important application in my case
> which rather needs null to indicate "no datestamp at all".
>
> : > (2) NOT IN doesn't seem to work at all.  I always get 0 results--and very
> : >     rapidly at that!--regardless of the situation.
> :
> : I don't think it's quite *that* broken.  How about a concrete
> : example of what you're trying to do?
>
> Well, after reading your statement about "Nearly all Postgres ...", here's a
> very simple example that I was able to create based on that assumption:
>
> => create temp table foo (name varchar(10));
> => create temp table foo2 (name varchar(10));
> => insert into foo values (null); // <<- here's the tripwire!
> => insert into foo values ('a');
> => insert into foo2 values ('a');
> => insert into foo2 values ('b');
> => select * from foo2 where field not in (select field from foo);
>
> field
> -----
> (0 rows)
>
> Now *that* is awfully disturbing.  :>
>
> --
> -- Todd Vierling (tv@pobox.com)
>
>
> ************
>
>


--
  Bruce Momjian                        |  http://www.op.net/~candle
  maillist@candle.pha.pa.us            |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026