Thread: BUG #6064: != NULL, <> NULL do not work

BUG #6064: != NULL, <> NULL do not work

From
"Michael Pilling"
Date:
The following bug has been logged online:

Bug reference:      6064
Logged by:          Michael Pilling
Email address:      Michael.Pilling@dsto.defence.gov.au
PostgreSQL version: PostgreSQL 9.0
Operating system:   Windows XP (server) Ubuntu 10.4 (Client)
Description:        != NULL, <> NULL do not work
Details:

Server version was PostgreSQL 9.0.1, compiled by Visual C++ build 1500,
32-bit running on Windows XP 32 bit.

It is arguable whether this bug is in the documentation, parser or
implementation. Personally I think it is in the implementation.

A reasonable programmer would expect != NULL, <> NULL and IS NOT NULL to be
synonyms. However IS NOT NULL works and the others don't.

At the very least the documentation for comparison operators should state
that != and <> will not work with NULL but this would be an obscure fix.
Ideally the compiler would implement != NULL and <> NULL like it implements
IS NOT NULL, failing that the parser should at least flag the combinations
with != and <> as syntax or semantic errors.

Reproducing the bug:

Execute the following code:


DROP TABLE example;

CREATE TABLE example (
    id        SERIAL PRIMARY KEY,
    name        varchar(40),
    content        varchar(40)
);

INSERT INTO example ( name, content ) VALUES ( 'Test 1', 'Test 1' );
INSERT INTO example ( name ) VALUES ( 'Test 2' );

CREATE OR REPLACE FUNCTION  show_problem() RETURNS SETOF example AS

$$
DECLARE
  result_name varchar(40);
  result_content varchar(40);
BEGIN
    SELECT example.name, example.content INTO result_name, result_content
FROM example WHERE id=1;
    IF result_content != NULL THEN
        RAISE NOTICE '!= THEN part id=1';
    ELSE
        RAISE NOTICE '!= ELSE part id=1';
    END IF;

    SELECT example.name, example.content INTO result_name, result_content
FROM example WHERE id=2;
    IF result_content != NULL THEN
        RAISE NOTICE '!= THEN part id=2';
    ELSE
        RAISE NOTICE '!= ELSE part id=2';
    END IF;

    SELECT example.name, example.content INTO result_name, result_content
FROM example WHERE id=1;
    IF result_content <> NULL THEN
        RAISE NOTICE '<> THEN part id=1';
    ELSE
        RAISE NOTICE '<> ELSE part id=1';
    END IF;

    SELECT example.name, example.content INTO result_name, result_content
FROM example WHERE id=2;
    IF result_content <> NULL THEN
        RAISE NOTICE '<> THEN part id=2';
    ELSE
        RAISE NOTICE '<> ELSE part id=2';
    END IF;
    SELECT example.name, example.content INTO result_name, result_content
FROM example WHERE id=1;
    IF result_content IS NOT NULL THEN
        RAISE NOTICE 'IS NOT THEN part id=1';
    ELSE
        RAISE NOTICE 'IS NOT ELSE part id=1';
    END IF;

    SELECT example.name, example.content INTO result_name, result_content
FROM example WHERE id=2;
    IF result_content IS NOT NULL THEN
        RAISE NOTICE 'IS NOT THEN part id=2';
    ELSE
        RAISE NOTICE 'IS NOT ELSE part id=2';
    END IF;
RETURN QUERY Select * from example;
RETURN;
END;
$$ LANGUAGE plpgsql;

select * from show_problem();

The last two NOTICEs are what I would regard to be correct. The if statement
has executed according to whether the data was NULL or not. For != and <>
the IF statements always execute the ELSE part regardless of the data
value.

Regards,
Michael

Re: BUG #6064: != NULL, <> NULL do not work

From
Abel Abraham Camarillo Ojeda
Date:
Do not write expression =3D NULL because NULL is not "equal to" NULL. (The
null value represents an unknown value, and it is not known whether two
unknown values are equal.) This behavior conforms to the SQL standard.

http://www.postgresql.org/docs/9.1/static/functions-comparison.html


On Fri, Jun 17, 2011 at 2:39 AM, Michael Pilling
<Michael.Pilling@dsto.defence.gov.au> wrote:
>
> The following bug has been logged online:
>
> Bug reference: =C2=A0 =C2=A0 =C2=A06064
> Logged by: =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Michael Pilling
> Email address: =C2=A0 =C2=A0 =C2=A0Michael.Pilling@dsto.defence.gov.au
> PostgreSQL version: PostgreSQL 9.0
> Operating system: =C2=A0 Windows XP (server) Ubuntu 10.4 (Client)
> Description: =C2=A0 =C2=A0 =C2=A0 =C2=A0!=3D NULL, <> NULL do not work
> Details:
>
> Server version was PostgreSQL 9.0.1, compiled by Visual C++ build 1500,
> 32-bit running on Windows XP 32 bit.
>
> It is arguable whether this bug is in the documentation, parser or
> implementation. Personally I think it is in the implementation.
>
> A reasonable programmer would expect !=3D NULL, <> NULL and IS NOT NULL t=
o be
> synonyms. However IS NOT NULL works and the others don't.
>
> At the very least the documentation for comparison operators should state
> that !=3D and <> will not work with NULL but this would be an obscure fix.
> Ideally the compiler would implement !=3D NULL and <> NULL like it implem=
ents
> IS NOT NULL, failing that the parser should at least flag the combinations
> with !=3D and <> as syntax or semantic errors.
>
> Reproducing the bug:
>
> Execute the following code:
>
>
> DROP TABLE example;
>
> CREATE TABLE example (
> =C2=A0 =C2=A0 =C2=A0 =C2=A0id =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =
=C2=A0SERIAL PRIMARY KEY,
> =C2=A0 =C2=A0 =C2=A0 =C2=A0name =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0=
varchar(40),
> =C2=A0 =C2=A0 =C2=A0 =C2=A0content =C2=A0 =C2=A0 =C2=A0 =C2=A0 varchar(40)
> );
>
> INSERT INTO example ( name, content ) VALUES ( 'Test 1', 'Test 1' );
> INSERT INTO example ( name ) VALUES ( 'Test 2' );
>
> CREATE OR REPLACE FUNCTION =C2=A0show_problem() RETURNS SETOF example AS
>
> $$
> DECLARE
> =C2=A0result_name varchar(40);
> =C2=A0result_content varchar(40);
> BEGIN
> =C2=A0 =C2=A0SELECT example.name, example.content INTO result_name, resul=
t_content
> FROM example WHERE id=3D1;
> =C2=A0 =C2=A0IF result_content !=3D NULL THEN
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '!=3D THEN part id=3D1';
> =C2=A0 =C2=A0ELSE
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '!=3D ELSE part id=3D1';
> =C2=A0 =C2=A0END IF;
>
> =C2=A0 =C2=A0SELECT example.name, example.content INTO result_name, resul=
t_content
> FROM example WHERE id=3D2;
> =C2=A0 =C2=A0IF result_content !=3D NULL THEN
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '!=3D THEN part id=3D2';
> =C2=A0 =C2=A0ELSE
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '!=3D ELSE part id=3D2';
> =C2=A0 =C2=A0END IF;
>
> =C2=A0 =C2=A0SELECT example.name, example.content INTO result_name, resul=
t_content
> FROM example WHERE id=3D1;
> =C2=A0 =C2=A0IF result_content <> NULL THEN
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '<> THEN part id=3D1';
> =C2=A0 =C2=A0ELSE
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '<> ELSE part id=3D1';
> =C2=A0 =C2=A0END IF;
>
> =C2=A0 =C2=A0SELECT example.name, example.content INTO result_name, resul=
t_content
> FROM example WHERE id=3D2;
> =C2=A0 =C2=A0IF result_content <> NULL THEN
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '<> THEN part id=3D2';
> =C2=A0 =C2=A0ELSE
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE '<> ELSE part id=3D2';
> =C2=A0 =C2=A0END IF;
> =C2=A0 =C2=A0SELECT example.name, example.content INTO result_name, resul=
t_content
> FROM example WHERE id=3D1;
> =C2=A0 =C2=A0IF result_content IS NOT NULL THEN
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE 'IS NOT THEN part id=3D1';
> =C2=A0 =C2=A0ELSE
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE 'IS NOT ELSE part id=3D1';
> =C2=A0 =C2=A0END IF;
>
> =C2=A0 =C2=A0SELECT example.name, example.content INTO result_name, resul=
t_content
> FROM example WHERE id=3D2;
> =C2=A0 =C2=A0IF result_content IS NOT NULL THEN
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE 'IS NOT THEN part id=3D2';
> =C2=A0 =C2=A0ELSE
> =C2=A0 =C2=A0 =C2=A0 =C2=A0RAISE NOTICE 'IS NOT ELSE part id=3D2';
> =C2=A0 =C2=A0END IF;
> RETURN QUERY Select * from example;
> RETURN;
> END;
> $$ LANGUAGE plpgsql;
>
> select * from show_problem();
>
> The last two NOTICEs are what I would regard to be correct. The if statem=
ent
> has executed according to whether the data was NULL or not. For !=3D and =
<>
> the IF statements always execute the ELSE part regardless of the data
> value.
>
> Regards,
> Michael
>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs
>

Re: BUG #6064: != NULL, <> NULL do not work

From
"Kevin Grittner"
Date:
"Michael Pilling" <Michael.Pilling@dsto.defence.gov.au> wrote:

> A reasonable programmer would expect != NULL, <> NULL and IS NOT
> NULL to be synonyms.

Only if that programmer was not aware of the SQL standard and had
not worked much with a standard-conforming database.

NULL is conceptually intended to indicate "unknown" or "not
applicable".  If you have a person table with a date_of_birth
column, which contains NULL for a number of rows for which the date
of birth is unknown, can you say that all such people have the same
date of birth?  No; for any such person, the result of comparing
their date of birth to anyone else's (whether or not the other one
is NULL) is UNKNOWN.

You might want to read up on IS [NOT] DISTINCT FROM.  In the SQL
language, while NULL is not known to be equal to NULL, you *can* say
that NULL IS NOT DISTINCT FROM NULL.

This is most definitely not a bug in the software.  The
documentation does cover it here:

http://www.postgresql.org/docs/9.0/interactive/functions-comparison.html

Is there something you would add to that?

-Kevin

Re: BUG #6064: != NULL, <> NULL do not work

From
Craig Ringer
Date:
On 06/17/2011 10:20 PM, Kevin Grittner wrote:
> "Michael Pilling"<Michael.Pilling@dsto.defence.gov.au>  wrote:
>
>> A reasonable programmer would expect != NULL,<>  NULL and IS NOT
>> NULL to be synonyms.
>
> Only if that programmer was not aware of the SQL standard and had
> not worked much with a standard-conforming database.

Yep, and if they want to continue working that way, they can use the
flag intended for compatibility with Microsoft Access that makes NULL =
NULL result in 't' instead of NULL.

http://www.postgresql.org/docs/current/static/runtime-config-compatible.html#GUC-TRANSFORM-NULL-EQUALS

Note that this flag is very specifically limited to equality
comparisions using the '=' operator. It won't make NULL behave as a
value in any other way. For example, 1 > NULL will still return NULL.

--
Craig Ringer

Re: BUG #6064: != NULL, <> NULL do not work [sec=UNCLASSIFIED]

From
"Pilling, Michael"
Date:
Thanks Craig,

The real problem here then is that the documentation showing
the boolean comparison operators does not mention this quirk, which I=20
accept may be a standard quirk but it's still a quirk. You just
wouldn't go looking for that flag unless you had any inkling that
it might exist.

And indeed the parser does not generate warnings either.

Regards,
Michael




-----Original Message-----
From: Craig Ringer [mailto:craig@postnewspapers.com.au]
Sent: Sun 6/19/2011 6:30 PM
To: Kevin Grittner
Cc: Pilling, Michael; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #6064: !=3D NULL, <> NULL do not work
=20
On 06/17/2011 10:20 PM, Kevin Grittner wrote:
> "Michael Pilling"<Michael.Pilling@dsto.defence.gov.au>  wrote:
>
>> A reasonable programmer would expect !=3D NULL,<>  NULL and IS NOT
>> NULL to be synonyms.
>
> Only if that programmer was not aware of the SQL standard and had
> not worked much with a standard-conforming database.

Yep, and if they want to continue working that way, they can use the=20
flag intended for compatibility with Microsoft Access that makes NULL =3D=
=20
NULL result in 't' instead of NULL.

http://www.postgresql.org/docs/current/static/runtime-config-compatible.htm=
l#GUC-TRANSFORM-NULL-EQUALS

Note that this flag is very specifically limited to equality=20
comparisions using the '=3D' operator. It won't make NULL behave as a=20
value in any other way. For example, 1 > NULL will still return NULL.

--
Craig Ringer


IMPORTANT: This email remains the property of the Department of Defence and=
 is subject to the jurisdiction of section 70 of the Crimes Act 1914. If yo=
u have received this email in error, you are requested to contact the sende=
r and delete the email.

Re: BUG #6064: != NULL, <> NULL do not work [sec=UNCLASSIFIED]

From
"Pilling, Michael"
Date:
Hi Kevin,

Thanks for that. Point entirely taken. I think what I would add would be in=
 the table 9-1 of operators,
an extra column filled in only for =3D, <> and !=3D saying Important: see d=
ifference from IS [NOT] NULL.
Perhaps one reason I didn't pick up on this subtle issue is that IS NULL an=
d IS NOT NULL are not listed in this
table but they are comparison operators, just textual rather than symbolic =
ones in the grammar so they should be
in the table. I recall specifically looking up what is the not equal operat=
or in this language and only=20
going forward from the table, not realising I had to read any further.

I'd also add after "Do not write expression =3D NULL because NULL is not "e=
qual to" NULL."=20
Do not write expression !=3D NULL or <> NULL because NULL is not "not equal=
 to" NULL.
because while implied, it's not obvious that because =3D doesn't work with =
NULL that !=3D doesn't either.

Reading the note after this section saying before version 8.2 postgres was =
inconsistent with the SQL standard,
I think that really strengthens the case for the parser to issue warnings w=
hen it comes across =3D,<> !=3D used with
null and the transform_null_equals (boolean) compatibility flag isn't set.

BTW while I agree with you that "Only if they hadn't read the SQL standard"=
, how many people read the standard
of any language before they start programming in it? I may have read it 20 =
years ago but haven't recently.
It's not something you can rely on.

Best regards,
Michael

-----Original Message-----
From: Kevin Grittner [mailto:Kevin.Grittner@wicourts.gov]
Sent: Fri 6/17/2011 11:50 PM
To: Pilling, Michael; pgsql-bugs@postgresql.org
Subject: Re: [BUGS] BUG #6064: !=3D NULL, <> NULL do not work
=20
"Michael Pilling" <Michael.Pilling@dsto.defence.gov.au> wrote:
=20
> A reasonable programmer would expect !=3D NULL, <> NULL and IS NOT
> NULL to be synonyms.
=20
Only if that programmer was not aware of the SQL standard and had
not worked much with a standard-conforming database.
=20
NULL is conceptually intended to indicate "unknown" or "not
applicable".  If you have a person table with a date_of_birth
column, which contains NULL for a number of rows for which the date
of birth is unknown, can you say that all such people have the same
date of birth?  No; for any such person, the result of comparing
their date of birth to anyone else's (whether or not the other one
is NULL) is UNKNOWN.
=20
You might want to read up on IS [NOT] DISTINCT FROM.  In the SQL
language, while NULL is not known to be equal to NULL, you *can* say
that NULL IS NOT DISTINCT FROM NULL.
=20
This is most definitely not a bug in the software.  The
documentation does cover it here:
=20
http://www.postgresql.org/docs/9.0/interactive/functions-comparison.html
=20
Is there something you would add to that?
=20
-Kevin


IMPORTANT: This email remains the property of the Department of Defence and=
 is subject to the jurisdiction of section 70 of the Crimes Act 1914. If yo=
u have received this email in error, you are requested to contact the sende=
r and delete the email.

Re: BUG #6064: != NULL, <> NULL do not work [sec=UNCLASSIFIED]

From
Craig Ringer
Date:
On 20/06/2011 7:43 AM, Pilling, Michael wrote:
> Thanks Craig,
>
> The real problem here then is that the documentation showing
> the boolean comparison operators does not mention this quirk, which I
> accept may be a standard quirk but it's still a quirk.

What URL are you looking at?

http://www.postgresql.org/docs/current/interactive/functions-comparison.html

certainly does. Are you looking at documentation for a really ancient
version like 6.x ?

> You just
> wouldn't go looking for that flag unless you had any inkling that
> it might exist.

You shouldn't use it, either. As documented, it's an ugly hack to work
around a deficiency in MS Access, which doesn't use SQL correctly. You
may not like how NULL comparisons work, but I *strongly* recommend that
you get used to it because trying to ignore it *will* cause you problems
down the track, and there won't be any other flags to flip to change the
behaviour to how you want it.

SQL's 3-value logic isn't always popular and isn't as logically
consistent as I'd like when you get into messy things like arrays.
Unfortunately, it's not practical to just rip it out of the system
because it's so fundamentally linked into the relational calculus, how
outer joins work, how aggregates work, etc.

See: http://en.wikipedia.org/wiki/Null_(SQL)
See: http://en.wikipedia.org/wiki/Three-valued_logic

> And indeed the parser does not generate warnings either.

Why would it?

You might argue that performing an equality comparison to a literal NULL
is probably a mistake. Unfortunately, many queries are written by query
generators that will quite happily substitute null into placeholders.
This is often correct and will return the expected result so long as you
know what it means, eg:

   WHERE a = NULL OR b = 1;

will return 't' if b is 1, and false (or null, which evaluates to false
for WHERE clauses) when b is not 1, irrespective of the value of 'a'.
This may well be the application author's intent, and it's certainly valid.

Maybe Pg should have a warning when " = NULL " is seen that can be
emitted at INFO log level, the same log level as the notices about
implicit index creation etc. I doubt you'll find anyone enthusiastic
about implementing it, though, and the added parser time cost hardly
seems worth it.

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: BUG #6064: != NULL, <> NULL do not work [sec=UNCLASSIFIED]

From
Bruce Momjian
Date:
Pilling, Michael wrote:
> Hi Kevin,
>
> Thanks for that. Point entirely taken. I think what I would add would
> be in the table 9-1 of operators, an extra column filled in only for
> =, <> and != saying Important: see difference from IS [NOT] NULL.
> Perhaps one reason I didn't pick up on this subtle issue is that IS
> NULL and IS NOT NULL are not listed in this table but they are
> comparison operators, just textual rather than symbolic ones in the
> grammar so they should be in the table. I recall specifically looking
> up what is the not equal operator in this language and only going
> forward from the table, not realising I had to read any further.
>
> I'd also add after "Do not write expression = NULL because NULL is not
> "equal to" NULL." Do not write expression != NULL or <> NULL because
> NULL is not "not equal to" NULL.  because while implied, it's not
> obvious that because = doesn't work with NULL that != doesn't either.

I have written the attached patch to mention <> NULL also returns NULL.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
new file mode 100644
index be92e6a..ddfb29a
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 364,371 ****
      </indexterm>
      Ordinary comparison operators yield null (signifying <quote>unknown</>),
      not true or false, when either input is null.  For example,
!     <literal>7 = NULL</> yields null.  When this behavior is not suitable,
!     use the
      <literal>IS <optional> NOT </> DISTINCT FROM</literal> constructs:
  <synopsis>
  <replaceable>expression</replaceable> IS DISTINCT FROM <replaceable>expression</replaceable>
--- 364,371 ----
      </indexterm>
      Ordinary comparison operators yield null (signifying <quote>unknown</>),
      not true or false, when either input is null.  For example,
!     <literal>7 = NULL</> yields null, as does <literal>7 <> NULL</>.  When
!     this behavior is not suitable, use the
      <literal>IS <optional> NOT </> DISTINCT FROM</literal> constructs:
  <synopsis>
  <replaceable>expression</replaceable> IS DISTINCT FROM <replaceable>expression</replaceable>

Re: BUG #6064: != NULL, <> NULL do not work [sec=UNCLASSIFIED]

From
Bruce Momjian
Date:
Applied.

---------------------------------------------------------------------------

Bruce Momjian wrote:
> Pilling, Michael wrote:
> > Hi Kevin,
> >
> > Thanks for that. Point entirely taken. I think what I would add would
> > be in the table 9-1 of operators, an extra column filled in only for
> > =, <> and != saying Important: see difference from IS [NOT] NULL.
> > Perhaps one reason I didn't pick up on this subtle issue is that IS
> > NULL and IS NOT NULL are not listed in this table but they are
> > comparison operators, just textual rather than symbolic ones in the
> > grammar so they should be in the table. I recall specifically looking
> > up what is the not equal operator in this language and only going
> > forward from the table, not realising I had to read any further.
> >
> > I'd also add after "Do not write expression = NULL because NULL is not
> > "equal to" NULL." Do not write expression != NULL or <> NULL because
> > NULL is not "not equal to" NULL.  because while implied, it's not
> > obvious that because = doesn't work with NULL that != doesn't either.
>
> I have written the attached patch to mention <> NULL also returns NULL.
>
> --
>   Bruce Momjian  <bruce@momjian.us>        http://momjian.us
>   EnterpriseDB                             http://enterprisedb.com
>
>   + It's impossible for everything to be true. +

[ text/x-diff is unsupported, treating like TEXT/PLAIN ]

> diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
> new file mode 100644
> index be92e6a..ddfb29a
> *** a/doc/src/sgml/func.sgml
> --- b/doc/src/sgml/func.sgml
> ***************
> *** 364,371 ****
>       </indexterm>
>       Ordinary comparison operators yield null (signifying <quote>unknown</>),
>       not true or false, when either input is null.  For example,
> !     <literal>7 = NULL</> yields null.  When this behavior is not suitable,
> !     use the
>       <literal>IS <optional> NOT </> DISTINCT FROM</literal> constructs:
>   <synopsis>
>   <replaceable>expression</replaceable> IS DISTINCT FROM <replaceable>expression</replaceable>
> --- 364,371 ----
>       </indexterm>
>       Ordinary comparison operators yield null (signifying <quote>unknown</>),
>       not true or false, when either input is null.  For example,
> !     <literal>7 = NULL</> yields null, as does <literal>7 <> NULL</>.  When
> !     this behavior is not suitable, use the
>       <literal>IS <optional> NOT </> DISTINCT FROM</literal> constructs:
>   <synopsis>
>   <replaceable>expression</replaceable> IS DISTINCT FROM <replaceable>expression</replaceable>

>
> --
> Sent via pgsql-bugs mailing list (pgsql-bugs@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-bugs

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +