Thread: why the need for is null?

why the need for is null?

From
Baldur Norddahl
Date:
Hi,

How come "X=null" is not the same as "X is null"? I got a few selects with
queries like this:

select * from foo where customer=#customer# or (#customer# is null and customer
is null)

Without the last part, it will not correctly match null customers.

PS. I am using the ibatis framework for java, so the #customer# gets translated
to ? and the whole thing made into a prepared statement.

Thanks,

Baldur

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

Re: why the need for is null?

From
Martijn van Oosterhout
Date:
Because in trivalued logic, <anything> = NULL is also NULL. You need a
special operator to check if something is NULL, hence IS NULL and IS NOT
NULL. SQL standard says so.

This is a FAQ but I don't see it mentioned there.

On Thu, Jan 01, 2004 at 10:45:35PM +0100, Baldur Norddahl wrote:
> Hi,
>
> How come "X=null" is not the same as "X is null"? I got a few selects with
> queries like this:
>
> select * from foo where customer=#customer# or (#customer# is null and customer
> is null)
>
> Without the last part, it will not correctly match null customers.
>
> PS. I am using the ibatis framework for java, so the #customer# gets translated
> to ? and the whole thing made into a prepared statement.
>
> Thanks,
>
> Baldur
>
> ----------------------------------------------------------------
> This message was sent using IMP, the Internet Messaging Program.
>
> ---------------------------(end of broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly

--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> (... have gone from d-i being barely usable even by its developers
> anywhere, to being about 20% done. Sweet. And the last 80% usually takes
> 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

Attachment

Re: why the need for is null?

From
Baldur Norddahl
Date:
Quoting Martijn van Oosterhout <kleptog@svana.org>:

> Because in trivalued logic, <anything> = NULL is also NULL. You need a
> special operator to check if something is NULL, hence IS NULL and IS NOT
> NULL. SQL standard says so.
>
> This is a FAQ but I don't see it mentioned there.

Ok, but since this can be quite annoying and unexpected, could we get an
operator that does not use tristate logic but simply compares? Maybe == which
seems to be free :-)

So X==Y is true if X and Y are equal or both are null, false othervise.

Baldur


----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

Re: why the need for is null?

From
Mark Kirkwood
Date:
The relational model was designed using a 3 valued logic - true, false,null.

All relational database implementations will inflict this on you :-)

Not everyone is convinced that the 3 valued approach was the best way.
For some entertaining comments by Chris Date and Fabian Pascal see:

http://www.dbdebunk.citymax.com/page/page/622689.htm

best wishes

Mark

Baldur Norddahl wrote:

>Hi,
>
>How come "X=null" is not the same as "X is null"? I got a few selects with
>queries like this:
>
>
>


Re: why the need for is null?

From
Martijn van Oosterhout
Date:
On Thu, Jan 01, 2004 at 11:53:29PM +0100, Baldur Norddahl wrote:
> Ok, but since this can be quite annoying and unexpected, could we get an
> operator that does not use tristate logic but simply compares? Maybe == which
> seems to be free :-)
>
> So X==Y is true if X and Y are equal or both are null, false othervise.

Annoying, not really. It's actually extremely useful. It's useful having a
value which is never equal to anything else, not even itself. If you use it
to represent "unknown" it will work for you. If you try to use it for
anything else, it will bite you.

You could create a new operator, but that means you'll have difficulty
moving it to any database that doesn't have that operator (which is most of
them).

If you want it to match perhaps you should forget NULL and use '' (zero
length string) instead.
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> (... have gone from d-i being barely usable even by its developers
> anywhere, to being about 20% done. Sweet. And the last 80% usually takes
> 20% of the time, too, right?) -- Anthony Towns, debian-devel-announce

Attachment

Re: why the need for is null?

From
Tom Lane
Date:
Martijn van Oosterhout <kleptog@svana.org> writes:
> If you want it to match perhaps you should forget NULL and use '' (zero
> length string) instead.

Yes.  The SQL semantics essentially define NULL as meaning "unknown",
which does not mean "empty" or "not applicable" or anything like that
--- it means "I am not sure what this field should contain".  The
spec's semantics work properly under that interpretation.  For other
interpretations they will confuse and distress you.

It's better to choose a specific non-null value to represent "empty",
if you want the semantics that "empty" is equal to "empty".


BTW, the actual spec text that mandates this is SQL99 Part 2 section
8.2 <comparison predicate>, general rule 1:

         1) Let XV and YV be two values represented by <value expression>s X
            and Y, respectively. The result of:

              X <comp op> Y

            is determined as follows:

            Case:

            a) If either XV or YV is the null value, then

                 X <comp op> Y

              is unknown.

            b) Otherwise, [ etc etc ]

It may be illuminating that the boolean value "unknown" is the same as
(or at least the standard does not distinguish it from) boolean NULL.
Cf. section 4.6:

         The data type boolean comprises the distinct truth values true and
         false. Unless prohibited by a NOT NULL constraint, the boolean
         data type also supports the unknown truth value as the null value.
         This specification does not make a distinction between the null
         value of the boolean data type and the unknown truth value that is
         the result of an SQL <predicate>, <search condition>, or <boolean
         value expression>; they may be used interchangeably to mean exactly
         the same thing.

            regards, tom lane

Re: why the need for is null?

From
Tom Lane
Date:
Mark Kirkwood <markir@paradise.net.nz> writes:
> The relational model was designed using a 3 valued logic - true, false,null.
> All relational database implementations will inflict this on you :-)

Not sure that it's fair to characterize this as a property of the
relational model.  It is a property of the SQL standard.  There are
many purists who say that SQL is not really relational at all (Chris
Date being one of the more prominent ones, IIRC), but in any case,
SQL drew three-valued logic from other sources than the relational
model of databases.

However, it is true that all spec-conforming implementations of SQL
will inflict this on you.  MS SQL Server, for one, has apparently been
non-compliant on this point in the past, and I'm not too sure about
Oracle.

            regards, tom lane

Re: why the need for is null?

From
Mark Kirkwood
Date:
Tom Lane wrote:

>Not sure that it's fair to characterize this as a property of the
>relational model.  It is a property of the SQL standard.
>
Yes indeed - I fell into the classic "Relational model and SQL are not
the same thing" trap !

Mark



Re: why the need for is null?

From
Baldur Norddahl
Date:
Quoting Martijn van Oosterhout <kleptog@svana.org>:

> Annoying, not really. It's actually extremely useful. It's useful having a
> value which is never equal to anything else, not even itself. If you use it
> to represent "unknown" it will work for you. If you try to use it for
> anything else, it will bite you.

I need it to represent "empty" because the field in question is a foreign key to
another table. If it represented "unknown" the foreign key should block it as
it could not possible know if that "unknown" value was valid. But I can't argue
against the SQL standard of course.

> You could create a new operator, but that means you'll have difficulty
> moving it to any database that doesn't have that operator (which is most of
> them).

Any commercial database vendor would be happy to make such a feature just for
that reason: to lock me in to their database :-). I do not try to stay database
neutral, and use lots of other features that will only work in postgresql.

> If you want it to match perhaps you should forget NULL and use '' (zero
> length string) instead.

Then I need to have a meaningless entry in the foreign table, and fill my code
with special cases that filter out that fake entry before showing the data to
the user.

Besides who said I didn't want to allow the empty string as valid data? This
would be even more an issue if the field was a nummeric, where any nummeric
value is ok. If I can not use NULL to represent "empty" or "not applicateable"
I would have to make a special field that tells me if I should ignore the
previous field or not. Does not sound reasonable when NULL works fine for just
that.

The best compromise I found so far is this "X=Y or X is null and Y is null"
construct. Just looks hard to understand and cumpersome for someone which is
not expert on this issue.

Baldur

----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

Re: why the need for is null?

From
"Chris Travers"
Date:
Hi all;

Here is a brief guide to NULL's and Referential Integrity:

NULL is a special SQL value meaning 'unknown.'  Well, it is a little more
complicated and NULL can mean "value does not exist."  Therefore X = NULL is
NULL becuase we don't know if the NULL is equal to X.  So:
NULL does not equal NULL
NULL does not equal anything else.
NULL does not equal an empty string
You get the picture.

Think of it this way:  2 non-existant or unknown values don't equal any
other existant or non-existant value, known or unknown.

Now, referential integrity is defined as follows:
For every non-NULL foreign key, there is a corresponding primary key in the
referenced table.
Note that NULL's are specifically handled in the RI definition.

When do I use NULL's?  NULL's have a few uses:
1:  To indicate that the foreign key does NOT correspond with a primary key
in the referencing table.  In this case, NULL means something like "value
does not exist" (this is not the same as an empty value).  In an employee's
table, a NULL in the manager's field would mean "Employee does not report to
any other employee as a manager."
2:  To indicate that the value probably exists, but is unknown at present
(we don't know this customer's address, so we set it to NULL).

Best Wishes,
Chris Travers


Re: why the need for is null?

From
Guy Fraser
Date:
Baldur Norddahl wrote:

> Quoting Martijn van Oosterhout <kleptog@svana.org>:
>
...

>>You could create a new operator, but that means you'll have difficulty
>>moving it to any database that doesn't have that operator (which is most of
>>them).
>>
>>
>
>Any commercial database vendor would be happy to make such a feature just for
>that reason: to lock me in to their database :-). I do not try to stay database
>neutral, and use lots of other features that will only work in postgresql.
>
>
There already is an operator, and it is the ANSI SQL operator "IS". Just
because "IS" does not use puctuation characters does not mean it is not
an operator.

>>If you want it to match perhaps you should forget NULL and use '' (zero
>>length string) instead.
>>
>>
An empty string is an empty string, and a NULL is the lack of there being a string, they are not the same. If you want
toemulate what you have proposed then use the function "coalesce". 

Example:

select coalesce(string_column,'') from some_table ;

This will return an empty string for all records that have no data in
string_column.

I have designed a number of realtime data collection programs, and when inserting only the available data into the
propercolumns of the table, I often end up will "NULL" columns because there was no data for that column. It is very
usefullto know if you had data available or if the data was 0 or an empty string. 

If you still don't understand, then use MySQL it is messed up and allows weird things like most of what you want to do.

Happy New Year




Re: why the need for is null?

From
Gaetano Mendola
Date:
Martijn van Oosterhout wrote:

> On Thu, Jan 01, 2004 at 11:53:29PM +0100, Baldur Norddahl wrote:
>
>>Ok, but since this can be quite annoying and unexpected, could we get an
>>operator that does not use tristate logic but simply compares? Maybe == which
>>seems to be free :-)
>>
>>So X==Y is true if X and Y are equal or both are null, false othervise.
>
>
> Annoying, not really. It's actually extremely useful. It's useful having a
> value which is never equal to anything else, not even itself. If you use it
> to represent "unknown" it will work for you. If you try to use it for
> anything else, it will bite you.
>
> You could create a new operator, but that means you'll have difficulty
> moving it to any database that doesn't have that operator (which is most of
> them).
>
> If you want it to match perhaps you should forget NULL and use '' (zero
> length string) instead.

Don't mentioning the fact that for Oracle a zero length string is NULL!
Isn't that funny ?

Regards
Gaetano Mendola

Re: why the need for is null?

From
Baldur Norddahl
Date:
Quoting Guy Fraser <guy@incentre.net>:

> There already is an operator, and it is the ANSI SQL operator "IS". Just
> because "IS" does not use puctuation characters does not mean it is not
> an operator.

"IS" is not an operator in postgresql, at least not in the same sense that "="
is an operator. You can not do "\do is" while "\do =" works fine.

> select coalesce(string_column,'') from some_table ;

Will postgresql still make effective use of the indexes if I use a function like
coalesce on the column before comparing it?

Even if it does, the method I already use is more effective.

> If you still don't understand, then use MySQL it is messed up and allows
> weird things like most of what you want to do.

1) I understand the issues involved perfectly. I just happens to have a table
where it would be usefull that NULL=NULL is true. It is not so, and therefore I
have to use a syntax that is hard to read and I have been made to understand
that I will have to accept that. Fine.

2) What kind of crap is that flaming me like this? Do all users that ask a
question about why postgresql or the sql standard implements a feature in a
specific way, end up being told to switch to mysql?

3) Mysql knows how to compare nulls:

mysql> select null=null;
+-----------+
| null=null |
+-----------+
|      NULL |
+-----------+
1 row in set (0.01 sec)

mysql> select null is null;
+--------------+
| null is null |
+--------------+
|            1 |
+--------------+
1 row in set (0.00 sec)

Baldur


----------------------------------------------------------------
This message was sent using IMP, the Internet Messaging Program.

Re: why the need for is null?

From
Mike Mascari
Date:
Baldur Norddahl wrote:

>Will postgresql still make effective use of the indexes if I use a function like
>coalesce on the column before comparing it?
>
>
PostgreSQL doesn't index NULLs, which may or may not be a problem for
you. Perhaps creating a functional index on the COALESCE(myfield, '')
would achieve what you want, if you are querying the data in a similar
manner.

>1) I understand the issues involved perfectly. I just happens to have a table
>where it would be usefull that NULL=NULL is true. It is not so, and therefore I
>have to use a syntax that is hard to read and I have been made to understand
>that I will have to accept that. Fine.
>
>
If you don't want to change your code, you can optionally set
TRANSFORM_NULL_EQUALS to TRUE in postgresql.conf:

http://www.postgresql.org/docs/current/static/runtime-config.html

This is a parse-time transformation, so a comparison between two
attributes whose value is NULL using the equality operator will still
yield NULL:

[test@lexus] select NULL = NULL;
 ?column?
----------
 t
(1 row)

[test@lexus] create table foo (x integer, y integer);
CREATE TABLE
[test@lexus] insert into foo values (NULL, NULL);
INSERT 164948 1
[test@lexus] select (x = y) from foo;
 ?column?
----------

(1 row)

[test@lexus] select (x = NULL) from foo;
 ?column?
----------
 t
(1 row)

I suggest it only as a temporary stop-gap until the code can be changed
into something SQL compliant.

>2) What kind of crap is that flaming me like this? Do all users that ask a
>question about why postgresql or the sql standard implements a feature in a
>specific way, end up being told to switch to mysql?
>
>
No. :-)

Mike Mascari




Re: why the need for is null?

From
"Chris Travers"
Date:
> Don't mentioning the fact that for Oracle a zero length string is NULL!
> Isn't that funny ?


All I can say is wow.

Finally a good *technical* reason not to use Oracle!

Best Wishes,
Chris Travers


Re: why the need for is null?

From
Mark Kirkwood
Date:
He was just asking.... :-)

> If you still don't understand, then use MySQL it is messed up and
> allows weird things like most of what you want to do.
>
>


Re: why the need for is null?

From
Christopher Browne
Date:
In an attempt to throw the authorities off his trail, bbn-pgsql.general@clansoft.dk (Baldur Norddahl) transmitted:
> How come "X=null" is not the same as "X is null"?

Because NULL is not really a "value" in SQL.  Nothing can ever be
equal to a NULL, and that includes another NULL.

In Some Pseudo-SQL Database Systems, NULL is treated as a sort of
"zero" value, which is contrary to the SQL standards.

I seem to recall that in Microsoft's port of Sybase SQL Server,
there's some syntactic sugar that "x = NULL" is treated as if it were
querying "x is NULL."

It would presumably be _possible_ to modify PostgreSQL's query parser
to handle "x = NULL" similarly; feel free to submit a patch to that
end, if you consider it a vital change to make.
--
output = reverse("ac.notelrac.teneerf" "@" "454aa")
http://www.ntlug.org/~cbbrowne/oses.html
Rules  of the  Evil Overlord  #212. "I  will not  send  out battalions
composed wholly of robots or  skeletons against heroes who have qualms
about killing living beings.  <http://www.eviloverlord.com/>

Re: why the need for is null?

From
"Chris Travers"
Date:
Minor correction to Christopher Browne's post:

It is currently possible to set PostgreSQL to evaluate x = NULL as x IS
NULL.

Under the current documentations section 16.4.9.2. (Platform and Client
Compatibility):

transform_null_equals (boolean)

When turned on, expressions of the form expr = NULL (or NULL = expr) are
treated as expr IS NULL, that is, they return true if expr evaluates to the
null value, and false otherwise. The correct behavior of expr = NULL is to
always return null (unknown). Therefore this option defaults to off.


This option is set in the postgresql.conf file.  Hope this helps.

Best Wishes,
Chris Travers



Re: why the need for is null?

From
Greg Stark
Date:
Mike Mascari <mascarm@mascari.com> writes:

> PostgreSQL doesn't index NULLs

Postgres does index NULLs.

However "IS NULL" can't currently use the index, which in this case is
effectively the same thing as they're not being indexed.

However there are other cases like ORDER BY where the fact that Postgres does
index NULLs is important. By comparison Oracle does not index NULLs and as a
result performs differently when doing ORDER BY without a where clause that
excludes NULLs.

--
greg

Re: why the need for is null?

From
Tom Lane
Date:
"Chris Travers" <chris@travelamericas.com> writes:
> Minor correction to Christopher Browne's post:
> It is currently possible to set PostgreSQL to evaluate x = NULL as x IS
> NULL.

Also, while I'm not totally sure about the behavior of SQL Server,
we have been told that its recent releases are spec-compliant on NULL
handling.  The fact that 'transform_null_equals' exists (and was even
the default PG behavior for awhile) arises from the fact that some older
versions of MS Access expect this behavior.  Presumably that means that
MS has at some point shipped a database that behaves that way ...

            regards, tom lane