Thread: String Comparison and NULL
Hi, I'm fairly new to PG and databases in general so this may very well be a problem in my thought process. If I have a simple table with an ID (integer) and Animal (text) like this... 1 Dog 2 Cat 3 NULL 4 Horse 5 Pig 6 Cat 7 Cat ... and I do something like "select id where animal <> 'Cat';" then shouldn't 1, 3, 4 and 5 be picked? As it is I only get 1, 4 and 5. NULL is not 'Cat'. I realize that if I were testing for NULL itself I would use IS or IS NOT but this...? I'm a little confused. Thanks!
Hello 2008/4/28 <seijin@gmail.com>: > Hi, > > I'm fairly new to PG and databases in general so this may very well be > a problem in my thought process. > > If I have a simple table with an ID (integer) and Animal (text) like > this... > > 1 Dog > 2 Cat > 3 NULL > 4 Horse > 5 Pig > 6 Cat > 7 Cat > > ... and I do something like "select id where animal <> 'Cat';" then > shouldn't 1, 3, 4 and 5 be picked? As it is I only get 1, 4 and 5. > NULL is not 'Cat'. I realize that if I were testing for NULL itself I > would use IS or IS NOT but this...? I'm a little confused. > In this case use operator IS DISTINCT FROM select id where animal IS DISTINCT FROM 'Cat'; Regards Pavel Stehule > Thanks! > > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general >
Pavel Stehule <pavel.stehule@gmail.com> schrieb: > > ... and I do something like "select id where animal <> 'Cat';" then > > shouldn't 1, 3, 4 and 5 be picked? As it is I only get 1, 4 and 5. > > NULL is not 'Cat'. I realize that if I were testing for NULL itself I NULL is nothing, you can't compare something with nothing. As Pavel suggested, As Pavel suggested: > In this case use operator IS DISTINCT FROM > > select id where animal IS DISTINCT FROM 'Cat'; Andreas -- Really, I'm not out to destroy Microsoft. That will just be a completely unintentional side effect. (Linus Torvalds) "If I was god, I would recompile penguin with --enable-fly." (unknow) Kaufbach, Saxony, Germany, Europe. N 51.05082°, E 13.56889°
On Mon, 28 Apr 2008 seijin@gmail.com wrote: > I'm fairly new to PG and databases in general so this may very well be > a problem in my thought process. > > If I have a simple table with an ID (integer) and Animal (text) like > this... > > 1 Dog > 2 Cat > 3 NULL > 4 Horse > 5 Pig > 6 Cat > 7 Cat > > ... and I do something like "select id where animal <> 'Cat';" then > shouldn't 1, 3, 4 and 5 be picked? Comparisons against null with =, <> and so on return unknown not true or false and WHERE clauses only return rows where the condition is true. You might want to read up on the ternary (three valued) logic and nulls. I haven't read through it but the wikipedia page on null is pretty long. http://en.wikipedia.org/wiki/Null_%28SQL%29
Andreas Kretschmer <akretschmer@spamfence.net> writes: >>> ... and I do something like "select id where animal <> 'Cat';" then >>> shouldn't 1, 3, 4 and 5 be picked? As it is I only get 1, 4 and 5. >>> NULL is not 'Cat'. I realize that if I were testing for NULL itself I > NULL is nothing, you can't compare something with nothing. A better way to think about it is that NULL means UNKNOWN. Thus the result of NULL <> 'Cat' is not FALSE but UNKNOWN (ie NULL) --- if you don't know what the value is, you don't know whether or not it's equal to any specific other value. The other mistake novices typically make is to expect that NULL = NULL will yield TRUE. It doesn't, it yields NULL, because again you can't say whether two unknown quantities are equal. You can hack around this behavior to some extent with IS DISTINCT FROM, but generally the right thing is to redesign your data representation. Trying to make NULL act like a normal data value is almost always going to lead to tears in the long run. regards, tom lane
On Apr 29, 7:36 am, t...@sss.pgh.pa.us (Tom Lane) wrote: > Andreas Kretschmer <akretsch...@spamfence.net> writes: > >>> ... and I do something like "select id where animal <> 'Cat';" then > >>> shouldn't 1, 3, 4 and 5 be picked? As it is I only get 1, 4 and 5. > >>> NULL is not 'Cat'. I realize that if I were testing for NULL itself I > > NULL is nothing, you can't compare something with nothing. > > A better way to think about it is that NULL means UNKNOWN. Thus > the result of NULL <> 'Cat' is not FALSE but UNKNOWN (ie NULL) > --- if you don't know what the value is, you don't know whether or not > it's equal to any specific other value. > > The other mistake novices typically make is to expect that > NULL = NULL will yield TRUE. It doesn't, it yields NULL, > because again you can't say whether two unknown quantities > are equal. > > You can hack around this behavior to some extent with > IS DISTINCT FROM, but generally the right thing is to redesign > your data representation. Trying to make NULL act like a normal > data value is almost always going to lead to tears in the long run. > > regards, tom lane > > -- > Sent via pgsql-general mailing list (pgsql-gene...@postgresql.org) > To make changes to your subscription:http://www.postgresql.org/mailpref/pgsql-general This is a lot of good information and really helps. I think I'll rework my DB design and client program to try and avoid this behavior all together. I'm sure that'll be the best option in the long run. Thanks to everyone for the help!