Thread: NULLS and <> : Discrepancies ?
Sorry for intruding, but the following question did not get much attention on the "General" list. However, I still need the answer ... </LurkingMode> <NewbieMode> Could some kind soul explain this to me ? test1=# select distinct "Cle" from "Utilisateurs"; Cle ----- 1 2 3 4 (4 rows) test1=# select distinct "CleUtil" from "Histoires"; CleUtil --------- 1 (2 rows) -- Uuhhh ! test1=# select count(*) as NbRec from "Histoires" where "CleUtil" is null; nbrec ------- 2 (1 row) -- Ah Ahh ... I have NULLs. test1=# select distinct "Cle" from "Utilisateurs" where "Cle" in test1-# (select distinct "CleUtil" from "Histoires"); Cle ----- 1 (1 row) -- That's OK ... test1=# select distinct "Cle" from "Utilisateurs" where "Cle" not in test1-# (select distinct "CleUtil" from "Histoires"); Cle ----- (0 rows) -- That's definitively *NOT* OK ! However test1=# select distinct "Cle" from "Utilisateurs" where "Cle" not in test1-# (select distinct "CleUtil" from "Histoires" where "CleUtil" is not null); Cle ----- 2 3 4 (3 rows) -- That's what I expected in the first place. Could someone explain to me why not eliminating nulls destroys the potential results of the query ? In other words, for any X not null, X not in (some NULLs) is false. </NewbieMode> <LurkingMode> Emmanuel Charpentier
> Could someone explain to me why not eliminating nulls destroys the > potential results of the query ? In other words, for any X not null, X > not in (some NULLs) is false. You already know the answer: comparisons to NULL always evaluate to false. You may conclude that this exposes a flaw in SQL9x's definition of three-value logic, but is the result you should expect for a standards-compliant SQL implementation. - Thomas (I was going to say "almost always" rather than "always" but I'm not recalling a counter example and don't have time to look it up. Sorry about that... ;)
Thomas Lockhart <lockhart@alumni.caltech.edu> writes: >> Could someone explain to me why not eliminating nulls destroys the >> potential results of the query ? In other words, for any X not null, X >> not in (some NULLs) is false. > You already know the answer: comparisons to NULL always evaluate to > false. Thomas, I'm surprised at you! Comparisons to NULL do not yield false, they yield NULL. So, given foo NOT IN (bar, NULL) we may rewrite this as NOT (foo IN (bar, NULL)) NOT (foo = bar OR foo = NULL) NOT (false OR NULL) NOT (NULL) NULL On the other hand foo NOT IN (foo, NULL) NOT (foo IN (foo, NULL)) NOT (foo = foo OR foo = NULL) NOT (true OR NULL) NOT (true) false So the correct statement of the behavior is that the result of NOT IN is always either FALSE or NULL if there are any NULLs involved. This is perfectly correct if you recall the interpretation of NULL as "don't know". The truth value of "foo = NULL" is not FALSE, it is UNKNOWN, because you don't know what the NULL is ... it could be foo. It happens that WHERE treats a NULL condition result the same as FALSE, ie don't select the row, but they are not the same thing. regards, tom lane