SUMMARY: NOT IN issue - Mailing list pgsql-general

From Marc SCHAEFER
Subject SUMMARY: NOT IN issue
Date
Msg-id Pine.LNX.3.96.1010705094527.1060A-100000@defian.alphanet.ch
Whole thread Raw
List pgsql-general
The issue was that NOT IN doesn't work as intuitively expected (or at
least to me) if there are NULLs in the set, because IN returns NULL
instead of FALSE in that case.

Simplified setup to see the issue:

   CREATE TABLE personne (id SERIAL, PRIMARY KEY(id), UNIQUE(id));
   CREATE TABLE utilisateur_news (id INT4 REFERENCES personne);
   INSERT INTO personne VALUES(1);
   INSERT INTO personne VALUES(2);
   INSERT INTO personne VALUES(3);
   INSERT INTO utilisateur_news VALUES(2);
   INSERT INTO utilisateur_news VALUES(NULL);

Goal: determine what are the id in relation personne that are not
referenced by any utilisateur_news.

The wrong NOT IN variant (will fail as soon are there is any NULL in the
id field of utilisateur_news):

   SELECT id
   FROM personne p
   WHERE p.id NOT IN (SELECT DISTINCT un.id
                      FROM utilisateur_news un);

The correct EXISTS variant (which is anyway recommended because it can be
implemented as an index search, see EXPLAIN on big tables).

   SELECT p.id
   FROM personne p
   WHERE NOT EXISTS (SELECT un.id
                     FROM utilisateur_news un
                     WHERE (un.id = p.id));

The correct NOT IN version:

   SELECT id
   FROM personne p
   WHERE p.id NOT IN (SELECT DISTINCT un.id
                      FROM utilisateur_news un
                      WHERE (un.id IS NOT NULL));

Note that in the general case, a NOT NULL in the REFERENCES id of the
utilisateur_news will avoid this degenerated case. The final db I am using
has this constraints. However, during the migration from the previous db
(using utilisateur_news only) to the new, I lifted that restriction, to
allow utilisateur_news without personne linked to fill it later.  When all
old will have filled, I will add a NOT NULL constraint, and create both in
a transaction.

Thank you for help.

References:
   http://fts.postgresql.org/db/mw/msg.html?mid=122788




pgsql-general by date:

Previous
From: GH
Date:
Subject: Re: Db creation script for referenced table ......
Next
From: Nicolas Kowalski
Date:
Subject: Trigger with current user