"WHERE col NOT IN" yields falsely empty result. - Mailing list pgsql-bugs

From pgsql-bugs@postgresql.org
Subject "WHERE col NOT IN" yields falsely empty result.
Date
Msg-id 200106131235.f5DCZlj28167@hub.org
Whole thread Raw
Responses Re: "WHERE col NOT IN" yields falsely empty result.  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-bugs
Dr M J Carter (Martin.J.Carter@nottingham.ac.uk) reports a bug with a severity of 2
The lower the number the more severe it is.

Short Description
"WHERE col NOT IN" yields falsely empty result.

Long Description
Consider

        SELECT * FROM bar WHERE col2 NOT IN (SELECT col1 FROM foo)

where tables foo and bar overlap but neither is a subset of
the other.  Since the difference set bar - foo is nonempty, the above
should yield one or more rows.  However, the presence of a null
in foo.col1 (tests 3 and 8 below) yields zero rows, even where
the difference set has rows with no null entries (see test 2).
Am I missing something obvious? eg in one of the FAQs?

Two possible user-level workarounds (using EXCEPT with IN (tests 5
and 9), and using a correlated join (tests 6 and 10)) are shown
below, which work as expected if the null is in the difference
between the tables.  A null in the intersection, as in tests 9
and 10, causes its row to appear in the output; on reflection this
is be expected, and can be fixed (tests 11 and 12) once the possibility is recognised.  I can't (yet) see how to show
null-entry 
rows in the difference set without also showing nulls from the
intersection, but I'm an SQL newbie so this time I *know* I'm
missing something.  (Disable blather mode.)

The tables need not have the same column types; that just makes testing a touch easier, by changing the columns being
examined(as 
in test 4).  Apologies for the extensiveness of my tests, of which
you say "don't waste your time" in your Web page: I had to convince
myself that I wasn't just being stupid.

Binaries used are from the stock Debian distributions, not recompiled
by me.  The older set is on my home box (Debian 2.2, PostgreSQL-6.5.3,
built with gcc 2.95.2), and the newer one at work (Deb 2.x (testing): psql yields "PostgreSQL 7.1 on i686-pc-linux-gnu,
compiledby GCC 
2.95.4"; libc6 is v2.2.3-5; processor AMD K6/2 .... anything else?).


Sample Code
Feed the following through "psql foo -f foo.sql | less":

---- snip ----
DROP TABLE foo;
CREATE TABLE "foo" (
        "ipnum" inet,
        "ipname" text);

DROP TABLE bar;
CREATE TABLE "bar" (
        "ipnum" inet,
        "ipname" text);

COPY "foo" FROM stdin;
192.168.187.10  theory
192.168.187.11  junior
192.168.40.43   granby
192.168.40.48   gotham
192.168.21.16   marian
192.168.20.6    marian
192.168.1.1     jips-gw
\.

COPY "bar" FROM stdin;
192.168.187.10  theory
192.168.187.11  junior
192.168.40.43   granby
192.168.40.48   gotham
192.168.21.16   marian
192.168.20.6    marian
192.168.3.8     real-gw
\.

SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 1: Shows intersection (6 rows).

SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 2: Shows expected single row from bar.

INSERT INTO foo (ipnum) VALUES ('192.168.104.42');
-- Add row to foo with null ipname value.

SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 3: Yields empty set.  Is this a bug? it's certainly unexpected;   -- what I'd expected was the same one-line reply
asfor test 2. 

SELECT * FROM bar WHERE ipnum NOT IN (SELECT ipnum FROM foo);
-- 4: Yields expected single row again, by examining other column.

SELECT * FROM bar EXCEPT
 SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 5: Yields expected row from bar.

SELECT * FROM bar WHERE NOT EXISTS (
        SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 6: Inspired by answer 4.23 in general PostgreSQL FAQ: yields      -- expected row, as well as being somewhat faster
forlarge tables. 

INSERT INTO bar (ipnum) VALUES ('192.168.104.42');
-- Move row to intersection.

SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 7: Shows non-null rows of intersection (6 rows).

SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 8: Yields empty set.

SELECT * FROM bar EXCEPT
 SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 9: Yields expected row from bar, plus empty-field row from the     -- intersection.

SELECT * FROM bar WHERE NOT EXISTS (
        SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 10: Ditto.

SELECT * FROM bar WHERE ipname IS NOT NULL
 EXCEPT
 SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 11: Yields expected row from bar only.

SELECT * FROM bar WHERE ipname IS NOT NULL AND NOT EXISTS (
        SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 12: Ditto.


No file was uploaded with this report

pgsql-bugs by date:

Previous
From: Vinod Kurup
Date:
Subject: Re: coalesce in execute crashes backend
Next
From: Tom Lane
Date:
Subject: Re: coalesce in execute crashes backend