Dr M J Carter (Martin.J.Carter@nottingham.ac.uk) reports a bug with a severity of 2
The lower the number the more severe it is.
Short Description
"WHERE col NOT IN" yields falsely empty result.
Long Description
Consider
SELECT * FROM bar WHERE col2 NOT IN (SELECT col1 FROM foo)
where tables foo and bar overlap but neither is a subset of
the other. Since the difference set bar - foo is nonempty, the above
should yield one or more rows. However, the presence of a null
in foo.col1 (tests 3 and 8 below) yields zero rows, even where
the difference set has rows with no null entries (see test 2).
Am I missing something obvious? eg in one of the FAQs?
Two possible user-level workarounds (using EXCEPT with IN (tests 5
and 9), and using a correlated join (tests 6 and 10)) are shown
below, which work as expected if the null is in the difference
between the tables. A null in the intersection, as in tests 9
and 10, causes its row to appear in the output; on reflection this
is be expected, and can be fixed (tests 11 and 12) once the possibility is recognised. I can't (yet) see how to show
null-entry
rows in the difference set without also showing nulls from the
intersection, but I'm an SQL newbie so this time I *know* I'm
missing something. (Disable blather mode.)
The tables need not have the same column types; that just makes testing a touch easier, by changing the columns being
examined(as
in test 4). Apologies for the extensiveness of my tests, of which
you say "don't waste your time" in your Web page: I had to convince
myself that I wasn't just being stupid.
Binaries used are from the stock Debian distributions, not recompiled
by me. The older set is on my home box (Debian 2.2, PostgreSQL-6.5.3,
built with gcc 2.95.2), and the newer one at work (Deb 2.x (testing): psql yields "PostgreSQL 7.1 on i686-pc-linux-gnu,
compiledby GCC
2.95.4"; libc6 is v2.2.3-5; processor AMD K6/2 .... anything else?).
Sample Code
Feed the following through "psql foo -f foo.sql | less":
---- snip ----
DROP TABLE foo;
CREATE TABLE "foo" (
"ipnum" inet,
"ipname" text);
DROP TABLE bar;
CREATE TABLE "bar" (
"ipnum" inet,
"ipname" text);
COPY "foo" FROM stdin;
192.168.187.10 theory
192.168.187.11 junior
192.168.40.43 granby
192.168.40.48 gotham
192.168.21.16 marian
192.168.20.6 marian
192.168.1.1 jips-gw
\.
COPY "bar" FROM stdin;
192.168.187.10 theory
192.168.187.11 junior
192.168.40.43 granby
192.168.40.48 gotham
192.168.21.16 marian
192.168.20.6 marian
192.168.3.8 real-gw
\.
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 1: Shows intersection (6 rows).
SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 2: Shows expected single row from bar.
INSERT INTO foo (ipnum) VALUES ('192.168.104.42');
-- Add row to foo with null ipname value.
SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 3: Yields empty set. Is this a bug? it's certainly unexpected; -- what I'd expected was the same one-line reply
asfor test 2.
SELECT * FROM bar WHERE ipnum NOT IN (SELECT ipnum FROM foo);
-- 4: Yields expected single row again, by examining other column.
SELECT * FROM bar EXCEPT
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 5: Yields expected row from bar.
SELECT * FROM bar WHERE NOT EXISTS (
SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 6: Inspired by answer 4.23 in general PostgreSQL FAQ: yields -- expected row, as well as being somewhat faster
forlarge tables.
INSERT INTO bar (ipnum) VALUES ('192.168.104.42');
-- Move row to intersection.
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 7: Shows non-null rows of intersection (6 rows).
SELECT * FROM bar WHERE ipname NOT IN (SELECT ipname FROM foo);
-- 8: Yields empty set.
SELECT * FROM bar EXCEPT
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 9: Yields expected row from bar, plus empty-field row from the -- intersection.
SELECT * FROM bar WHERE NOT EXISTS (
SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 10: Ditto.
SELECT * FROM bar WHERE ipname IS NOT NULL
EXCEPT
SELECT * FROM bar WHERE ipname IN (SELECT ipname FROM foo);
-- 11: Yields expected row from bar only.
SELECT * FROM bar WHERE ipname IS NOT NULL AND NOT EXISTS (
SELECT * FROM foo WHERE foo.ipname = bar.ipname);
-- 12: Ditto.
No file was uploaded with this report