BUG #12556: Clause IN and NOT IN buggy - Mailing list pgsql-bugs

From kevin.perais@trivia-marketing.com
Subject BUG #12556: Clause IN and NOT IN buggy
Date
Msg-id 20150115112720.2502.55131@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #12556: Clause IN and NOT IN buggy  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: BUG #12556: Clause IN and NOT IN buggy  (David G Johnston <david.g.johnston@gmail.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      12556
Logged by:          Kevin PERAIS
Email address:      kevin.perais@trivia-marketing.com
PostgreSQL version: 9.3.5
Operating system:   Ubuntu 12.04
Description:

Hi everybody.

I've been noticing several times that clauses IN and NOT IN are often buggy.
I'll show you 2 concrete cases to illustrate that.


---------------------------------------------
- 1 - IN and NOT IN not coherent with JOIN  -
---------------------------------------------

I have 2 tables here:

Account(id BIGINT PRIMARY KEY NOT NULL, ...)
Contact(contact_id VARCHAR(11) PRIMARY KEY NOT NULL, account_id BIGINT,
...)

I the contacts are linked to an account. However, I know I do not have all
the account, so I want to know how many contacts are linked to an account
and how many are not.

Here is a set of queries:

SELECT count(*)
FROM Account
WHERE id IS NULL;
--> 0

SELECT count(*)
FROM Contact
WHERE account_id IS NULL;
--> 0

SELECT count(*)
FROM Contact;
--> 257726

SELECT count(*)
FROM Contact
WHERE account_id IN (
  SELECT id
  FROM Account
);
--> 257726    (result very very suspect)

SELECT count(*)
FROM Contact
WHERE account_id NOT IN (
  SELECT id
  FROM Account
);
--> 0        (coherent with previous query, but result very very suspect)

SELECT count(*)
FROM Contact
JOIN Account ON Account.id = Contact.account_id;
--> 135664

SELECT count(Account.id)
FROM Contact
LEFT JOIN Account ON Account.id = Contact.account_id;
--> 135664

So I decided to take a random account_id in Contact table and run the
following queries:

SELECT count(*)
FROM Contact
WHERE account_id = 3074054072;
--> 6

SELECT count(*)
FROM Account
WHERE id = 3074054072;
--> 0

So that proves that IN and NOT IN queries give me wrong results.



-------------------------------------------------
- 2 - NOT IN clause not coherent with IN clause -
-------------------------------------------------

I have 2 tables:

Ref(num INTEGER UNIQUE, ...)
Sample(num VARCHAR(9), ...)

Here is a set of queries:

SELECT count(*)
FROM Sample;
--> 692

SELECT count(DISTINCT num)
FROM Sample;
--> 673

SELECT count(*)
FROM Sample
WHERE num IS NULL;
--> 19

--> 673 + 19 = 692
--> So we now know there are 673 distinct num and 19 NULL in the Sample
table.

SELECT count(*)
FROM Ref;
--> 8232

SELECT count(*)
FROM Ref
WHERE num IS NULL;
--> 151

SELECT count(*)
FROM Sample
WHERE num::INTEGER IN (
  SELECT num
  FROM Ref
);
--> 2

SELECT count(*)
FROM Sample
WHERE num::INTEGER NOT IN (
  SELECT num
  FROM Ref
);
--> 0      (Well that is very strange and not coherent at all with the
previous query).


I might be able to provide a data set to test case 2 (but not case 1 as data
are confidential).

Hope I gave enough info on the bug.
Kevin

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #12553: Altering search_path between function calls
Next
From: Bruce Momjian
Date:
Subject: Re: [HACKERS] BUG #12070: hstore extension: hstore_to_json_loose produces invalid JSON