Re: problem (bug?) with "in (subquery)" - Mailing list pgsql-sql

From Michael Fuhr
Subject Re: problem (bug?) with "in (subquery)"
Date
Msg-id 20050715133404.GA30210@winnie.fuhr.org
Whole thread Raw
In response to problem (bug?) with "in (subquery)"  (Luca Pireddu <luca@cs.ualberta.ca>)
Responses Re: problem (bug?) with "in (subquery)"
List pgsql-sql
On Thu, Jul 14, 2005 at 01:34:21AM -0600, Luca Pireddu wrote:
> I have the following query that isn't behaving like I would expect:
> 
> select * from strains s where s.id in (select strain_id from pathway_strains);

Any reason the subquery isn't doing "SELECT DISTINCT strain_id"?

> I would expect each strain record to appear only once.  Instead I get output 
> like this, where the same strain id appears many times:
> 
>   id   |     name     | organism
> -------+--------------+----------
>     83 | common       |       82 
>     83 | common       |       82 
>     83 | common       |       82 

What happens when you try each of the following?  Do they give the
expected results?  I did some tests and I'm wondering if the planner's
hash join is responsible for the duplicate rows.

SELECT * FROM strains WHERE id IN ( SELECT strain_id FROM pathway_strains ORDER BY strain_id
);

CREATE TEMPORARY TABLE foo AS SELECT strain_id FROM pathway_strains;
SELECT * FROM strains WHERE id IN (SELECT strain_id FROM foo);

SET enable_hashjoin TO off;
SELECT * FROM strains WHERE id IN (SELECT strain_id FROM pathway_strains);

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/


pgsql-sql by date:

Previous
From: Neil Dugan
Date:
Subject: Re: How to obtain the list of data table name only
Next
From: Tom Lane
Date:
Subject: Re: problem (bug?) with "in (subquery)"