
Re: problem (bug?) with "in (subquery)" - Mailing list pgsql-sql

From	Michael Fuhr
Subject	Re: problem (bug?) with "in (subquery)"
Date	July 15, 2005 10:34:15
Msg-id	20050715133404.GA30210@winnie.fuhr.org
In response to	problem (bug?) with "in (subquery)" (Luca Pireddu <luca@cs.ualberta.ca>)
Responses	Re: problem (bug?) with "in (subquery)"
List	pgsql-sql


On Thu, Jul 14, 2005 at 01:34:21AM -0600, Luca Pireddu wrote:
> I have the following query that isn't behaving like I would expect:
> 
> select * from strains s where s.id in (select strain_id from pathway_strains);

Any reason the subquery isn't doing "SELECT DISTINCT strain_id"?
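
For reference, the DISTINCT form would look something like the sketch
below, using the table and column names from the original post.  IN
should not return a strain more than once even without DISTINCT, so
treat this as a workaround rather than an explanation.

```sql
-- Sketch only: strains and pathway_strains are the tables from the original post.
SELECT *
FROM strains s
WHERE s.id IN (SELECT DISTINCT strain_id FROM pathway_strains);
```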

> I would expect each strain record to appear only once.  Instead I get output 
> like this, where the same strain id appears many times:
> 
>   id   |     name     | organism
> -------+--------------+----------
>     83 | common       |       82 
>     83 | common       |       82 
>     83 | common       |       82 

What happens when you try each of the following?  Do they give the
expected results?  I did some tests and I'm wondering if the planner's
hash join is responsible for the duplicate rows.

SELECT * FROM strains WHERE id IN (
  SELECT strain_id FROM pathway_strains ORDER BY strain_id
);

CREATE TEMPORARY TABLE foo AS SELECT strain_id FROM pathway_strains;
SELECT * FROM strains WHERE id IN (SELECT strain_id FROM foo);

SET enable_hashjoin TO off;
SELECT * FROM strains WHERE id IN (SELECT strain_id FROM pathway_strains);
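
To check whether a hash join is actually in the plan, something like
the following could be run and the two plans compared.  This is only a
sketch; the plan output depends on your data and server version, so
none is shown here.

```sql
-- With default planner settings: look for a hash join node in the plan.
EXPLAIN ANALYZE
SELECT * FROM strains WHERE id IN (SELECT strain_id FROM pathway_strains);

-- With hash joins discouraged for this session: compare the plan and the rows returned.
SET enable_hashjoin TO off;
EXPLAIN ANALYZE
SELECT * FROM strains WHERE id IN (SELECT strain_id FROM pathway_strains);

-- Restore the default setting afterwards.
RESET enable_hashjoin;
```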

-- 
Michael Fuhr
http://www.fuhr.org/~mfuhr/
