Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .." - Mailing list pgsql-sql

From Mike Winter
Subject Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."
Date
Msg-id Pine.LNX.4.33L2.0305151524590.6984-100000@frontlogic.com
Whole thread Raw
Responses Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."  (Richard Huxton <dev@archonet.com>)
Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."  (Randall Lucas <rlucas@tercent.net>)
Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
Hi, when doing queries of the type:

SELECT id FROM foo WHERE id IN (1, 4, 3, 2, 10, 11, 14) .., I get
terrible performance on tables of any resonable size.  I see the
same behaviour when doing queries of the form "SELECT id FROM
foo WHERE id = 5 OR id = 6 OR ..."

When doing an "EXPLAIN" on the query, I get output like the
following:

Index Scan using foo_idx, foo_idx, foo_idx, foo_idx, foo_idx,
foo_idx on foo (cost=0.00..18.16 rows=6 width=4)

If the "IN (1, 2, 3, 6, ..., n)" clause is big enough, the
database will actually throw an error saying "Recursive Depth
Exceeded" or something similar and not complete the query.

It looks to me like the query parser is recursively calling
an index scan for each row in the 'IN' clause rather than just
doing one index scan that it seems it should be.

My question is, does anyone have any alternate ideas for how I
can do a query like this and have it perform well?  The tables I
am working with are big enough that a sequential scan is not
helpful.  Is this a bug I am encountering or an error in my
query?  Is this a known issue?

I have seen this beahaviour on 7.2.1 and 7.3.2 on both Solaris
and Linux platforms.

Thanks for any input.





pgsql-sql by date:

Previous
From: Matt Mello
Date:
Subject: Re: "deadlock detected" documentation
Next
From: sammynash@uboot.com (Sammy)
Date:
Subject: Chain/Thread Problem