Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .." - Mailing list pgsql-sql

From Richard Huxton
Subject Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."
Date
Msg-id 200305161829.09631.dev@archonet.com
Whole thread Raw
In response to Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."  (Mike Winter <mike.winter@frontlogic.com>)
Responses Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR  (Mike Winter <mike.winter@frontlogic.com>)
List pgsql-sql
On Thursday 15 May 2003 10:44 pm, Mike Winter wrote:
> Hi, when doing queries of the type:
>
> SELECT id FROM foo WHERE id IN (1, 4, 3, 2, 10, 11, 14) .., I get
> terrible performance on tables of any resonable size.  I see the
> same behaviour when doing queries of the form "SELECT id FROM
> foo WHERE id = 5 OR id = 6 OR ..."
[snip]
> It looks to me like the query parser is recursively calling
> an index scan for each row in the 'IN' clause rather than just
> doing one index scan that it seems it should be.

Hmm - not sure how you could. When it says index-scan it's actually traversing
a btree (probably), not scanning a list of indexes. The IN is basically
treated like a series of a OR b OR c, hence the similar behaviour.

> My question is, does anyone have any alternate ideas for how I
> can do a query like this and have it perform well?  The tables I
> am working with are big enough that a sequential scan is not
> helpful.  Is this a bug I am encountering or an error in my
> query?  Is this a known issue?

Known issue - the usual advice is to rewrite in the form of EXISTS, but I
can't think how to do that if you have a long list of literal values. You
could create a temp table to hold your matching values and join against it,
but I realise that's not a terribly elegant solution. Unless of course, it's
a search-engine type of situation where it makes a certain amount of sense.

> I have seen this beahaviour on 7.2.1 and 7.3.2 on both Solaris
> and Linux platforms.

Supposed to be some improvements in the forthcoming 7.4 but I don't know if
that will help your particular case.
--  Richard Huxton


pgsql-sql by date:

Previous
From: Richard Huxton
Date:
Subject: Re: Chain/Thread Problem
Next
From: Randall Lucas
Date:
Subject: Re: Efficiency Issue with "SELECT IN (..)" and "SELECT .. OR .."