Re: Have I b0rked something? Slow comparisons on "where x in (...)" - Mailing list pgsql-general

From Tom Lane
Subject Re: Have I b0rked something? Slow comparisons on "where x in (...)"
Date
Msg-id 21786.1178143189@sss.pgh.pa.us
Whole thread Raw
In response to Have I b0rked something? Slow comparisons on "where x in (...)"  (Stephen Harris <lists@spuddy.org>)
Responses Re: Have I b0rked something? Slow comparisons on "where x in (...)"
List pgsql-general
Stephen Harris <lists@spuddy.org> writes:
> Postgres version 8.0.9 on Solaris 2.8.  I know it's old but...
> I have a table with a million rows.

> I need to select data from this table based on an indexed column; I need
> to select 600 possible values from the column, returning around 24,000
> rows of data.

> In perl I have a hash which has 600 key values in it.

> So I did:

>   "select stuff from table where index_key in (" .
>       join(",",keys %hash) . ") AND non_index_row in ('xyz','abc','def')

> And in the perl
>   while(fetch())
>   {
>     do_stuff
>   }

> This resulted in a query string which executed in 12 minutes.  If I
> did an "explain" on the query string then I can see it was being expanded
> to 600 OR statements
>   where (index_key = 1) OR (index_key = 2) OR .....

In what, a seq scan?

That plan will require executing 600 integer comparisons at each of a
million rows, with only some trivial fraction avoided because of early
success.  So it works out that your machine is able to do something over
800K such comparisons per second, which seems a bit slow for any modern
machine ... but I note 8.0 didn't have any of the "virtual slot"
optimizations added in later releases, and is doing a fresh
heap_getattr() for each access to the variable.  If it's having to
grovel over a lot of variable-width fields to get to that field each
time, I can see where the time might get eaten up.  Where is the
index_key column in the tuples, exactly?

            regards, tom lane

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: forcing use of a specific (expression) index?
Next
From: Brent Wood
Date:
Subject: Re: Utility of OIDs in postgres