Thread: Planning large IN lists
When planning queries with a large IN expression in the WHERE clause, the planner transforms the IN list into a scalar array expression. In clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr by calling scalararraysel(), which in turn estimates the selectivity of *each* array element in order to determine the selectivity of the array expression as a whole.

This is quite inefficient when the IN list is large. In a test case that someone sent me privately, a simple query involving two cheap joins and a ~1800-element IN list in the WHERE clause requires about 100 ms to plan but only ~10 ms to execute -- about 85% of the total runtime is spent in scalararraysel(). (I'd include the profiling data, but KCacheGrind seems stubbornly opposed to providing a textual summary of its results...)

Clearly, the current approach is fine when the array is small -- perhaps for arrays above a certain number of elements, we could switch to randomly sampling array elements, estimating their selectivities, and then using that information to infer the estimated selectivity of the entire array expression. That seems fairly crude, though: does anyone have any better ideas?

-Neil
Neil Conway wrote:
> Clearly, the current approach is fine when the array is small -- perhaps
> for arrays above a certain number of elements, we could switch to
> randomly sampling array elements, estimating their selectivities, and
> then using that information to infer the estimated selectivity of the
> entire array expression. That seems fairly crude, though: does anyone
> have any better ideas?

Optimizer hints in SQL

/me ducks and runs for cover.

regards, Lukas
Neil Conway <neilc@samurai.com> writes:
> When planning queries with a large IN expression in the WHERE clause,
> the planner transforms the IN list into a scalar array expression. In
> clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> by calling scalararraysel(), which in turn estimates the selectivity of
> *each* array element in order to determine the selectivity of the array
> expression as a whole.

> This is quite inefficient when the IN list is large.

That's the least of the problems. We really ought to convert such cases into an IN (VALUES(...)) type of query, since often repeated indexscans aren't the best implementation.

			regards, tom lane
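For concreteness, the transformation Tom is describing amounts to rewriting roughly this (table and values are invented for illustration, not taken from the thread):

    -- Plain IN list: planned as a scalar array expression, with per-element
    -- selectivity estimation in scalararraysel().
    SELECT * FROM orders WHERE customer_id IN (101, 102, 103 /* ... many more ... */);

into this:

    -- IN (VALUES ...) form: the VALUES list becomes a separate relation that the
    -- planner joins against, so it is no longer restricted to repeated indexscans.
    SELECT * FROM orders WHERE customer_id IN (VALUES (101), (102), (103) /* ... many more ... */);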
> -----Original Message-----
> From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Tom Lane
> Sent: Thursday, May 10, 2007 11:53 AM
> To: Neil Conway
> Cc: pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Planning large IN lists
>
> Neil Conway <neilc@samurai.com> writes:
> > When planning queries with a large IN expression in the WHERE clause,
> > the planner transforms the IN list into a scalar array expression. In
> > clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> > by calling scalararraysel(), which in turn estimates the selectivity of
> > *each* array element in order to determine the selectivity of the array
> > expression as a whole.
>
> > This is quite inefficient when the IN list is large.
>
> That's the least of the problems. We really ought to convert such cases
> into an IN (VALUES(...)) type of query, since often repeated indexscans
> aren't the best implementation.

It seems to me that if you have a unique index on the IN-list column, the problem is simplified. In that case, you just have to estimate whether that many index seeks cost more than a table scan. Usually the crossover is around 5-10% of the table size for the average database; not sure how it works out in PostgreSQL.

So in the special case of an IN list on a uniquely indexed column, compare the cardinality of the table with the number of IN-list items and decide to table scan or index seek based on that.

For arbitrary queries, it seems it would be necessary to keep histograms for the columns in question. Perhaps they could be collected with an advanced ANALYZE option.
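As a rough sketch of the kind of comparison such a heuristic would make (the table name is hypothetical; reltuples is the planner's stored row-count estimate for a table):

    -- How big is the table, according to the statistics the planner already keeps?
    SELECT reltuples AS estimated_rows FROM pg_class WHERE relname = 'orders';

    -- With a unique index on the IN-list column, an ~1800-entry list means roughly
    -- 1800 index probes; the heuristic would weigh that against one sequential scan
    -- over estimated_rows rows (the 5-10% crossover mentioned above).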
Is this a TODO?

---------------------------------------------------------------------------

Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
> > When planning queries with a large IN expression in the WHERE clause,
> > the planner transforms the IN list into a scalar array expression. In
> > clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> > by calling scalararraysel(), which in turn estimates the selectivity of
> > *each* array element in order to determine the selectivity of the array
> > expression as a whole.
>
> > This is quite inefficient when the IN list is large.
>
> That's the least of the problems. We really ought to convert such cases
> into an IN (VALUES(...)) type of query, since often repeated indexscans
> aren't the best implementation.
>
> 			regards, tom lane

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://www.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +
Hi,

Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
>> When planning queries with a large IN expression in the WHERE clause,
>> the planner transforms the IN list into a scalar array expression. In
>> clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
>> by calling scalararraysel(), which in turn estimates the selectivity of
>> *each* array element in order to determine the selectivity of the array
>> expression as a whole.
>
>> This is quite inefficient when the IN list is large.
>
> That's the least of the problems. We really ought to convert such cases
> into an IN (VALUES(...)) type of query, since often repeated indexscans
> aren't the best implementation.

I thought of giving this a shot and while I was working on it, it occurred to me that we need to decide on a threshold value of the IN list size above which such transformation should take place. For small IN lists, scalararraysel() on the IN list wins over the hash join involved in IN (VALUES(...)), but for larger lists IN (VALUES(...)) comes out a clear winner.

I would like to know what the community thinks would be a good heuristic value for the IN-list size beyond which this transformation should take place. I was thinking of a GUC variable (or a hard-coded value) defaulting to, say, 30. This is based on numbers from the following test:

postgres=# create table w (w text);
CREATE TABLE

postgres=# \copy w from '/usr/share/dict/words'

And run the following query with different IN-list sizes:

explain analyze select * from w where w in ('one', 'two', ...);

I got the following runtimes:

 IN-list size | IN (VALUES(...)) |    IN
--------------+------------------+----------
          150 |         ~2000 ms | ~5500 ms
          100 |         ~1500 ms | ~4000 ms
           80 |         ~1400 ms | ~3000 ms
           50 |         ~1400 ms | ~2500 ms
           30 |         ~1500 ms | ~1500 ms
           20 |         ~1400 ms | ~1200 ms
           10 |         ~1400 ms | ~1200 ms

IN (VALUES(...)) shows almost steady-state behavior, while the plain IN runtimes deteriorate as the list grows.

There would obviously be different conditions on which to base this value. I seek community opinion on this.

--
Atul

EnterpriseDB
www.enterprisedb.com
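For reference, the IN (VALUES(...)) counterpart of that test query looks roughly like this (literal list trimmed for brevity):

    -- Same test query rewritten to the form being compared above; PostgreSQL plans
    -- the VALUES list as a separate relation and typically hash-joins against it
    -- when the list is long.
    explain analyze select * from w where w in (VALUES ('one'), ('two'), ('three') /* , ... */);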
"Atul Deopujari" <atul.deopujari@enterprisedb.com> writes: > Hi, > Tom Lane wrote: >> That's the least of the problems. We really ought to convert such cases >> into an IN (VALUES(...)) type of query, since often repeated indexscans >> aren't the best implementation. >> > I thought of giving this a shot and while I was working on it, it > occurred to me that we need to decide on a threshold value of the IN > list size above which such transformation should take place. I see no good reason to suppose that there is/should be a constant threshold --- most likely it depends on size of table, availability of indexes, etc. Having the planner try it both ways and compare costs would be best. regards, tom lane
Hi,

Tom Lane wrote:
> "Atul Deopujari" <atul.deopujari@enterprisedb.com> writes:
>> Hi,
>> Tom Lane wrote:
>>> That's the least of the problems. We really ought to convert such cases
>>> into an IN (VALUES(...)) type of query, since often repeated indexscans
>>> aren't the best implementation.
>>>
>> I thought of giving this a shot and while I was working on it, it
>> occurred to me that we need to decide on a threshold value of the IN
>> list size above which such transformation should take place.
>
> I see no good reason to suppose that there is/should be a constant
> threshold --- most likely it depends on size of table, availability of
> indexes, etc. Having the planner try it both ways and compare costs
> would be best.

Yes, letting the planner make its own decision would seem best (in accordance with what we do for different join paths). But for large IN lists, a substantial part of the planning time is spent estimating the selectivity of the ScalarArrayExpr by calling scalararraysel(). If we don't eliminate that step when processing the IN list, we haven't really optimized anything; asking the planner to run scalararraysel() and also cost the other approach, then choose between the two, is asking the planner to do more work.

Factors such as table size, availability of indexes, etc. would affect both approaches similarly. So if we see a gain in the execution of the IN list due to an external factor, we will also see a similar gain in the execution of the transformed IN (VALUES(...)) clause.

I agree that one value would not fit all cases. The problem with this approach is that in some cases a large IN list would perform better than the transformed IN (VALUES(...)) clause. But we know that the transformed IN (VALUES(...)) clause has almost steady-state behavior, so it would not throw the planner's estimates far off; the error would be only marginal.

--
Atul

EnterpriseDB
www.enterprisedb.com
"Atul Deopujari" <atuld@enterprisedb.com> writes: > Yes, letting the planner make its own decision would seem best (in > accordance with what we do for different join paths). But for large IN > lists, a substantial part of the planner is spent in estimating the > selectivity of the ScalarArrayExpr by calling scalararraysel. If we are > not eliminating this step in processing the IN list then we are not > doing any optimization. Asking the planner to do scalararraysel and also > compute cost of any other way and choose between the two is asking > planner to do more work. So? Better planning usually involves more work. In any case the above argument seems irrelevant, because making scalararraysel more approximate and less expensive for long lists could be done independently of anything else. > Factors such as size of table, availability of index etc. would affect > both the ways similarly. So, if we see a gain in the execution of the IN > list due to an external factor then we will also see a similar gain in > the execution of the transformed IN (VALUES(...)) clause. Incorrect. There is more than one way to do a join, and the above argument only applies if the VALUES case is planned as a nestloop with inner indexscan, which indeed is isomorphic to the scalararrayop implementation ... except that it has higher per-tuple overhead, and therefore will consistently lose, disregarding artifacts of planning costs such as how hard we try to estimate the result size. The case where VALUES is actually a better plan is where the planner switches to merge or hash join because there are too many values. In the current implementation, the planner is incapable of generating those plan shapes from a scalararrayop, and that's what I'd like to see fixed. regards, tom lane
Tom Lane wrote:
> "Atul Deopujari" <atuld@enterprisedb.com> writes:
>> Yes, letting the planner make its own decision would seem best (in
>> accordance with what we do for different join paths). But for large IN
>> lists, a substantial part of the planning time is spent estimating the
>> selectivity of the ScalarArrayExpr by calling scalararraysel(). If we
>> don't eliminate that step when processing the IN list, we haven't really
>> optimized anything; asking the planner to run scalararraysel() and also
>> cost the other approach, then choose between the two, is asking the
>> planner to do more work.
>
> So? Better planning usually involves more work. In any case the above
> argument seems irrelevant, because making scalararraysel more
> approximate and less expensive for long lists could be done
> independently of anything else.

Got you, and I agree with you.

--
Atul

EnterpriseDB
www.enterprisedb.com
Added to TODO:

* Consider using a hash for joining to a large IN (VALUES ...) list

  http://archives.postgresql.org/pgsql-hackers/2007-05/msg00450.php

---------------------------------------------------------------------------

Atul Deopujari wrote:
> Hi,
>
> Tom Lane wrote:
> > Neil Conway <neilc@samurai.com> writes:
> >> When planning queries with a large IN expression in the WHERE clause,
> >> the planner transforms the IN list into a scalar array expression. In
> >> clause_selectivity(), we estimate the selectivity of the ScalarArrayExpr
> >> by calling scalararraysel(), which in turn estimates the selectivity of
> >> *each* array element in order to determine the selectivity of the array
> >> expression as a whole.
> >
> >> This is quite inefficient when the IN list is large.
> >
> > That's the least of the problems. We really ought to convert such cases
> > into an IN (VALUES(...)) type of query, since often repeated indexscans
> > aren't the best implementation.
>
> I thought of giving this a shot and while I was working on it, it
> occurred to me that we need to decide on a threshold value of the IN
> list size above which such transformation should take place. For small
> IN lists, scalararraysel() on the IN list wins over the hash join
> involved in IN (VALUES(...)), but for larger lists IN (VALUES(...))
> comes out a clear winner.
>
> I would like to know what the community thinks would be a good heuristic
> value for the IN-list size beyond which this transformation should take
> place. I was thinking of a GUC variable (or a hard-coded value)
> defaulting to, say, 30. This is based on numbers from the following test:
>
> postgres=# create table w (w text);
> CREATE TABLE
>
> postgres=# \copy w from '/usr/share/dict/words'
>
> And run the following query with different IN-list sizes:
>
> explain analyze select * from w where w in ('one', 'two', ...);
>
> I got the following runtimes:
>
>  IN-list size | IN (VALUES(...)) |    IN
> --------------+------------------+----------
>           150 |         ~2000 ms | ~5500 ms
>           100 |         ~1500 ms | ~4000 ms
>            80 |         ~1400 ms | ~3000 ms
>            50 |         ~1400 ms | ~2500 ms
>            30 |         ~1500 ms | ~1500 ms
>            20 |         ~1400 ms | ~1200 ms
>            10 |         ~1400 ms | ~1200 ms
>
> IN (VALUES(...)) shows almost steady-state behavior, while the plain IN
> runtimes deteriorate as the list grows.
>
> There would obviously be different conditions on which to base this
> value. I seek community opinion on this.
>
> --
> Atul
>
> EnterpriseDB
> www.enterprisedb.com

--
Bruce Momjian <bruce@momjian.us>  http://momjian.us
EnterpriseDB  http://postgres.enterprisedb.com

+ If your life is a hard drive, Christ can be your backup. +