Home > mailing lists

Re: Planning large IN lists - Mailing list pgsql-hackers

From	Tom Lane
Subject	Re: Planning large IN lists
Date	May 17, 2007 17:16:38
Msg-id	12592.1179432985@sss.pgh.pa.us Whole thread Raw
In response to	Re: Planning large IN lists ("Atul Deopujari" <atuld@enterprisedb.com>)
Responses	Re: Planning large IN lists
List	pgsql-hackers

Tree view

"Atul Deopujari" <atuld@enterprisedb.com> writes:
> Yes, letting the planner make its own decision would seem best (in 
> accordance with what we do for different join paths). But for large IN 
> lists, a substantial part of the planner is spent in estimating the 
> selectivity of the ScalarArrayExpr by calling scalararraysel. If we are 
> not eliminating this step in processing the IN list then we are not 
> doing any optimization. Asking the planner to do scalararraysel and also 
> compute cost of any other way and choose between the two is asking 
> planner to do more work.

So?  Better planning usually involves more work.  In any case the above
argument seems irrelevant, because making scalararraysel more
approximate and less expensive for long lists could be done
independently of anything else.

> Factors such as size of table, availability of index etc. would affect 
> both the ways similarly. So, if we see a gain in the execution of the IN 
> list due to an external factor then we will also see a similar gain in 
> the execution of the transformed IN (VALUES(...)) clause.

Incorrect.  There is more than one way to do a join, and the above
argument only applies if the VALUES case is planned as a nestloop with
inner indexscan, which indeed is isomorphic to the scalararrayop
implementation ... except that it has higher per-tuple overhead, and
therefore will consistently lose, disregarding artifacts of planning
costs such as how hard we try to estimate the result size.  The case
where VALUES is actually a better plan is where the planner switches to
merge or hash join because there are too many values.  In the current
implementation, the planner is incapable of generating those plan shapes
from a scalararrayop, and that's what I'd like to see fixed.
        regards, tom lane

pgsql-hackers by date:

From: Magnus Hagander
Date: 17 May 2007, 17:14:09
Subject: Re: mb and ecpg regression tests

From: Bruce Momjian
Date: 17 May 2007, 18:29:28
Subject: Re: 8.3 release date on web site

Re: Planning large IN lists - Mailing list pgsql-hackers

Previous

Next