Home > mailing lists

Re: [GENERAL] Recursive optimization of IN subqueries - Mailing list pgsql-hackers

From	Dennis Haney
Subject	Re: [GENERAL] Recursive optimization of IN subqueries
Date	January 23, 2004 14:37:37
Msg-id	401169CB.5030500@diku.dk Whole thread Raw
Responses	Re: [GENERAL] Recursive optimization of IN subqueries
List	pgsql-hackers

Tree view

Tom Lane wrote:<br /><blockquote cite="mid27924.1074880349@sss.pgh.pa.us" type="cite"><pre wrap="">Dennis Haney <a
class="moz-txt-link-rfc2396E"href="mailto:davh@diku.dk"><davh@diku.dk></a> writes: </pre><blockquote
type="cite"><prewrap="">I saw it as though convert_IN_to_join rewrote the query from   </pre></blockquote><blockquote
type="cite"><prewrap="">select a.* from tenk1 a where a.unique1 in
 
(select c.thousand from tenk1 c where c.hundred = 99);   </pre></blockquote><blockquote type="cite"><pre wrap="">to
</pre></blockquote><blockquotetype="cite"><pre wrap="">select a.* from tenk1 a, tenk1 c where a.unique1 = c.thousand
AND
 
c.hundred = 99;   </pre></blockquote><blockquote type="cite"><pre wrap="">but after looking at it, I've reached the
conclusionthat the rewrite is 
 
to this instead:   </pre></blockquote><blockquote type="cite"><pre wrap="">select a.* from tenk1 a,  (select d.thousand
fromtenk1 d where 
 
d.hundred = 99) as c where a.unique1 = c.thousand;   </pre></blockquote><pre wrap="">
Right.  We do that, and then subsequently pull_up_subqueries transforms
it to the other representation.  The reason for this two-step approach
is that the intermediate form is still a useful improvement if the
subquery can't be pulled up for some reason (e.g., it's got grouping). </pre></blockquote> With improvement I can see
thatit can be materialized and thus used as a normal table in the planner. Is there any additional reasons that I can't
see?<br/> But this limited optimization makes me wonder, why the limitation to optimizing '='?<br /> And why must the
lefthandof the sublink be a variable of the upper query?<br /><br /><br /><blockquote
cite="mid27924.1074880349@sss.pgh.pa.us"type="cite"><pre wrap=""></pre><blockquote type="cite"><pre wrap="">except the
subselectis added as a range table entry instead of a 
 
subselect in the from-list (not that I understand this particular part, 
do you mind explaining?).   </pre></blockquote><pre wrap="">
Same thing.  Every entry in the from-list will have both an RTE and an
entry in the join tree.  This representation is partly historical
(before we had outer joins, there was only the range table and no join
tree at all), but it is convenient for many purposes. </pre></blockquote><br /> Then I don't understand why it gives
twodifferent execution plans? And the Query* is totally different for the two, eg. there is no RTE for the subquery in
thefirst query:<br /><br /><pre>davh=# explain select a.* from test1 a, (select num from test1 where id = 2) as b where
a.num= b.num;                                    QUERY PLAN
 
------------------------------------------------------------------------------------Hash Join  (cost=4.83..29.94
rows=11width=8)  Hash Cond: ("outer".num = "inner".num)  ->  Seq Scan on test1 a  (cost=0.00..20.00 rows=1000
width=8) ->  Hash  (cost=4.82..4.82 rows=2 width=4)        ->  Index Scan using test1_pkey on test1
(cost=0.00..4.82rows=2 width=4)              Index Cond: (id = 2)
 
(6 rows)

davh=# explain select a.* from test1 a where a.num in (select num from test1 where id = 2);
      QUERY PLAN
 
------------------------------------------------------------------------------------Hash IN Join  (cost=4.83..28.75
rows=6width=8)  Hash Cond: ("outer".num = "inner".num)  ->  Seq Scan on test1 a  (cost=0.00..20.00 rows=1000
width=8) ->  Hash  (cost=4.82..4.82 rows=2 width=4)        ->  Index Scan using test1_pkey on test1
(cost=0.00..4.82rows=2 width=4)              Index Cond: (id = 2)
 
(6 rows)

<blockquote type="cite"><pre wrap="">PS: this is a bit off-topic for pgsql-general, please pursue it on
-hackers if you have more questions.</pre></blockquote>ok
</pre><br /><pre class="moz-signature" cols="72">-- 
Dennis
</pre>

pgsql-hackers by date:

From: Tom Lane
Date: 23 January 2004, 11:13:43
Subject: 7.5 change documentation (was Re: cache control?)

From: Neil Conway
Date: 23 January 2004, 16:21:05
Subject: Re: 7.5 change documentation

Re: [GENERAL] Recursive optimization of IN subqueries - Mailing list pgsql-hackers

Previous

Next