Re: Propagating outer join conditions - Mailing list pgsql-performance

From Jonathan Blitz
Subject Re: Propagating outer join conditions
Date
Msg-id 021f01c716ef$3f5701d0$647ba8c0@jonathanlaptop
Whole thread Raw
In response to Propagating outer join conditions  ("Aaron Birkland" <birkie@gmail.com>)
Responses Re: Propagating outer join conditions  ("Aaron Birkland" <apb18@cornell.edu>)
List pgsql-performance
How about trying:

Select *
From
(Select * from t28 where t28.0='spec')  t28a
Left out join (t1 JOIN t11 ON
> (t11.o = '<http://example.org>' AND t11.s = t1.o)) ON t28a.s = t1.s

In this way, I think, the where clause on t28 would be performed before the
join rather than after.

Jonathan Blitz


> -----Original Message-----
> From: Aaron Birkland [mailto:birkie@gmail.com]
> Sent: Sunday, December 03, 2006 5:12 PM
> To: pgsql-performance@postgresql.org
> Subject: [PERFORM] Propagating outer join conditions
>
> The following left outer join plan puzzles me:
>
> EXPLAIN ANALYZE SELECT * from t28 LEFT OUTER JOIN (t1 JOIN t11 ON
> (t11.o = '<http://example.org>' AND t11.s = t1.o)) ON t28.s = t1.s
> WHERE t28.o = '"spec"';
>
> t28, t1, and t11 all have indexed columns named 's' and 'o' that contain
'text';
>
>  Nested Loop Left Join  (cost=794249.26..3289704.61 rows=1 width=301)
> (actual time=581293.390..581293.492 rows=1 loops=1)
>    Join Filter: (t28.s = t1.s)
>    ->  Index Scan using t28_o on t28  (cost=0.00..9.22 rows=1
> width=89) (actual time=0.073..0.077 rows=1 loops=1)
>          Index Cond: (o = '"spec"'::text)
>    ->  Merge Join  (cost=794249.26..3267020.66 rows=1813979 width=212)
> (actual time=230365.522..577078.266 rows=1894969 loops=1)
>          Merge Cond: (t1.o = t11.s)
>          ->  Index Scan using t1_o on t1  (cost=0.00..2390242.10
> rows=22225696 width=109) (actual time=0.209..162586.801 rows=22223925
> loops=1)
>          ->  Sort  (cost=794249.26..798784.21 rows=1813979 width=103)
> (actual time=230365.175..237409.474 rows=1894969 loops=1)
>                Sort Key: t11.s
>                ->  Bitmap Heap Scan on t11  (cost=78450.82..605679.55
> rows=1813979 width=103) (actual time=3252.103..22782.271 rows=1894969
> loops=1)
>                      Recheck Cond: (o = '<http://example.org>'::text)
>                      ->  Bitmap Index Scan on t11_o
> (cost=0.00..78450.82 rows=1813979 width=0) (actual
> time=2445.422..2445.422 rows=1894969 loops=1)
>                            Index Cond: (o = '<http://example.org>'::text)
>
>
> It seems to me that this plan is not very desirable, since the outer
> part of the nested loop left join (the merge join node) is very
> expensive. Is is possible to generate a plan that looks like this:
>
>  Nested Loop Left Join  (cost=???)
>    ->  Index Scan using t28_o on t28  (cost=0.00..9.11 rows=1 width=89)
>          Index Cond: (o = '"spec"'::text)
>    ->  Nested Loop  (cost=???)
>          ->  Index Scan using t1_s on t1  (cost=???)
>                Index Cond: (s = t28.s)
>          ->  Bitmap Heap Scan on t11  (cost=???)
>                Recheck Cond: (t11.s = t1.o)
>                Filter: (o = '<http://example.org>'::text)
>                ->  Bitmap Index Scan on t11_s  (cost=??? )
>                      Index Cond: (t11.s = t1.o)
>
> I *think* this plan is equivalent to the above if I'm assuming the
> behaviour of the 'nested loop left join' node correctly.  So far, I
> have been tweaking the statistics, cost estimates, and
> enabling.disabling certain plans to see if I can get it to propagate
> the join condition t1.s = t28.s to the outer node of the left join..
> but so far, I cannot.  So, my questions are:
>
> 1) Is my 'desired' query plan logic correct
> 2) Can the executor execute a plan such as my 'desired' plan
> 3) If (1) and (2) are 'yes', then how may I get the planner to
> generate such a plan, or do I just need to look harder into tweaking
> the statistics and cost estimates
>
>   -Aaron
>
> ---------------------------(end of broadcast)---------------------------
> TIP 9: In versions below 8.0, the planner will ignore your desire to
>        choose an index scan if your joining column's datatypes do not
>        match
>
>
> --
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.409 / Virus Database: 268.15.6/565 - Release Date: 12/2/2006
>
>
> --
> No virus found in this incoming message.
> Checked by AVG Free Edition.
> Version: 7.1.409 / Virus Database: 268.15.6/565 - Release Date: 12/02/2006
>

--
No virus found in this outgoing message.
Checked by AVG Free Edition.
Version: 7.1.409 / Virus Database: 268.15.6/565 - Release Date: 12/02/2006



pgsql-performance by date:

Previous
From: "Aaron Birkland"
Date:
Subject: Propagating outer join conditions
Next
From: "Aaron Birkland"
Date:
Subject: Re: Propagating outer join conditions