Thread: Using bitmap index scans-more efficient

Using bitmap index scans-more efficient

From
Kyle Bateman
Date:
How can I use bitmap index scans more effectively? (version 8.1.0)

I have a financial ledger (actually a view, grouping several other tables) 
containing about a million records.  Each record contains an account code and 
a project code.  I can query for all the transactions belonging to any single
project code and it is very fast and efficient (milliseconds/project).

But projects are organized in a hierarchical structure, so I also need to 
query the ledger for transactions belonging to a particular project and/or all
its progeny.  Depending on the method, this is taking several seconds to
several minutes per project.

For testing purposes, I'll present results using a smaller version of the 
ledger with the following query times:


It is most efficient to enumerate the group of projects using "in" (0.144 seconds)
 select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);

---------------------------------------------------------------------------Nested Loop Left Join  (cost=19.73..4164.10
rows=7width=85)  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)        ->  Nested Loop  (cost=19.73..4100.07
rows=7width=63)              ->  Bitmap Heap Scan on apinv_items i  (cost=19.73..1185.71 rows=487 width=55)
      Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895)
OR(proj = 4933) OR (proj = 4934
 
) OR (proj = 4935))                    Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status =
'pd'::bpchar))                   ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)                          ->  Bitmap
IndexScan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj =
4737)                         ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                   Index Cond: (proj = 4789)                          ->  Bitmap Index Scan on i_apinv_items_proj
(cost=0.00..2.19rows=55 width=0)                                Index Cond: (proj = 4892)                          ->
BitmapIndex Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)                                Index Cond:
(proj= 4893)                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
                           Index Cond: (proj = 4894)                          ->  Bitmap Index Scan on
i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4895)
              ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)
   Index Cond: (proj = 4933)                          ->  Bitmap Index Scan on i_apinv_items_proj  (cost=0.00..2.19
rows=55width=0)                                Index Cond: (proj = 4934)                          ->  Bitmap Index Scan
oni_apinv_items_proj  (cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4935)
     ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..5.97 rows=1 width=21)                    Index
Cond:(("outer".vendid = h.vendid) AND (("outer".invnum)::text = (h.invnum)::text))        ->  Index Scan using
vend_org_pkeyon vend_org v  (cost=0.00..5.56 rows=1 width=26)              Index Cond: (v.org_id = "outer".vendid)  ->
SeqScan on acct a  (cost=0.00..3.54 rows=1 width=4)        Filter: ((code)::text = 'ap'::text)
 
---------------------------------------------------------------------------

Problem is, the project list has to be hard-coded into the SQL statement. 
What I really need is transactions belonging to "project 4737 and all its progeny."
So I've tried using a many-to-many table proj_prog that describes which projects 
are progeny of which other projects.  Unfortunately, the query time then goes up 
by a factor of 6 (to 0.85 seconds).

Examples: select * from ledger where proj = any (array(select prog_id from proj_prog where proj_id = 4737)); select *
fromledger where proj = any (array[4737,4789,4892,4893,4894,4895,4933,4934,4935]);"
 

---------------------------------------------------------------------------Nested Loop Left Join
(cost=13584.99..17647.39rows=850 width=85)  InitPlan    ->  Index Scan using proj_prog_pkey on proj_prog
(cost=0.00..38.04rows=21 width=4)          Index Cond: (proj_id = 4737)  ->  Merge Join  (cost=13543.42..17565.44
rows=850width=81)        Merge Cond: ("outer".vendid = "inner".org_id)        ->  Merge Join  (cost=13543.42..17405.05
rows=850width=63)              Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text =
"inner"."?column10?"))             ->  Index Scan using apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016
width=21)             ->  Sort  (cost=13543.42..13693.47 rows=60020 width=55)                    Sort Key: i.vendid,
(i.invnum)::text                   ->  Seq Scan on apinv_items i  (cost=0.00..7197.27 rows=60020 width=55)
           Filter: (((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) AND (proj = ANY
($0)))       ->  Index Scan using vend_org_pkey on vend_org v  (cost=0.00..145.52 rows=1799 width=26)  ->  Materialize
(cost=3.54..3.55rows=1 width=4)        ->  Seq Scan on acct a  (cost=0.00..3.54 rows=1 width=4)              Filter:
((code)::text= 'ap'::text)
 

---------------------------------------------------------------------------

The worst case is the following types of queries (about 5 seconds):
 select * from ledger where proj in (select prog_id from proj_prog where proj_id = 4737); select l.* from ledger l,
proj_progp where l.proj = p.prog_id and p.proj_id = 4737;
 

---------------------------------------------------------------------------Hash Join  (cost=19032.47..23510.23 rows=6
width=85) Hash Cond: ("outer".proj = "inner".prog_id)  ->  Nested Loop Left Join  (cost=18994.38..23378.41 rows=1700
width=85)       ->  Hash Join  (cost=18990.84..23340.87 rows=1700 width=81)              Hash Cond: ("outer".vendid =
"inner".org_id)             ->  Merge Join  (cost=18935.35..23255.64 rows=1700 width=63)                    Merge Cond:
(("outer".vendid= "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?"))                    ->  Index
Scanusing apinv_hdr_pkey on apinv_hdr h  (cost=0.00..3148.16 rows=51016 width=21)                    ->  Sort
(cost=18935.35..19235.45rows=120041 width=55)                          Sort Key: i.vendid, (i.invnum)::text
            ->  Seq Scan on apinv_items i  (cost=0.00..4152.99 rows=120041 width=55)
Filter:((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar))              ->  Hash
(cost=50.99..50.99rows=1799 width=26)                    ->  Seq Scan on vend_org v  (cost=0.00..50.99 rows=1799
width=26)       ->  Materialize  (cost=3.54..3.55 rows=1 width=4)              ->  Seq Scan on acct a  (cost=0.00..3.54
rows=1width=4)                    Filter: ((code)::text = 'ap'::text)  ->  Hash  (cost=38.04..38.04 rows=21 width=4)
   ->  Index Scan using proj_prog_pkey on proj_prog p  (cost=0.00..38.04 rows=21 width=4)              Index Cond:
(proj_id= 4737)
 
---------------------------------------------------------------------------

I would like to be able to get the best performance like in the first query but 
without having to enumerate the projects (i.e. using a single query).
The secret seems to be the bitmap index scans.

Any ideas about whether/how this can be done?

Thanks!

Kyle Bateman



---------------------------------------------------------------------------
BTW, The ledger view is built roughly as follows:

create view rp_v_api as   select       h.adate                                 as adate,       (i.price *
i.quant)::numeric(14,2)     as amount,       substring(v.org_name from 1 for 40)     as descr, 
 
       i.proj                  as proj,       i.acct                  as acct,       1                       as
cr_proj,      a.acct_id               as cr_acct 
 
   from (       apinv_hdr               h       join apinv_items        i       on i.vendid     = h.vendid and i.invnum
=h.invnum       join vend_org           v       on v.org_id     = h.vendid       left join acct          a       on
a.code      = 'ap'   )   where i.status in ('en','cl','pd');
 




Re: Using bitmap index scans-more efficient

From
Florian Weimer
Date:
* Kyle Bateman:

> Any ideas about whether/how this can be done?

If the project tree is fairly consistent, it's convenient to encode it
using intervals instead of parent/child intervals.  IIRC, Celko's "SQL
for smarties" explains how to do this, and Kristian Koehntopp has
written some PHP code to implement it.

-- 
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Durlacher Allee 47            tel: +49-721-96201-1
D-76131 Karlsruhe             fax: +49-721-96201-99


Re: Using bitmap index scans-more efficient

From
Kyle Bateman
Date:
Florian Weimer wrote:

>* Kyle Bateman:
>
>  
>
>>Any ideas about whether/how this can be done?
>>    
>>
>
>If the project tree is fairly consistent, it's convenient to encode it
>using intervals instead of parent/child intervals.  IIRC, Celko's "SQL
>for smarties" explains how to do this, and Kristian Koehntopp has
>written some PHP code to implement it.
>
>  
>
I agree that this produces a more efficient query for finding the 
projects that are the progeny of another project, but I'm trying to 
figure out how that helps me select the right project transactions from 
my ledger efficiently.

This query produces wonderful results (very fast):

select * from ledger where proj >= 4737 and proj <= 4740;

But I'm assuming that using an interval-encoded project tree, I would 
have to do something like the following to get a progency group:

select * from ledger l, proj p where p.proj_id = l.proj and p.left > 
1234 and p.right < 2345;

The problem (at least according to my initial testing) is that this 
forces a join of the entire ledger and I get my slowest performance 
group (5 seconds).

What am I missing?

Kyle



Re: Using bitmap index scans-more efficient

From
Tom Lane
Date:
Kyle Bateman <kyle@actarg.com> writes:
> But I'm assuming that using an interval-encoded project tree, I would 
> have to do something like the following to get a progency group:

> select * from ledger l, proj p where p.proj_id = l.proj and p.left > 
> 1234 and p.right < 2345;

btree has no idea about the constraint (that I imagine exists) that left
<= right.  If you're just doing a simple index on (left, right) then the
above query requires scanning all index entries with left > 1234.  It
would probably help to say

select * from ledger l, proj p where p.proj_id = l.proj and p.left > 1234 and p.left < 2345 and p.right < 2345;

so that you can constrain the range of "left" values scanned.
        regards, tom lane


Re: Using bitmap index scans-more efficient

From
Kyle Bateman
Date:
Tom Lane wrote:

>Kyle Bateman <kyle@actarg.com> writes:
>  
>
>>But I'm assuming that using an interval-encoded project tree, I would 
>>have to do something like the following to get a progency group:
>>    
>>
>
>  
>
>>select * from ledger l, proj p where p.proj_id = l.proj and p.left > 
>>1234 and p.right < 2345;
>>    
>>
>
>btree has no idea about the constraint (that I imagine exists) that left
><= right.  If you're just doing a simple index on (left, right) then the
>above query requires scanning all index entries with left > 1234.  It
>would probably help to say
>
>select * from ledger l, proj p where p.proj_id = l.proj and
>  p.left > 1234 and p.left < 2345 and p.right < 2345;
>
>so that you can constrain the range of "left" values scanned.
>  
>
Thanks for the replies, Tom and Florian.

My problem is not that it is difficult (or costly) to determine the 
progeny of a given project.  I can determine this in about 90 msec 
regardless of whether I use an adjacency model, interval-encoding, or 
materialized path (current implementation).  The problem is, when I try 
to extract the ledger entries belonging to that progeny from a set of a 
million records, it seems to need to process all million records rather 
than being able to index right into the ones I want.

I'm not very good at reading explain output, but I tried to set up the 
query Tom suggests by creating an interval-encoded project table 
(proj_int) and then joining it to my ledger like so:

select l.* from ledger l, proj_int i where l.proj = i.proj_id and i.lft >= 5283 and i.lft < 5300 and i.rgt <= 5300;

On my mini-test-ledger of 100,000 entries, this takes the longest time 
(5 seconds) with the following explain output:
Hash Join  (cost=19018.46..23411.52 rows=14 width=85)  Hash Cond: ("outer".proj = "inner".proj_id)  ->  Nested Loop
LeftJoin  (cost=18994.38..23378.41 rows=1700 width=85)        ->  Hash Join  (cost=18990.84..23340.87 rows=1700
width=81)             Hash Cond: ("outer".vendid = "inner".org_id)              ->  Merge Join
(cost=18935.35..23255.64rows=1700 width=63)                    Merge Cond: (("outer".vendid = "inner".vendid) AND 
 
(("outer".invnum)::text = "inner"."?column10?"))                    ->  Index Scan using apinv_hdr_pkey on apinv_hdr h

(cost=0.00..3148.16 rows=51016 width=21)                    ->  Sort  (cost=18935.35..19235.45 rows=120041 
width=55)                          Sort Key: i.vendid, (i.invnum)::text                          ->  Seq Scan on
apinv_itemsi  
 
(cost=0.00..4152.99 rows=120041 width=55)                                Filter: ((status = 'en'::bpchar) OR 
(status = 'cl'::bpchar) OR (status = 'pd'::bpchar))              ->  Hash  (cost=50.99..50.99 rows=1799 width=26)
            ->  Seq Scan on vend_org v  (cost=0.00..50.99 
 
rows=1799 width=26)        ->  Materialize  (cost=3.54..3.55 rows=1 width=4)              ->  Seq Scan on acct a
(cost=0.00..3.54rows=1 width=4)                    Filter: ((code)::text = 'ap'::text)  ->  Hash  (cost=24.06..24.06
rows=10width=4)        ->  Bitmap Heap Scan on proj_int i  (cost=2.26..24.06 rows=10 
 
width=4)              Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 
5300))              ->  Bitmap Index Scan on i_proj_int_lft_rgt  
(cost=0.00..2.26 rows=10 width=0)                    Index Cond: ((lft >= 5283) AND (lft < 5300) AND 
(rgt <= 5300))

That is roughly equivalent to my materialized path method:

select l.* from ledger l where l.proj    in (select proj_id from proj_v 
where 4737 = any(ppath));


And is quite slow compared to 150 msec when enumerating the progeny 
projects like so:

select l.* from ledger l where l.proj in 
(4737,4789,4892,4893,4894,4895,4933,4934,4935);
Nested Loop Left Join  (cost=19.73..4164.10 rows=7 width=85)  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)
   ->  Nested Loop  (cost=19.73..4100.07 rows=7 width=63)              ->  Bitmap Heap Scan on apinv_items i  
 
(cost=19.73..1185.71 rows=487 width=55)                    Recheck Cond: ((proj = 4737) OR (proj = 4789) OR 
(proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR 
(proj = 4933) OR (proj = 4934) OR (proj = 4935))                    Filter: ((status = 'en'::bpchar) OR (status = 
'cl'::bpchar) OR (status = 'pd'::bpchar))                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)
                ->  Bitmap Index Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4737)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4789)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4892)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4893)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4894)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4895)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4933)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4934)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4935)              ->  Index Scan
usingapinv_hdr_pkey on apinv_hdr h  
 
(cost=0.00..5.97 rows=1 width=21)                    Index Cond: (("outer".vendid = h.vendid) AND 
(("outer".invnum)::text = (h.invnum)::text))        ->  Index Scan using vend_org_pkey on vend_org v  
(cost=0.00..5.56 rows=1 width=26)              Index Cond: (v.org_id = "outer".vendid) Nested Loop Left 
Join  (cost=19.73..4164.10 rows=7 width=85)  ->  Nested Loop  (cost=19.73..4139.08 rows=7 width=81)        ->  Nested
Loop (cost=19.73..4100.07 rows=7 width=63)              ->  Bitmap Heap Scan on apinv_items i  
 
(cost=19.73..1185.71 rows=487 width=55)                    Recheck Cond: ((proj = 4737) OR (proj = 4789) OR 
(proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR 
(proj = 4933) OR (proj = 4934) OR (proj = 4935))                    Filter: ((status = 'en'::bpchar) OR (status = 
'cl'::bpchar) OR (status = 'pd'::bpchar))                    ->  BitmapOr  (cost=19.73..19.73 rows=495 width=0)
                ->  Bitmap Index Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4737)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4789)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4892)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4893)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4894)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4895)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4933)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4934)                          ->
BitmapIndex Scan on i_apinv_items_proj  
 
(cost=0.00..2.19 rows=55 width=0)                                Index Cond: (proj = 4935)              ->  Index Scan
usingapinv_hdr_pkey on apinv_hdr h  
 
(cost=0.00..5.97 rows=1 width=21)                    Index Cond: (("outer".vendid = h.vendid) AND 
(("outer".invnum)::text = (h.invnum)::text))        ->  Index Scan using vend_org_pkey on vend_org v  
(cost=0.00..5.56 rows=1 width=26)              Index Cond: (v.org_id = "outer".vendid)  ->  Seq Scan on acct a
(cost=0.00..3.54rows=1 width=4)        Filter: ((code)::text = 'ap'::text)  ->  Seq Scan on acct a  (cost=0.00..3.54
rows=1width=4)        Filter: ((code)::text = 'ap'::text)
 

I'm still stumped...



Re: Using bitmap index scans-more efficient

From
Kyle Bateman
Date:
Tom Lane wrote:

>Kyle Bateman <kyle@actarg.com> writes:
>  
>
>>But I'm assuming that using an interval-encoded project tree, I would 
>>have to do something like the following to get a progency group:
>>    
>>
>>select * from ledger l, proj p where p.proj_id = l.proj and p.left > 
>>1234 and p.right < 2345;
>>    
>>

Here's an interesting result:

I created a function proj_left(int4) that returns the left interval 
number for a given project.  Then I created an index on the underlying 
table for the ledger  view(which took forever to build) like so:

create index i_test on apinv_items (proj_left(proj));

Now my query:

select * from ledger where proj_left(dr_proj) >= 5283 and 
proj_left(dr_proj) < 5300;

is very speedy.  Problem is, I had to mark the function proj_left() as 
immutable, which it can not be since the left and right values for a 
given project will change any time a project is added, removed, or moved 
around the hierarchy :(

So is there any good way to tell the planner to do several individual 
index scans for the projects involved in the desired progeny, or the 
results together and return the result?  This is what it seems to be 
choosing in the case of the query:
 select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);





Re: Using bitmap index scans-more efficient

From
Tom Lane
Date:
Kyle Bateman <kyle@actarg.com> writes:
> I'm wondering if this might expose a weakness in the optimizer having to 
> do with left joins.

Before 8.2 the optimizer has no ability to rearrange the order of outer
joins.  Do you have time to try your test case against CVS HEAD?
        regards, tom lane


Re: Using bitmap index scans-more efficient

From
Kyle Bateman
Date:
Tom Lane wrote:

>Kyle Bateman <kyle@actarg.com> writes:
>  
>
>>I'm wondering if this might expose a weakness in the optimizer having to 
>>do with left joins.
>>    
>>
>
>Before 8.2 the optimizer has no ability to rearrange the order of outer
>joins.  Do you have time to try your test case against CVS HEAD?
>  
>

OK, I figured it out--grabbed the latest snapshot (hope that is what you 
need).

My results are similar:

select l.* from ledg_v1 l, proj p where l.proj = p.proj_id and 5 = 
p.par; (24 msec)Nested Loop  (cost=0.00..1991.93 rows=480 width=23)  ->  Nested Loop  (cost=0.00..4.68 rows=6 width=8)
     ->  Seq Scan on acct a  (cost=0.00..1.12 rows=1 width=4)              Filter: ((code)::text = 'ap'::text)
-> Index Scan using i_proj_par on proj p  (cost=0.00..3.49 
 
rows=6 width=4)              Index Cond: (5 = par)  ->  Index Scan using i_ledg_proj on ledg l  (cost=0.00..330.17 
rows=83 width=19)        Index Cond: (l.proj = "outer".proj_id)

select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = 
p.par; (1.25 sec)Hash Join  (cost=4.63..16768.43 rows=480 width=23)  Hash Cond: ("outer".proj = "inner".proj_id)  ->
NestedLoop Left Join  (cost=1.13..14760.13 rows=400000 width=23)        ->  Seq Scan on ledg l  (cost=0.00..6759.00
rows=400000width=19)        ->  Materialize  (cost=1.13..1.14 rows=1 width=4)              ->  Seq Scan on acct a
(cost=0.00..1.12rows=1 width=4)                    Filter: ((code)::text = 'ap'::text)  ->  Hash  (cost=3.49..3.49
rows=6width=4)        ->  Index Scan using i_proj_par on proj p  (cost=0.00..3.49 
 
rows=6 width=4)              Index Cond: (5 = par)



Re: Using bitmap index scans-more efficient

From
Tom Lane
Date:
Kyle Bateman <kyle@actarg.com> writes:
> Tom Lane wrote:
>> Before 8.2 the optimizer has no ability to rearrange the order of outer
>> joins.  Do you have time to try your test case against CVS HEAD?

> OK, I figured it out--grabbed the latest snapshot (hope that is what you 
> need).
> My results are similar:

Are you sure you found a recent version?  I get this from CVS HEAD:

ledger=# explain analyze select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = p.par;
                                    QUERY PLAN                                                            
 

----------------------------------------------------------------------------------------------------------------------------------Nested
LoopLeft Join  (cost=5.79..1386.74 rows=400 width=23) (actual time=0.125..1.543 rows=329 loops=1)  ->  Nested Loop
(cost=4.66..1377.61rows=400 width=19) (actual time=0.109..1.072 rows=329 loops=1)        ->  Index Scan using
i_proj_paron proj p  (cost=0.00..8.41 rows=5 width=4) (actual time=0.023..0.028 rows=4 loops=1)              Index
Cond:(5 = par)        ->  Bitmap Heap Scan on ledg l  (cost=4.66..272.83 rows=81 width=19) (actual time=0.073..0.213
rows=82loops=4)              Recheck Cond: (l.proj = p.proj_id)              ->  Bitmap Index Scan on i_ledg_proj
(cost=0.00..4.66rows=81 width=0) (actual time=0.041..0.041 rows=82 loops=4)                    Index Cond: (l.proj =
p.proj_id) ->  Materialize  (cost=1.13..1.14 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=329)        ->  Seq
Scanon acct a  (cost=0.00..1.12 rows=1 width=4) (actual time=0.009..0.011 rows=1 loops=1)              Filter:
((code)::text= 'ap'::text)Total runtime: 1.696 ms
 
(12 rows)

Yours is doing the left join inside the join to proj, which of course is
terrible because it has to form the whole 400K-row join result.
        regards, tom lane


PG 8.2beta reordering working for this case?

From
Kyle Bateman
Date:
Kyle Bateman wrote:

> Tom Lane wrote:
>
>>
>> Before 8.2 the optimizer has no ability to rearrange the order of outer
>> joins.  Do you have time to try your test case against CVS HEAD?
>>
>>
I've done some more refinement on my accounting ledger system that has
clarified some of the problems I was having with performance joining to
a union.  Here's an interesting test case.  I just tried it with PG
8.2beta1.  I don't claim to understand the new feature of reordering
queries very well, but it seems like this is related to that feature.
Since its still a performance problem in 8.2, I thought this might be a
helpful test case during beta.

To see this demo, run create.sql on a clean database.  It will create
all the needed tables, views and test data.

Then run q1 and q2 which are very efficient.  Then q3 is the slow one.
The reason is, it does the union, producing 300,000 records before
trying to select by project.  It seems like the optimizer should
internally rewrite the query to look more like what is in q4 (which is
very fast).

Is there a way to make the optimizer do this?

Kyle Bateman

-- Make test schema for demonstrating how the postgres optimizer might improve
-- performance on joins with unions

-- Contains a record for each project (job-costing) code
-- Projects are arranged in a hierarchical structure (parent/child)
-- --------------------------------------------------------------
create table proj (
    proj_id        int4 primary key,
    title        varchar,
    par            int4 references proj on update cascade
);
create index i_proj_par on proj (par);

-- Contains a record for every 2 combinations of projects which are related
-- to each other in the hierarchical project tree
-- (parent/child, ancestor/progenitor, etc.)
-- --------------------------------------------------------------
create table proj_rel (
    anst_id        int4 references proj on update cascade on delete cascade,    -- ancestor project number
    prog_id        int4 references proj on update cascade on delete cascade,    -- progenitor project number
    rel            int4,                                -- 0=self, 1=child, 2=grandchild, etc.
    primary key (anst_id, prog_id)
);

-- Contains a record for each account number and an optional alpha code to identify a sub-ledger
-- --------------------------------------------------------------
create table acct (
    acct_id        int4 primary key,    -- account number
    title        varchar,        -- name of the account
    code        varchar            -- alpha code for the account
);
create index i_acct_code on acct (code);

-- Each sub-ledger contains transactions unique to a certain part of the business
-- In addiiton to the standard fields, they all share in common, each sub-ledger
-- contains additional fields that are unique to it (so they can not all be
-- stored in a single table).  In our actual implementation, these sub-ledgers
-- are actually implemented as views joining even lower level tables.
-- --------------------------------------------------------------
create table subledg_A (
    rid            int4 primary key,        -- record ID
    amount        numeric(14,2),
    proj        int4 references proj on update cascade on delete cascade,
    unique_A        varchar                -- some other data
);
create index i_subledg_A_proj on subledg_A (proj);
-- --------------------------------------------------------------
create table subledg_B (
    rid            int4 primary key,        -- record ID
    amount        numeric(14,2),
    proj        int4 references proj on update cascade on delete cascade,
    unique_B        varchar                -- some other data
);
create index i_subledg_B_proj on subledg_B (proj);
-- --------------------------------------------------------------
create table subledg_C (
    rid            int4 primary key,        -- record ID
    amount        numeric(14,2),
    proj        int4 references proj on update cascade on delete cascade,
    unique_C        varchar                -- some other data
);
create index i_subledg_C_proj on subledg_C (proj);

-- These views allow a standard account code to presented in the appropriate ledgers
-- --------------------------------------------------------------
create view subview_A as select
   'AP ' || rid as trans_id,
    l.amount, l.proj,
    a.acct_id as acct
    from    subledg_A    l
        join    acct        a on a.code = 'ap';
-- --------------------------------------------------------------
create view subview_B as select
   'AR ' || rid as trans_id,
    l.amount, l.proj,
    a.acct_id as acct
    from    subledg_B    l
        join    acct        a on a.code = 'ar';
-- --------------------------------------------------------------
create view subview_C as select
   'PR ' || rid as trans_id,
    l.amount, l.proj,
    a.acct_id as acct
    from    subledg_C    l
        join    acct        a on a.code = 'pr';

-- General ledger - this should contain all transactions from all subledgers
-- --------------------------------------------------------------
create view gen_ledg as
    select trans_id, amount, proj, acct from subview_A
    union
    select trans_id, amount, proj, acct from subview_B
    union
    select trans_id, amount, proj, acct from subview_C;

-- Populate the project table:
insert into proj (proj_id,title,par) values ( 1, 'The main parent project', null);
insert into proj (proj_id,title,par) values ( 2, 'First  child of 1', 1);
insert into proj (proj_id,title,par) values ( 3, 'Second child of 1', 1);
insert into proj (proj_id,title,par) values ( 4, 'First  child of 2', 2);
insert into proj (proj_id,title,par) values ( 5, 'second child of 2', 2);
insert into proj (proj_id,title,par) values ( 6, 'First  child of 5', 5);
insert into proj (proj_id,title,par) values ( 7, 'Second child of 5', 5);
insert into proj (proj_id,title,par) values ( 8, 'Third  child of 5', 5);
insert into proj (proj_id,title,par) values ( 9, 'Fourth child of 5', 5);
insert into proj (proj_id,title,par) select *,'Sample project',1 from generate_series(10,5000);

-- Populate the project relationships table:
insert into proj_rel (anst_id,prog_id,rel) select proj_id, proj_id, 0 from proj;        -- self
insert into proj_rel (anst_id,prog_id,rel) select par, proj_id, 1 from proj where proj_id != 1;    -- parents
insert into proj_rel (anst_id,prog_id,rel) select 1, proj_id, 2 from proj where proj_id in (4,5);
insert into proj_rel (anst_id,prog_id,rel) select 1, proj_id, 3 from proj where proj_id in (6,7,8,9);
insert into proj_rel (anst_id,prog_id,rel) select 2, proj_id, 2 from proj where proj_id in (6,7,8,9);

-- Populate the account table:
insert into acct (acct_id,title,code) values ( 100, 'Account 100', 'cash');
insert into acct (acct_id,title,code) values ( 101, 'Account 101', null);
insert into acct (acct_id,title,code) values ( 102, 'Account 102', null);
insert into acct (acct_id,title,code) values ( 103, 'Account 103', 'ar');
insert into acct (acct_id,title,code) values ( 104, 'Account 104', null);
insert into acct (acct_id,title,code) values ( 105, 'Account 105', null);
insert into acct (acct_id,title,code) values ( 106, 'Account 106', 'ap');
insert into acct (acct_id,title,code) values ( 107, 'Account 107', null);
insert into acct (acct_id,title,code) values ( 108, 'Account 108', 'pr');
insert into acct (acct_id,title,code) values ( 109, 'Account 109', null);

insert into subledg_A (rid,amount,proj) select *, (random() * 10000), (random() * 4999) + 1 from
generate_series(1,100000);
insert into subledg_B (rid,amount,proj) select *, (random() * 10000), (random() * 4999) + 1 from
generate_series(1,100000);
insert into subledg_C (rid,amount,proj) select *, (random() * 10000), (random() * 4999) + 1 from
generate_series(1,100000);
vacuum analyze;
-- When querying the general ledger for all transactions belonging to project 5,
-- this first query is very fast.  In spite of the union, the optimizer seems to apply
-- the condition "proj = 5" to the inner tables first, and then append the results

--explain analyze
select * from gen_ledg where proj = 5;
-- It is also very fast when trying to find all transactions for the progeny of 5
-- projects (5,6,7,8,9), but only if you are only querying a sub-ledger directly:

--explain analyze
select lg.*
    from     subledg_A    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj
    where    pr.anst_id = 5;
-- But when searching the general ledger for transactions belonging to the progeny
-- of project 5, the optimizer produces the union first.  It then has to process
-- 300,000 records to find the few it needs (even though the desired fields are
-- indexed in the underlying tables).

--explain analyze
select lg.*
    from     gen_ledg    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj
    where    pr.anst_id = 5;
-- It would be nice if the optimizer could rewrite the query to work like this:
-- (which is very fast)

--explain analyze
select lg.*
    from     subledg_A    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj
    where    pr.anst_id = 5

union select lg.*
    from     subledg_B    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj
    where    pr.anst_id = 5

union select lg.*
    from     subledg_C    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj
    where    pr.anst_id = 5
;

Re: PG 8.2beta reordering working for this case?

From
Tom Lane
Date:
Kyle Bateman <kyle@actarg.com> writes:
> Is there a way to make the optimizer do this?

Sorry, that's not happening for 8.2.  Consider using a union all (not
union) across the subledg_N tables directly and then joining to that.
That boils down to being a partitioning case and I think probably will
be covered by the 8.2 improvements.  There isn't any understanding of
how to commute joins and unions though ... (offhand I'm not even sure
of the conditions under which such a thing would be safe).
        regards, tom lane


Re: PG 8.2beta reordering working for this case?

From
Kyle Bateman
Date:
Tom Lane wrote:

>Kyle Bateman <kyle@actarg.com> writes:
>
>
>>Is there a way to make the optimizer do this?
>>
>>
>
>Sorry, that's not happening for 8.2.  Consider using a union all (not
>union) across the subledg_N tables directly and then joining to that.
>That boils down to being a partitioning case and I think probably will
>be covered by the 8.2 improvements.
>
Yup, union all is much more efficient.  It hadn't really occurred to me
the difference between union and union all.  But it makes sense to
eliminate the need for a unique sort.  The q3 query went from 10 seconds
to 1 second with just the addition of union all in the general ledger.

BTW, explain analyze still says 10 seconds of run time (and takes 10
seconds to run), but when I remove the explain analyze, the query runs
in about a second.  What's that all about?

Also, I came up with the view shown in the attachment.  It is still much
faster than joining to the union-all ledger (40 ms).  I'm not sure why
because I'm not sure if explain analyze is telling me the real story (I
see a sequential scan of the ledgers in there when it runs 10 seconds).
I'm not sure what it's doing when it runs in 1 second.

Kyle

-- This view is a possible workaround for the problem
drop view gen_ledg_pr;

--explain analyze
create view gen_ledg_pr as
  select lg.*, pr.anst_id
    from     subview_A    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj

  union all select lg.*, pr.anst_id
    from     subview_B    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj

  union all select lg.*, pr.anst_id
    from     subview_C    lg
    join    proj_rel    pr    on pr.prog_id = lg.proj
;

Re: PG 8.2beta reordering working for this case?

From
Tom Lane
Date:
Kyle Bateman <kyle@actarg.com> writes:
> BTW, explain analyze still says 10 seconds of run time (and takes 10 
> seconds to run), but when I remove the explain analyze, the query runs 
> in about a second.  What's that all about?

Instrumentation overhead ... you're probably running this on a PC with a
very slow clock-reading capability.  Each node output row counted by
explain analyze takes two gettimeofday() calls, and apparently it's not
unusual for those to take several microseconds on cheap motherboards,
even when the CPU is nominally very fast.
        regards, tom lane