Thread: Using bitmap index scans-more efficient
How can I use bitmap index scans more effectively? (version 8.1.0) I have a financial ledger (actually a view, grouping several other tables) containing about a million records. Each record contains an account code and a project code. I can query for all the transactions belonging to any single project code and it is very fast and efficient (milliseconds/project). But projects are organized in a hierarchical structure, so I also need to query the ledger for transactions belonging to a particular project and/or all its progeny. Depending on the method, this is taking several seconds to several minutes per project. For testing purposes, I'll present results using a smaller version of the ledger with the following query times: It is most efficient to enumerate the group of projects using "in" (0.144 seconds) select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935); ---------------------------------------------------------------------------Nested Loop Left Join (cost=19.73..4164.10 rows=7width=85) -> Nested Loop (cost=19.73..4139.08 rows=7 width=81) -> Nested Loop (cost=19.73..4100.07 rows=7width=63) -> Bitmap Heap Scan on apinv_items i (cost=19.73..1185.71 rows=487 width=55) Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR(proj = 4933) OR (proj = 4934 ) OR (proj = 4935)) Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) -> BitmapOr (cost=19.73..19.73 rows=495 width=0) -> Bitmap IndexScan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4737) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4789) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19rows=55 width=0) Index Cond: (proj = 4892) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj= 4893) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4894) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4895) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4933) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55width=0) Index Cond: (proj = 4934) -> Bitmap Index Scan oni_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4935) -> Index Scan using apinv_hdr_pkey on apinv_hdr h (cost=0.00..5.97 rows=1 width=21) Index Cond:(("outer".vendid = h.vendid) AND (("outer".invnum)::text = (h.invnum)::text)) -> Index Scan using vend_org_pkeyon vend_org v (cost=0.00..5.56 rows=1 width=26) Index Cond: (v.org_id = "outer".vendid) -> SeqScan on acct a (cost=0.00..3.54 rows=1 width=4) Filter: ((code)::text = 'ap'::text) --------------------------------------------------------------------------- Problem is, the project list has to be hard-coded into the SQL statement. What I really need is transactions belonging to "project 4737 and all its progeny." So I've tried using a many-to-many table proj_prog that describes which projects are progeny of which other projects. Unfortunately, the query time then goes up by a factor of 6 (to 0.85 seconds). Examples: select * from ledger where proj = any (array(select prog_id from proj_prog where proj_id = 4737)); select * fromledger where proj = any (array[4737,4789,4892,4893,4894,4895,4933,4934,4935]);" ---------------------------------------------------------------------------Nested Loop Left Join (cost=13584.99..17647.39rows=850 width=85) InitPlan -> Index Scan using proj_prog_pkey on proj_prog (cost=0.00..38.04rows=21 width=4) Index Cond: (proj_id = 4737) -> Merge Join (cost=13543.42..17565.44 rows=850width=81) Merge Cond: ("outer".vendid = "inner".org_id) -> Merge Join (cost=13543.42..17405.05 rows=850width=63) Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?")) -> Index Scan using apinv_hdr_pkey on apinv_hdr h (cost=0.00..3148.16 rows=51016 width=21) -> Sort (cost=13543.42..13693.47 rows=60020 width=55) Sort Key: i.vendid, (i.invnum)::text -> Seq Scan on apinv_items i (cost=0.00..7197.27 rows=60020 width=55) Filter: (((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) AND (proj = ANY ($0))) -> Index Scan using vend_org_pkey on vend_org v (cost=0.00..145.52 rows=1799 width=26) -> Materialize (cost=3.54..3.55rows=1 width=4) -> Seq Scan on acct a (cost=0.00..3.54 rows=1 width=4) Filter: ((code)::text= 'ap'::text) --------------------------------------------------------------------------- The worst case is the following types of queries (about 5 seconds): select * from ledger where proj in (select prog_id from proj_prog where proj_id = 4737); select l.* from ledger l, proj_progp where l.proj = p.prog_id and p.proj_id = 4737; ---------------------------------------------------------------------------Hash Join (cost=19032.47..23510.23 rows=6 width=85) Hash Cond: ("outer".proj = "inner".prog_id) -> Nested Loop Left Join (cost=18994.38..23378.41 rows=1700 width=85) -> Hash Join (cost=18990.84..23340.87 rows=1700 width=81) Hash Cond: ("outer".vendid = "inner".org_id) -> Merge Join (cost=18935.35..23255.64 rows=1700 width=63) Merge Cond: (("outer".vendid= "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?")) -> Index Scanusing apinv_hdr_pkey on apinv_hdr h (cost=0.00..3148.16 rows=51016 width=21) -> Sort (cost=18935.35..19235.45rows=120041 width=55) Sort Key: i.vendid, (i.invnum)::text -> Seq Scan on apinv_items i (cost=0.00..4152.99 rows=120041 width=55) Filter:((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) -> Hash (cost=50.99..50.99rows=1799 width=26) -> Seq Scan on vend_org v (cost=0.00..50.99 rows=1799 width=26) -> Materialize (cost=3.54..3.55 rows=1 width=4) -> Seq Scan on acct a (cost=0.00..3.54 rows=1width=4) Filter: ((code)::text = 'ap'::text) -> Hash (cost=38.04..38.04 rows=21 width=4) -> Index Scan using proj_prog_pkey on proj_prog p (cost=0.00..38.04 rows=21 width=4) Index Cond: (proj_id= 4737) --------------------------------------------------------------------------- I would like to be able to get the best performance like in the first query but without having to enumerate the projects (i.e. using a single query). The secret seems to be the bitmap index scans. Any ideas about whether/how this can be done? Thanks! Kyle Bateman --------------------------------------------------------------------------- BTW, The ledger view is built roughly as follows: create view rp_v_api as select h.adate as adate, (i.price * i.quant)::numeric(14,2) as amount, substring(v.org_name from 1 for 40) as descr, i.proj as proj, i.acct as acct, 1 as cr_proj, a.acct_id as cr_acct from ( apinv_hdr h join apinv_items i on i.vendid = h.vendid and i.invnum =h.invnum join vend_org v on v.org_id = h.vendid left join acct a on a.code = 'ap' ) where i.status in ('en','cl','pd');
* Kyle Bateman: > Any ideas about whether/how this can be done? If the project tree is fairly consistent, it's convenient to encode it using intervals instead of parent/child intervals. IIRC, Celko's "SQL for smarties" explains how to do this, and Kristian Koehntopp has written some PHP code to implement it. -- Florian Weimer <fweimer@bfk.de> BFK edv-consulting GmbH http://www.bfk.de/ Durlacher Allee 47 tel: +49-721-96201-1 D-76131 Karlsruhe fax: +49-721-96201-99
Florian Weimer wrote: >* Kyle Bateman: > > > >>Any ideas about whether/how this can be done? >> >> > >If the project tree is fairly consistent, it's convenient to encode it >using intervals instead of parent/child intervals. IIRC, Celko's "SQL >for smarties" explains how to do this, and Kristian Koehntopp has >written some PHP code to implement it. > > > I agree that this produces a more efficient query for finding the projects that are the progeny of another project, but I'm trying to figure out how that helps me select the right project transactions from my ledger efficiently. This query produces wonderful results (very fast): select * from ledger where proj >= 4737 and proj <= 4740; But I'm assuming that using an interval-encoded project tree, I would have to do something like the following to get a progency group: select * from ledger l, proj p where p.proj_id = l.proj and p.left > 1234 and p.right < 2345; The problem (at least according to my initial testing) is that this forces a join of the entire ledger and I get my slowest performance group (5 seconds). What am I missing? Kyle
Kyle Bateman <kyle@actarg.com> writes: > But I'm assuming that using an interval-encoded project tree, I would > have to do something like the following to get a progency group: > select * from ledger l, proj p where p.proj_id = l.proj and p.left > > 1234 and p.right < 2345; btree has no idea about the constraint (that I imagine exists) that left <= right. If you're just doing a simple index on (left, right) then the above query requires scanning all index entries with left > 1234. It would probably help to say select * from ledger l, proj p where p.proj_id = l.proj and p.left > 1234 and p.left < 2345 and p.right < 2345; so that you can constrain the range of "left" values scanned. regards, tom lane
Tom Lane wrote: >Kyle Bateman <kyle@actarg.com> writes: > > >>But I'm assuming that using an interval-encoded project tree, I would >>have to do something like the following to get a progency group: >> >> > > > >>select * from ledger l, proj p where p.proj_id = l.proj and p.left > >>1234 and p.right < 2345; >> >> > >btree has no idea about the constraint (that I imagine exists) that left ><= right. If you're just doing a simple index on (left, right) then the >above query requires scanning all index entries with left > 1234. It >would probably help to say > >select * from ledger l, proj p where p.proj_id = l.proj and > p.left > 1234 and p.left < 2345 and p.right < 2345; > >so that you can constrain the range of "left" values scanned. > > Thanks for the replies, Tom and Florian. My problem is not that it is difficult (or costly) to determine the progeny of a given project. I can determine this in about 90 msec regardless of whether I use an adjacency model, interval-encoding, or materialized path (current implementation). The problem is, when I try to extract the ledger entries belonging to that progeny from a set of a million records, it seems to need to process all million records rather than being able to index right into the ones I want. I'm not very good at reading explain output, but I tried to set up the query Tom suggests by creating an interval-encoded project table (proj_int) and then joining it to my ledger like so: select l.* from ledger l, proj_int i where l.proj = i.proj_id and i.lft >= 5283 and i.lft < 5300 and i.rgt <= 5300; On my mini-test-ledger of 100,000 entries, this takes the longest time (5 seconds) with the following explain output: Hash Join (cost=19018.46..23411.52 rows=14 width=85) Hash Cond: ("outer".proj = "inner".proj_id) -> Nested Loop LeftJoin (cost=18994.38..23378.41 rows=1700 width=85) -> Hash Join (cost=18990.84..23340.87 rows=1700 width=81) Hash Cond: ("outer".vendid = "inner".org_id) -> Merge Join (cost=18935.35..23255.64rows=1700 width=63) Merge Cond: (("outer".vendid = "inner".vendid) AND (("outer".invnum)::text = "inner"."?column10?")) -> Index Scan using apinv_hdr_pkey on apinv_hdr h (cost=0.00..3148.16 rows=51016 width=21) -> Sort (cost=18935.35..19235.45 rows=120041 width=55) Sort Key: i.vendid, (i.invnum)::text -> Seq Scan on apinv_itemsi (cost=0.00..4152.99 rows=120041 width=55) Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) -> Hash (cost=50.99..50.99 rows=1799 width=26) -> Seq Scan on vend_org v (cost=0.00..50.99 rows=1799 width=26) -> Materialize (cost=3.54..3.55 rows=1 width=4) -> Seq Scan on acct a (cost=0.00..3.54rows=1 width=4) Filter: ((code)::text = 'ap'::text) -> Hash (cost=24.06..24.06 rows=10width=4) -> Bitmap Heap Scan on proj_int i (cost=2.26..24.06 rows=10 width=4) Recheck Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300)) -> Bitmap Index Scan on i_proj_int_lft_rgt (cost=0.00..2.26 rows=10 width=0) Index Cond: ((lft >= 5283) AND (lft < 5300) AND (rgt <= 5300)) That is roughly equivalent to my materialized path method: select l.* from ledger l where l.proj in (select proj_id from proj_v where 4737 = any(ppath)); And is quite slow compared to 150 msec when enumerating the progeny projects like so: select l.* from ledger l where l.proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935); Nested Loop Left Join (cost=19.73..4164.10 rows=7 width=85) -> Nested Loop (cost=19.73..4139.08 rows=7 width=81) -> Nested Loop (cost=19.73..4100.07 rows=7 width=63) -> Bitmap Heap Scan on apinv_items i (cost=19.73..1185.71 rows=487 width=55) Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR (proj = 4933) OR (proj = 4934) OR (proj = 4935)) Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) -> BitmapOr (cost=19.73..19.73 rows=495 width=0) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4737) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4789) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4892) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4893) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4894) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4895) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4933) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4934) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4935) -> Index Scan usingapinv_hdr_pkey on apinv_hdr h (cost=0.00..5.97 rows=1 width=21) Index Cond: (("outer".vendid = h.vendid) AND (("outer".invnum)::text = (h.invnum)::text)) -> Index Scan using vend_org_pkey on vend_org v (cost=0.00..5.56 rows=1 width=26) Index Cond: (v.org_id = "outer".vendid) Nested Loop Left Join (cost=19.73..4164.10 rows=7 width=85) -> Nested Loop (cost=19.73..4139.08 rows=7 width=81) -> Nested Loop (cost=19.73..4100.07 rows=7 width=63) -> Bitmap Heap Scan on apinv_items i (cost=19.73..1185.71 rows=487 width=55) Recheck Cond: ((proj = 4737) OR (proj = 4789) OR (proj = 4892) OR (proj = 4893) OR (proj = 4894) OR (proj = 4895) OR (proj = 4933) OR (proj = 4934) OR (proj = 4935)) Filter: ((status = 'en'::bpchar) OR (status = 'cl'::bpchar) OR (status = 'pd'::bpchar)) -> BitmapOr (cost=19.73..19.73 rows=495 width=0) -> Bitmap Index Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4737) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4789) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4892) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4893) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4894) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4895) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4933) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4934) -> BitmapIndex Scan on i_apinv_items_proj (cost=0.00..2.19 rows=55 width=0) Index Cond: (proj = 4935) -> Index Scan usingapinv_hdr_pkey on apinv_hdr h (cost=0.00..5.97 rows=1 width=21) Index Cond: (("outer".vendid = h.vendid) AND (("outer".invnum)::text = (h.invnum)::text)) -> Index Scan using vend_org_pkey on vend_org v (cost=0.00..5.56 rows=1 width=26) Index Cond: (v.org_id = "outer".vendid) -> Seq Scan on acct a (cost=0.00..3.54rows=1 width=4) Filter: ((code)::text = 'ap'::text) -> Seq Scan on acct a (cost=0.00..3.54 rows=1width=4) Filter: ((code)::text = 'ap'::text) I'm still stumped...
Tom Lane wrote: >Kyle Bateman <kyle@actarg.com> writes: > > >>But I'm assuming that using an interval-encoded project tree, I would >>have to do something like the following to get a progency group: >> >> >>select * from ledger l, proj p where p.proj_id = l.proj and p.left > >>1234 and p.right < 2345; >> >> Here's an interesting result: I created a function proj_left(int4) that returns the left interval number for a given project. Then I created an index on the underlying table for the ledger view(which took forever to build) like so: create index i_test on apinv_items (proj_left(proj)); Now my query: select * from ledger where proj_left(dr_proj) >= 5283 and proj_left(dr_proj) < 5300; is very speedy. Problem is, I had to mark the function proj_left() as immutable, which it can not be since the left and right values for a given project will change any time a project is added, removed, or moved around the hierarchy :( So is there any good way to tell the planner to do several individual index scans for the projects involved in the desired progeny, or the results together and return the result? This is what it seems to be choosing in the case of the query: select * from ledger where proj in (4737,4789,4892,4893,4894,4895,4933,4934,4935);
Kyle Bateman <kyle@actarg.com> writes: > I'm wondering if this might expose a weakness in the optimizer having to > do with left joins. Before 8.2 the optimizer has no ability to rearrange the order of outer joins. Do you have time to try your test case against CVS HEAD? regards, tom lane
Tom Lane wrote: >Kyle Bateman <kyle@actarg.com> writes: > > >>I'm wondering if this might expose a weakness in the optimizer having to >>do with left joins. >> >> > >Before 8.2 the optimizer has no ability to rearrange the order of outer >joins. Do you have time to try your test case against CVS HEAD? > > OK, I figured it out--grabbed the latest snapshot (hope that is what you need). My results are similar: select l.* from ledg_v1 l, proj p where l.proj = p.proj_id and 5 = p.par; (24 msec)Nested Loop (cost=0.00..1991.93 rows=480 width=23) -> Nested Loop (cost=0.00..4.68 rows=6 width=8) -> Seq Scan on acct a (cost=0.00..1.12 rows=1 width=4) Filter: ((code)::text = 'ap'::text) -> Index Scan using i_proj_par on proj p (cost=0.00..3.49 rows=6 width=4) Index Cond: (5 = par) -> Index Scan using i_ledg_proj on ledg l (cost=0.00..330.17 rows=83 width=19) Index Cond: (l.proj = "outer".proj_id) select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = p.par; (1.25 sec)Hash Join (cost=4.63..16768.43 rows=480 width=23) Hash Cond: ("outer".proj = "inner".proj_id) -> NestedLoop Left Join (cost=1.13..14760.13 rows=400000 width=23) -> Seq Scan on ledg l (cost=0.00..6759.00 rows=400000width=19) -> Materialize (cost=1.13..1.14 rows=1 width=4) -> Seq Scan on acct a (cost=0.00..1.12rows=1 width=4) Filter: ((code)::text = 'ap'::text) -> Hash (cost=3.49..3.49 rows=6width=4) -> Index Scan using i_proj_par on proj p (cost=0.00..3.49 rows=6 width=4) Index Cond: (5 = par)
Kyle Bateman <kyle@actarg.com> writes: > Tom Lane wrote: >> Before 8.2 the optimizer has no ability to rearrange the order of outer >> joins. Do you have time to try your test case against CVS HEAD? > OK, I figured it out--grabbed the latest snapshot (hope that is what you > need). > My results are similar: Are you sure you found a recent version? I get this from CVS HEAD: ledger=# explain analyze select l.* from ledg_v2 l, proj p where l.proj = p.proj_id and 5 = p.par; QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------Nested LoopLeft Join (cost=5.79..1386.74 rows=400 width=23) (actual time=0.125..1.543 rows=329 loops=1) -> Nested Loop (cost=4.66..1377.61rows=400 width=19) (actual time=0.109..1.072 rows=329 loops=1) -> Index Scan using i_proj_paron proj p (cost=0.00..8.41 rows=5 width=4) (actual time=0.023..0.028 rows=4 loops=1) Index Cond:(5 = par) -> Bitmap Heap Scan on ledg l (cost=4.66..272.83 rows=81 width=19) (actual time=0.073..0.213 rows=82loops=4) Recheck Cond: (l.proj = p.proj_id) -> Bitmap Index Scan on i_ledg_proj (cost=0.00..4.66rows=81 width=0) (actual time=0.041..0.041 rows=82 loops=4) Index Cond: (l.proj = p.proj_id) -> Materialize (cost=1.13..1.14 rows=1 width=4) (actual time=0.000..0.000 rows=1 loops=329) -> Seq Scanon acct a (cost=0.00..1.12 rows=1 width=4) (actual time=0.009..0.011 rows=1 loops=1) Filter: ((code)::text= 'ap'::text)Total runtime: 1.696 ms (12 rows) Yours is doing the left join inside the join to proj, which of course is terrible because it has to form the whole 400K-row join result. regards, tom lane
Kyle Bateman wrote: > Tom Lane wrote: > >> >> Before 8.2 the optimizer has no ability to rearrange the order of outer >> joins. Do you have time to try your test case against CVS HEAD? >> >> I've done some more refinement on my accounting ledger system that has clarified some of the problems I was having with performance joining to a union. Here's an interesting test case. I just tried it with PG 8.2beta1. I don't claim to understand the new feature of reordering queries very well, but it seems like this is related to that feature. Since its still a performance problem in 8.2, I thought this might be a helpful test case during beta. To see this demo, run create.sql on a clean database. It will create all the needed tables, views and test data. Then run q1 and q2 which are very efficient. Then q3 is the slow one. The reason is, it does the union, producing 300,000 records before trying to select by project. It seems like the optimizer should internally rewrite the query to look more like what is in q4 (which is very fast). Is there a way to make the optimizer do this? Kyle Bateman -- Make test schema for demonstrating how the postgres optimizer might improve -- performance on joins with unions -- Contains a record for each project (job-costing) code -- Projects are arranged in a hierarchical structure (parent/child) -- -------------------------------------------------------------- create table proj ( proj_id int4 primary key, title varchar, par int4 references proj on update cascade ); create index i_proj_par on proj (par); -- Contains a record for every 2 combinations of projects which are related -- to each other in the hierarchical project tree -- (parent/child, ancestor/progenitor, etc.) -- -------------------------------------------------------------- create table proj_rel ( anst_id int4 references proj on update cascade on delete cascade, -- ancestor project number prog_id int4 references proj on update cascade on delete cascade, -- progenitor project number rel int4, -- 0=self, 1=child, 2=grandchild, etc. primary key (anst_id, prog_id) ); -- Contains a record for each account number and an optional alpha code to identify a sub-ledger -- -------------------------------------------------------------- create table acct ( acct_id int4 primary key, -- account number title varchar, -- name of the account code varchar -- alpha code for the account ); create index i_acct_code on acct (code); -- Each sub-ledger contains transactions unique to a certain part of the business -- In addiiton to the standard fields, they all share in common, each sub-ledger -- contains additional fields that are unique to it (so they can not all be -- stored in a single table). In our actual implementation, these sub-ledgers -- are actually implemented as views joining even lower level tables. -- -------------------------------------------------------------- create table subledg_A ( rid int4 primary key, -- record ID amount numeric(14,2), proj int4 references proj on update cascade on delete cascade, unique_A varchar -- some other data ); create index i_subledg_A_proj on subledg_A (proj); -- -------------------------------------------------------------- create table subledg_B ( rid int4 primary key, -- record ID amount numeric(14,2), proj int4 references proj on update cascade on delete cascade, unique_B varchar -- some other data ); create index i_subledg_B_proj on subledg_B (proj); -- -------------------------------------------------------------- create table subledg_C ( rid int4 primary key, -- record ID amount numeric(14,2), proj int4 references proj on update cascade on delete cascade, unique_C varchar -- some other data ); create index i_subledg_C_proj on subledg_C (proj); -- These views allow a standard account code to presented in the appropriate ledgers -- -------------------------------------------------------------- create view subview_A as select 'AP ' || rid as trans_id, l.amount, l.proj, a.acct_id as acct from subledg_A l join acct a on a.code = 'ap'; -- -------------------------------------------------------------- create view subview_B as select 'AR ' || rid as trans_id, l.amount, l.proj, a.acct_id as acct from subledg_B l join acct a on a.code = 'ar'; -- -------------------------------------------------------------- create view subview_C as select 'PR ' || rid as trans_id, l.amount, l.proj, a.acct_id as acct from subledg_C l join acct a on a.code = 'pr'; -- General ledger - this should contain all transactions from all subledgers -- -------------------------------------------------------------- create view gen_ledg as select trans_id, amount, proj, acct from subview_A union select trans_id, amount, proj, acct from subview_B union select trans_id, amount, proj, acct from subview_C; -- Populate the project table: insert into proj (proj_id,title,par) values ( 1, 'The main parent project', null); insert into proj (proj_id,title,par) values ( 2, 'First child of 1', 1); insert into proj (proj_id,title,par) values ( 3, 'Second child of 1', 1); insert into proj (proj_id,title,par) values ( 4, 'First child of 2', 2); insert into proj (proj_id,title,par) values ( 5, 'second child of 2', 2); insert into proj (proj_id,title,par) values ( 6, 'First child of 5', 5); insert into proj (proj_id,title,par) values ( 7, 'Second child of 5', 5); insert into proj (proj_id,title,par) values ( 8, 'Third child of 5', 5); insert into proj (proj_id,title,par) values ( 9, 'Fourth child of 5', 5); insert into proj (proj_id,title,par) select *,'Sample project',1 from generate_series(10,5000); -- Populate the project relationships table: insert into proj_rel (anst_id,prog_id,rel) select proj_id, proj_id, 0 from proj; -- self insert into proj_rel (anst_id,prog_id,rel) select par, proj_id, 1 from proj where proj_id != 1; -- parents insert into proj_rel (anst_id,prog_id,rel) select 1, proj_id, 2 from proj where proj_id in (4,5); insert into proj_rel (anst_id,prog_id,rel) select 1, proj_id, 3 from proj where proj_id in (6,7,8,9); insert into proj_rel (anst_id,prog_id,rel) select 2, proj_id, 2 from proj where proj_id in (6,7,8,9); -- Populate the account table: insert into acct (acct_id,title,code) values ( 100, 'Account 100', 'cash'); insert into acct (acct_id,title,code) values ( 101, 'Account 101', null); insert into acct (acct_id,title,code) values ( 102, 'Account 102', null); insert into acct (acct_id,title,code) values ( 103, 'Account 103', 'ar'); insert into acct (acct_id,title,code) values ( 104, 'Account 104', null); insert into acct (acct_id,title,code) values ( 105, 'Account 105', null); insert into acct (acct_id,title,code) values ( 106, 'Account 106', 'ap'); insert into acct (acct_id,title,code) values ( 107, 'Account 107', null); insert into acct (acct_id,title,code) values ( 108, 'Account 108', 'pr'); insert into acct (acct_id,title,code) values ( 109, 'Account 109', null); insert into subledg_A (rid,amount,proj) select *, (random() * 10000), (random() * 4999) + 1 from generate_series(1,100000); insert into subledg_B (rid,amount,proj) select *, (random() * 10000), (random() * 4999) + 1 from generate_series(1,100000); insert into subledg_C (rid,amount,proj) select *, (random() * 10000), (random() * 4999) + 1 from generate_series(1,100000); vacuum analyze; -- When querying the general ledger for all transactions belonging to project 5, -- this first query is very fast. In spite of the union, the optimizer seems to apply -- the condition "proj = 5" to the inner tables first, and then append the results --explain analyze select * from gen_ledg where proj = 5; -- It is also very fast when trying to find all transactions for the progeny of 5 -- projects (5,6,7,8,9), but only if you are only querying a sub-ledger directly: --explain analyze select lg.* from subledg_A lg join proj_rel pr on pr.prog_id = lg.proj where pr.anst_id = 5; -- But when searching the general ledger for transactions belonging to the progeny -- of project 5, the optimizer produces the union first. It then has to process -- 300,000 records to find the few it needs (even though the desired fields are -- indexed in the underlying tables). --explain analyze select lg.* from gen_ledg lg join proj_rel pr on pr.prog_id = lg.proj where pr.anst_id = 5; -- It would be nice if the optimizer could rewrite the query to work like this: -- (which is very fast) --explain analyze select lg.* from subledg_A lg join proj_rel pr on pr.prog_id = lg.proj where pr.anst_id = 5 union select lg.* from subledg_B lg join proj_rel pr on pr.prog_id = lg.proj where pr.anst_id = 5 union select lg.* from subledg_C lg join proj_rel pr on pr.prog_id = lg.proj where pr.anst_id = 5 ;
Kyle Bateman <kyle@actarg.com> writes: > Is there a way to make the optimizer do this? Sorry, that's not happening for 8.2. Consider using a union all (not union) across the subledg_N tables directly and then joining to that. That boils down to being a partitioning case and I think probably will be covered by the 8.2 improvements. There isn't any understanding of how to commute joins and unions though ... (offhand I'm not even sure of the conditions under which such a thing would be safe). regards, tom lane
Tom Lane wrote: >Kyle Bateman <kyle@actarg.com> writes: > > >>Is there a way to make the optimizer do this? >> >> > >Sorry, that's not happening for 8.2. Consider using a union all (not >union) across the subledg_N tables directly and then joining to that. >That boils down to being a partitioning case and I think probably will >be covered by the 8.2 improvements. > Yup, union all is much more efficient. It hadn't really occurred to me the difference between union and union all. But it makes sense to eliminate the need for a unique sort. The q3 query went from 10 seconds to 1 second with just the addition of union all in the general ledger. BTW, explain analyze still says 10 seconds of run time (and takes 10 seconds to run), but when I remove the explain analyze, the query runs in about a second. What's that all about? Also, I came up with the view shown in the attachment. It is still much faster than joining to the union-all ledger (40 ms). I'm not sure why because I'm not sure if explain analyze is telling me the real story (I see a sequential scan of the ledgers in there when it runs 10 seconds). I'm not sure what it's doing when it runs in 1 second. Kyle -- This view is a possible workaround for the problem drop view gen_ledg_pr; --explain analyze create view gen_ledg_pr as select lg.*, pr.anst_id from subview_A lg join proj_rel pr on pr.prog_id = lg.proj union all select lg.*, pr.anst_id from subview_B lg join proj_rel pr on pr.prog_id = lg.proj union all select lg.*, pr.anst_id from subview_C lg join proj_rel pr on pr.prog_id = lg.proj ;
Kyle Bateman <kyle@actarg.com> writes: > BTW, explain analyze still says 10 seconds of run time (and takes 10 > seconds to run), but when I remove the explain analyze, the query runs > in about a second. What's that all about? Instrumentation overhead ... you're probably running this on a PC with a very slow clock-reading capability. Each node output row counted by explain analyze takes two gettimeofday() calls, and apparently it's not unusual for those to take several microseconds on cheap motherboards, even when the CPU is nominally very fast. regards, tom lane