BUG #6075: mergejoin early termination estimates bug or defficiency - Mailing list pgsql-bugs
From | Roman |
---|---|
Subject | BUG #6075: mergejoin early termination estimates bug or defficiency |
Date | |
Msg-id | 201106222235.p5MMZS45031751@wwwmaster.postgresql.org Whole thread Raw |
List | pgsql-bugs |
The following bug has been logged online: Bug reference: 6075 Logged by: Roman Email address: roman_mir@hotmail.com PostgreSQL version: 9.0.3 Operating system: Red Hat 4.4.1-2, 64-bit Description: mergejoin early termination estimates bug or defficiency Details: Entered PG IRC, spoke with RhodiumToad, who advised that this must be reported as a bug or deficiency. http://pgsql.privatepaste.com/09af338ded - the SQL and EXPLAIN ANALYZE output, and http://explain.depesz.com/s/RDN8 - explanation. SQL took 30 seconds to perform, HashAggregate was 8239.96..8240.28 One missing index was added, took runtime to 27 seconds. Eventually executed "set enable_mergejoin=false;" command and runtime of the SQL went down to 38 MILLISECONDS. From 27 seconds down to 38ms with mergejoin off. Tried setting cpu tuple to 0.1, this did not improve the situation, ended up setting mergejoin to off in the postgresql.conf If more details are needed, please contact me at roman_mir@hotmail.com -- In case the paste bin is emptied, here is the SQL: select r.store_id, sum(ri.amount_bought) as count, CASE WHEN sum(least(ri.total_price_before_discount, ri.total_price_after_discount))<0 AND sum(greatest(ri.total_price_before_discount, ri.total_price_after_discount))<0 THEN sum(greatest(ri.total_price_before_discount, ri.total_price_after_discount)) ELSE sum(least(ri.total_price_before_discount, ri.total_price_after_discount)) END as total_sale, CASE WHEN sum(ri.total_discount_amount)<>0 THEN sum(ri.total_discount_amount) ELSE 0 END as total_discount, ri.receipt_id, ri.barcode as barcode, ri.item_id, ri.master_product_id, null as name, null as price from receipt r, receipt_item ri where r.store_id in (0,1,2,3,5,6,7,8,9,10,12,13,14) and r.receipt_id=ri.receipt_id and r.receipt_date>'2011-06-22' and r.receipt_date<'2011-06-24' and ri.receipt_item_id in ( select distinct ri.receipt_item_id from receipt r, receipt_item ri, master_product mp, master_product_label mpl where r.receipt_id=ri.receipt_id and ri.master_product_id=mp.product_id and mp.product_id=mpl.product_id and ( (mpl.label_id= 530)) and r.receipt_date>'2011-06-22' and r.receipt_date<'2011-06-24' ) group by r.store_id, ri.receipt_id, ri.barcode, ri.item_id, ri.master_product_id; Here is the EXPLAIN ANALYZE with mergejoin on: HashAggregate (cost=8579.64..8579.68 rows=1 width=72) (actual time=30901.150..30901.391 rows=352 loops=1) -> Nested Loop (cost=8255.64..8579.61 rows=1 width=72) (actual time=30897.164..30900.399 rows=356 loops=1) -> Nested Loop (cost=8255.64..8547.23 rows=32 width=68) (actual time=30897.138..30898.823 rows=356 loops=1) -> HashAggregate (cost=8255.64..8255.96 rows=32 width=8) (actual time=30897.127..30897.252 rows=356 loops=1) -> Merge Join (cost=1165.04..8255.56 rows=32 width=8) (actual time=62.582..30896.810 rows=356 loops=1) Merge Cond: (ri.master_product_id = mpl.product_id) -> Merge Join (cost=3.19..243053.40 rows=1099 width=24) (actual time=0.038..30890.508 rows=13204 loops=1) Merge Cond: (ri.master_product_id = mp.product_id) -> Nested Loop (cost=0.00..8099569.68 rows=37762 width=16) (actual time=0.032..30852.168 rows=13204 loops=1) -> Index Scan using receipt_item_master_product_idx on receipt_item ri (cost=0.00..729693.91 rows=7400642 width=24) (actual time=0.014..4165.121 rows=7398781 loops=1) -> Index Scan using receipt_pkey on receipt r (cost=0.00..0.98 rows=1 width=8) (actual time=0.003..0.003 rows=0 loops=7398781) Index Cond: (r.receipt_id = ri.receipt_id) Filter: ((r.receipt_date > '2011-06-22 00:00:00'::timestamp without time zone) AND (r.receipt_date < '2011-06-24 00:00:00'::timestamp without time zone)) -> Index Scan using master_product_pkey on master_product mp (cost=0.00..7229.15 rows=34421 width=8) (actual time=0.004..24.787 rows=41681 loops=1) -> Sort (cost=1161.84..1168.42 rows=2629 width=8) (actual time=3.429..3.698 rows=2884 loops=1) Sort Key: mpl.product_id Sort Method: quicksort Memory: 224kB -> Bitmap Heap Scan on master_product_label mpl (cost=60.65..1012.51 rows=2629 width=8) (actual time=0.475..1.615 rows=2715 loops=1) Recheck Cond: (label_id = 530) -> Bitmap Index Scan on master_product_label_labelid_idx (cost=0.00..59.99 rows=2629 width=0) (actual time=0.398..0.398 rows=2715 loops=1) Index Cond: (label_id = 530) -> Index Scan using receipt_item_pkey on receipt_item ri (cost=0.00..9.08 rows=1 width=76) (actual time=0.004..0.004 rows=1 loops=356) Index Cond: (ri.receipt_item_id = ri.receipt_item_id) -> Index Scan using receipt_pkey on receipt r (cost=0.00..1.00 rows=1 width=12) (actual time=0.003..0.004 rows=1 loops=356) Index Cond: (r.receipt_id = ri.receipt_id) Filter: ((r.receipt_date > '2011-06-22 00:00:00'::timestamp without time zone) AND (r.receipt_date < '2011-06-24 00:00:00'::timestamp without time zone) AND (r.store_id = ANY ('{0,1,2,3,5,6,7,8,9,10,12,13,14}'::integer[]))) Total runtime: 30901.690 ms Here is the EXPLAIN ANALYZE with mergejoin off: HashAggregate (cost=156852.07..156852.11 rows=1 width=72) (actual time=38.404..38.647 rows=352 loops=1) -> Nested Loop (cost=156536.82..156852.04 rows=1 width=72) (actual time=34.483..37.646 rows=356 loops=1) -> Nested Loop (cost=156536.82..156828.40 rows=32 width=68) (actual time=34.464..36.153 rows=356 loops=1) -> HashAggregate (cost=156536.82..156537.14 rows=32 width=8) (actual time=34.454..34.550 rows=356 loops=1) -> Nested Loop (cost=1189.44..156536.74 rows=32 width=8) (actual time=3.530..34.244 rows=356 loops=1) -> Hash Join (cost=1189.44..156185.25 rows=1097 width=24) (actual time=3.522..32.878 rows=356 loops=1) Hash Cond: (ri.master_product_id = mpl.product_id) -> Nested Loop (cost=144.07..153714.67 rows=37713 width=16) (actual time=0.766..27.303 rows=13204 loops=1) -> Bitmap Heap Scan on receipt r (cost=144.07..12325.61 rows=3819 width=8) (actual time=0.753..2.032 rows=4435 loops=1) Recheck Cond: ((receipt_date > '2011-06-22 00:00:00'::timestamp without time zone) AND (receipt_date < '2011-06-24 00:00:00'::timestamp without time zone)) -> Bitmap Index Scan on receipt_receipt_date_idx (cost=0.00..143.12 rows=3819 width=0) (actual time=0.691..0.691 rows=4435 loops=1) Index Cond: ((receipt_date > '2011-06-22 00:00:00'::timestamp without time zone) AND (receipt_date < '2011-06-24 00:00:00'::timestamp without time zone)) -> Index Scan using receipt_item_receiptid_idx on receipt_item ri (cost=0.00..36.90 rows=10 width=24) (actual time=0.003..0.005 rows=3 loops=4435) Index Cond: (ri.receipt_id = r.receipt_id) -> Hash (cost=1012.51..1012.51 rows=2629 width=8) (actual time=2.333..2.333 rows=2715 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 107kB -> Bitmap Heap Scan on master_product_label mpl (cost=60.65..1012.51 rows=2629 width=8) (actual time=0.478..1.689 rows=2715 loops=1) Recheck Cond: (label_id = 530) -> Bitmap Index Scan on master_product_label_labelid_idx (cost=0.00..59.99 rows=2629 width=0) (actual time=0.407..0.407 rows=2715 loops=1) Index Cond: (label_id = 530) -> Index Scan using master_product_pkey on master_product mp (cost=0.00..0.31 rows=1 width=8) (actual time=0.003..0.003 rows=1 loops=356) Index Cond: (mp.product_id = ri.master_product_id) -> Index Scan using receipt_item_pkey on receipt_item ri (cost=0.00..9.08 rows=1 width=76) (actual time=0.004..0.004 rows=1 loops=356) Index Cond: (ri.receipt_item_id = ri.receipt_item_id) -> Index Scan using receipt_receiptid_receiptdate_idx on receipt r (cost=0.00..0.73 rows=1 width=12) (actual time=0.003..0.004 rows=1 loops=356) Index Cond: ((r.receipt_id = ri.receipt_id) AND (r.receipt_date > '2011-06-22 00:00:00'::timestamp without time zone) AND (r.receipt_date < '2011-06-24 00:00:00'::timestamp without time zone)) Filter: (r.store_id = ANY ('{0,1,2,3,5,6,7,8,9,10,12,13,14}'::integer[])) Total runtime: 38.820 ms Thank you, Roman
pgsql-bugs by date: