Improving Query - Mailing list pgsql-performance
From | Ketema |
---|---|
Subject | Improving Query |
Date | |
Msg-id | 1193746737.222960.206150@50g2000hsm.googlegroups.com Whole thread Raw |
Responses |
Re: Improving Query
Re: Improving Query |
List | pgsql-performance |
I have the following query that is a part of a function: select order_details.tpv_success_id, order_details.tpv_id, order_details.ver_code, order_details.app_id, order_details.acct_id, order_details.first_name || ' ' || order_details.last_name as customer_name, order_details.order_date as creation_date, verification_success.checked_out, order_details.csr_name, order_details.products, order_details.promotion, order_details.division_id, order_details.has_dish_billing_info, (select array_upper(acct_product_data_requirements_details.acct_prod_ids, 1)) as num_prods, count(distinct case when provision_issues.account_product_id is not null and provision_issues.resolved_date is null then provision_issues.account_product_id end ) as num_open_issues, count(case when reconciliations.rec_id is not null and reconciliation_cancels.rec_cancel_id is null then reconciliations.rec_id end ) as num_provisioned, count(case when reconciliation_cancels.rec_cancel_id is not null then reconciliation_cancels.rec_cancel_id end ) as num_canceled from frontier.order_details inner join frontier.verification_success on order_details.tpv_success_id = verification_success.tpv_success_id inner join frontier.acct_product_data_requirements_details left outer join frontier.provision_issues on provision_issues.account_product_id = any(acct_product_data_requirements_details.acct_prod_ids) and provision_issues.resolved_date is null left outer join frontier.reconciliations left outer join frontier.reconciliation_cancels on reconciliations.rec_id = reconciliation_cancels.rec_id on reconciliations.account_product_id = any(acct_product_data_requirements_details.acct_prod_ids) on order_details.acct_id = acct_product_data_requirements_details.acct_id where verification_success.checked_out is null group by order_details.tpv_success_id, order_details.tpv_id, order_details.ver_code, order_details.app_id, order_details.acct_id, order_details.first_name || ' ' || order_details.last_name, order_details.order_date, verification_success.checked_out, order_details.csr_name, order_details.products, order_details.promotion, num_prods, order_details.division_id, order_details.has_dish_billing_info having ( count(case when reconciliations.rec_id is not null and reconciliation_cancels.rec_cancel_id is null then reconciliations.rec_id end ) < (select array_upper(acct_product_data_requirements_details.acct_prod_ids, 1)) ) and ( count(distinct case when provision_issues.account_product_id is not null and provision_issues.resolved_date is null then provision_issues.account_product_id end ) + count(case when reconciliations.rec_id is not null and reconciliation_cancels.rec_cancel_id is null then reconciliations.rec_id end ) + count(case when reconciliation_cancels.rec_cancel_id is not null then reconciliation_cancels.rec_cancel_id end ) ) < (select array_upper(acct_product_data_requirements_details.acct_prod_ids, 1)) and order_details.division_id = any('{79,38,70,66,35,40,37,36,67,41,65,39}') --this array here varies. indexes are present for the different variations order by order_details.order_date here is the execution plan: Sort (cost=1350779962.18..1350969875.28 rows=75965240 width=290) (actual time=16591.711..16591.723 rows=110 loops=1) Sort Key: frontier.order_details.order_date -> GroupAggregate (cost=1235567017.53..1295217399.34 rows=75965240 width=290) (actual time=16583.383..16591.420 rows=110 loops=1) Filter: ((count(CASE WHEN ((rec_id IS NOT NULL) AND (rec_cancel_id IS NULL)) THEN rec_id ELSE NULL::integer END) < (subplan)) AND (((count(DISTINCT CASE WHEN ((account_product_id IS NOT NULL) AND (resolved_date IS NULL)) THEN account_product_id ELSE NULL::integer END) + count(CASE WHEN ((rec_id IS NOT NULL) AND (rec_cancel_id IS NULL)) THEN rec_id ELSE NULL::integer END)) + count(CASE WHEN (rec_cancel_id IS NOT NULL) THEN rec_cancel_id ELSE NULL::integer END)) < (subplan))) -> Sort (cost=1235567017.53..1238002161.29 rows=974057502 width=290) (actual time=16576.997..16577.513 rows=3366 loops=1) Sort Key: frontier.order_details.tpv_success_id, frontier.order_details.tpv_id, frontier.order_details.ver_code, frontier.order_details.app_id, frontier.order_details.acct_id, (((frontier.order_details.first_name)::text || ' '::text) || (frontier.order_details.last_name)::text), frontier.order_details.order_date, verification_success.checked_out, frontier.order_details.csr_name, frontier.order_details.products, frontier.order_details.promotion, (subplan), frontier.order_details.division_id, frontier.order_details.has_dish_billing_info -> Merge Join (cost=100001383.41..310142000.26 rows=974057502 width=290) (actual time=1055.584..16560.634 rows=3366 loops=1) Merge Cond: ("outer".acct_id = "inner".acct_id) -> Nested Loop Left Join (cost=328.94..173666048.48 rows=1928122302 width=53) (actual time=0.236..16499.771 rows=7192 loops=1) Join Filter: ("inner".account_product_id = ANY ("outer".acct_prod_ids)) -> Nested Loop Left Join (cost=2.21..134714.57 rows=564852 width=45) (actual time=0.215..1021.209 rows=5523 loops=1) Join Filter: ("inner".account_product_id = ANY ("outer".acct_prod_ids)) -> Index Scan using "FKI_acct_id" on acct_product_data_requirements_details (cost=0.00..488.19 rows=5484 width=33) (actual time=0.011..15.502 rows=5484 loops=1) -> Bitmap Heap Scan on provision_issues (cost=2.21..17.27 rows=206 width=12) (actual time=0.035..0.106 rows=206 loops=5484) Recheck Cond: (resolved_date IS NULL) -> Bitmap Index Scan on "IX_resolved_date_null" (cost=0.00..2.21 rows=206 width=0) (actual time=0.032..0.032 rows=206 loops=5484) -> Materialize (cost=326.74..395.01 rows=6827 width=12) (actual time=0.000..0.852 rows=6827 loops=5523) -> Merge Left Join (cost=0.00..319.91 rows=6827 width=12) (actual time=0.016..13.426 rows=6827 loops=1) Merge Cond: ("outer".rec_id = "inner".rec_id) -> Index Scan using reconciliation_pkey on reconciliations (cost=0.00..215.59 rows=6827 width=8) (actual time=0.004..4.209 rows=6827 loops=1) -> Index Scan using "FKI_rec_id" on reconciliation_cancels (cost=0.00..56.80 rows=2436 width=8) (actual time=0.004..1.534 rows=2436 loops=1) -> Sort (cost=100001054.46..100001061.39 rows=2770 width=241) (actual time=18.984..19.937 rows=3366 loops=1) Sort Key: frontier.order_details.acct_id -> Hash Join (cost=100000131.90..100000896.08 rows=2770 width=241) (actual time=6.525..13.644 rows=2459 loops=1) Hash Cond: ("outer".tpv_success_id = "inner".tpv_success_id) -> Append (cost=100000000.00..100000694.84 rows=2776 width=199) (actual time=0.092..4.627 rows=2459 loops=1) -> Seq Scan on order_details (cost=100000000.00..100000012.45 rows=35 width=199) (actual time=0.001..0.001 rows=0 loops=1) Filter: (division_id = ANY ('{79,38,70,66,35,40,37,36,67,41,65,39}'::integer[])) -> Bitmap Heap Scan on order_details_august_2007 order_details (cost=2.33..88.63 rows=380 width=158) (actual time=0.089..0.557 rows=330 loops=1) Recheck Cond: (division_id = ANY ('{79,38,70,66,35,40,37,36,67,41,65,39}'::integer[])) -> Bitmap Index Scan on "IX_august_2007_provision_list2" (cost=0.00..2.33 rows=380 width=0) (actual time=0.075..0.075 rows=330 loops=1) -> Bitmap Heap Scan on order_details_july_2007 order_details (cost=2.31..71.39 rows=288 width=159) (actual time=0.082..0.521 rows=314 loops=1) Recheck Cond: (division_id = ANY ('{79,38,70,66,35,40,37,36,67,41,65,39}'::integer[])) -> Bitmap Index Scan on "IX_july_2007_provision_list2" (cost=0.00..2.31 rows=288 width=0) (actual time=0.069..0.069 rows=314 loops=1) -> Bitmap Heap Scan on order_details_june_2007 order_details (cost=2.05..71.82 rows=279 width=148) (actual time=0.029..0.106 rows=51 loops=1) Recheck Cond: (division_id = ANY ('{79,38,70,66,35,40,37,36,67,41,65,39}'::integer[])) -> Bitmap Index Scan on "IX_june_2007_provision_list2" (cost=0.00..2.05 rows=279 width=0) (actual time=0.022..0.022 rows=51 loops=1) -> Bitmap Heap Scan on order_details_october_2007 order_details (cost=7.24..279.34 rows=1060 width=159) (actual time=0.285..2.035 rows=1244 loops=1) Recheck Cond: (division_id = ANY ('{79,38,70,66,35,40,37,36,67,41,65,39}'::integer[])) -> Bitmap Index Scan on "IX_october_2007_provision_list2" (cost=0.00..7.24 rows=1060 width=0) (actual time=0.239..0.239 rows=1244 loops=1) -> Bitmap Heap Scan on order_details_september_2007 order_details (cost=4.52..171.21 rows=734 width=150) (actual time=0.130..0.856 rows=520 loops=1) Recheck Cond: (division_id = ANY ('{79,38,70,66,35,40,37,36,67,41,65,39}'::integer[])) -> Bitmap Index Scan on "IX_september_2007_provision_list2" (cost=0.00..4.52 rows=734 width=0) (actual time=0.110..0.110 rows=520 loops=1) -> Hash (cost=118.21..118.21 rows=5473 width=8) (actual time=6.414..6.414 rows=5484 loops=1) -> Bitmap Heap Scan on verification_success (cost=22.48..118.21 rows=5473 width=8) (actual time=0.731..3.311 rows=5484 loops=1) Recheck Cond: (checked_out IS NULL) -> Bitmap Index Scan on "IX_checked_out_isnull" (cost=0.00..22.48 rows=5473 width=0) (actual time=0.722..0.722 rows=5485 loops=1) SubPlan -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.001..0.001 rows=1 loops=3366) SubPlan -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=780) -> Result (cost=0.00..0.01 rows=1 width=0) (actual time=0.000..0.000 rows=1 loops=2459) Total runtime: 16593.353 ms I have attached an erd of the tables used in this query. If it is stripped out it can be viewed here: http://www.ketema.net/provision_list_tables_erd.jpg My concern is with the sort step that takes 15 seconds by itself: -> Sort (cost=1235567017.53..1238002161.29 rows=974057502 width=290) (actual time=16576.997..16577.513 rows=3366 loops=1) Sort Key: frontier.order_details.tpv_success_id, frontier.order_details.tpv_id, frontier.order_details.ver_code, frontier.order_details.app_id, frontier.order_details.acct_id, (((frontier.order_details.first_name)::text || ' '::text) || (frontier.order_details.last_name)::text), frontier.order_details.order_date, verification_success.checked_out, frontier.order_details.csr_name, frontier.order_details.products, frontier.order_details.promotion, (subplan), frontier.order_details.division_id, frontier.order_details.has_dish_billing_info I believe this is due to the aggregate done in the select clause as well as the sub select: (select array_upper(acct_product_data_requirements_details.acct_prod_ids, 1)) as num_prods, count(distinct case when provision_issues.account_product_id is not null and provision_issues.resolved_date is null then provision_issues.account_product_id end ) as num_open_issues, count(case when reconciliations.rec_id is not null and reconciliation_cancels.rec_cancel_id is null then reconciliations.rec_id end ) as num_provisioned, count(case when reconciliation_cancels.rec_cancel_id is not null then reconciliation_cancels.rec_cancel_id end ) as num_canceled I believe the counts and sub select cause the sort to be performed multiple times? How can I improve this step? Things I have thought about: 1)Creating indexes on the aggregates...Found out this can't be done. 2)Create Views of the counts and the sub select...is this any faster as the view is executed at run time anyway? 3)Create actual tables of the sub select and aggregates...How would this be maintained to ensure it was always accurate? 4)Increasing hardware resources. Currently box is on a single processor amd64 with 8Gb of RAM. below are the settings for resource usage. shared_buffers = 65536 temp_buffers = 5000 max_prepared_transactions = 2000 work_mem = 131072 maintenance_work_mem = 512000 max_stack_depth = 7168 max_fsm_pages = 160000 max_fsm_relations = 4000 The only function of this box if for Pg, so I do not mind it using every last drop of ram and resources that it can. 5)Upgrade version of pg..currently is running 8.1.4 Would appreciate any suggestions. Thanks http://pgsql.privatepaste.com/7ffDdPQvIN
pgsql-performance by date: