Re: Very poor performance with Nested Loop Anti Join - Mailing list pgsql-performance

From Andreas Joseph Krogh
Subject Re: Very poor performance with Nested Loop Anti Join
Date
Msg-id VisenaEmail.36.dfb69f223c50e31c.1564a108885@tc7-visena
Whole thread Raw
In response to Re: Very poor performance with Nested Loop Anti Join  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
På tirsdag 02. august 2016 kl. 01:15:05, skrev Tom Lane <tgl@sss.pgh.pa.us>:
Andreas Joseph Krogh <andreas@visena.com> writes:
> This query performs terribly slow (~26 minutes,��1561346.597ms):

Seems like the key misestimation is on the inner antijoin:

>                ->  Hash Anti Join  (cost=654.21..4008.72 rows=1 width=8) (actual time=9.016..40.672 rows=76174 loops=1)
>                      Hash Cond: (il.invoice_id = creditnote.credit_against)
>                      ->  Seq Scan on onp_crm_invoice_line il  (cost=0.00..3062.01 rows=78001 width=8) (actual time=0.005..11.259 rows=78614 loops=1)
>                      ->  Hash  (cost=510.56..510.56 rows=11492 width=8) (actual time=8.940..8.940 rows=372 loops=1)
>                            Buckets: 16384  Batches: 1  Memory Usage: 143kB
>                            ->  Seq Scan on onp_crm_invoice creditnote  (cost=0.00..510.56 rows=11492 width=8) (actual time=0.014..7.882 rows=11507 loops=1)
>                                  Filter: ((sent_date <= '2016-06-27'::date) AND ((status_key)::text = 'INVOICE_STATUS_INVOICED'::text))
>                                  Rows Removed by Filter: 149

If it realized that this produces 78k rows not 1, it'd likely do something
smarter at the outer antijoin.

I have no idea why that estimate's so far off though.  What PG version is
this?  Stats all up to date on these two tables?
 
Sorry for not providing PG-version, this is on 9.5.3.
All stats are up to date, or should be a I've analyzed all manually.
 
Are the rows excluded
by the filter condition on "creditnote" significantly different from the
rest of that table?
 
This happens also without the filter-cond:
 
explain analyze
SELECT
      log.relation_id as company_id    , sum(log.duration) AS duration
FROM onp_crm_activity_log log
    JOIN onp_crm_person logfor ON logfor.onp_user_id = log.logged_for AND logfor.is_resource = FALSE

WHERE 1 = 1
      -- Filter out already invoiced
      AND NOT EXISTS(    SELECT * FROM onp_crm_calendarentry_invoice_membership cem        JOIN onp_crm_invoice_line il ON cem.invoice_line_id = il.id        JOIN onp_crm_invoice inv ON il.invoice_id = inv.entity_id    WHERE cem.calendar_entry_id = log.id          AND NOT EXISTS(        SELECT * FROM onp_crm_invoice creditnote        WHERE il.invoice_id = creditnote.credit_against    )
)

GROUP BY
    log.relation_id
;
 
                                                                                                                     QUERY PLAN                                                                                                                      
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=12049.35..12067.11 rows=1421 width=12) (actual time=1386683.646..1386683.858 rows=720 loops=1)   Group Key: log.relation_id   ->  Nested Loop Anti Join  (cost=512.08..12039.32 rows=2006 width=12) (actual time=395.017..1386576.756 rows=93480 loops=1)         Join Filter: (cem.calendar_entry_id = log.id)         Rows Removed by Join Filter: 12185913244
         ->  Hash Join  (cost=86.56..9757.61 rows=2006 width=20) (actual time=0.165..366.778 rows=181872 loops=1)               Hash Cond: (log.logged_for = logfor.onp_user_id)               ->  Seq Scan on onp_crm_activity_log log  (cost=0.00..8957.45 rows=184945 width=24) (actual time=0.003..256.862 rows=182606 loops=1)               ->  Hash  (cost=39.46..39.46 rows=3768 width=8) (actual time=0.132..0.132 rows=36 loops=1)                     Buckets: 4096  Batches: 1  Memory Usage: 34kB                     ->  Bitmap Heap Scan on onp_crm_person logfor  (cost=3.69..39.46 rows=3768 width=8) (actual time=0.033..0.125 rows=36 loops=1)                           Recheck Cond: (onp_user_id IS NOT NULL)                           Filter: (NOT is_resource)                           Rows Removed by Filter: 5
                           Heap Blocks: exact=10
                           ->  Bitmap Index Scan on onp_crm_person_onp_id_idx  (cost=0.00..2.75 rows=41 width=0) (actual time=0.017..0.017 rows=41 loops=1)         ->  Materialize  (cost=425.53..2251.62 rows=1 width=4) (actual time=0.000..2.544 rows=67003 loops=181872)               ->  Nested Loop  (cost=425.53..2251.61 rows=1 width=4) (actual time=3.283..320.057 rows=88511 loops=1)                     ->  Nested Loop  (cost=425.24..2251.30 rows=1 width=8) (actual time=3.241..154.783 rows=88511 loops=1)                           ->  Hash Anti Join  (cost=424.95..2250.75 rows=1 width=8) (actual time=3.110..30.097 rows=76281 loops=1)                                 Hash Cond: (il.invoice_id = creditnote.credit_against)                                 ->  Index Only Scan using origo_invoice_line_id_invoice_idx on onp_crm_invoice_line il  (cost=0.29..1530.95 rows=78707 width=8) (actual time=0.030..13.719 rows=78740 loops=1)                                       Heap Fetches: 2967
                                 ->  Hash  (cost=278.22..278.22 rows=11715 width=8) (actual time=3.003..3.003 rows=376 loops=1)                                       Buckets: 16384  Batches: 1  Memory Usage: 143kB                                       ->  Index Only Scan using origo_invoice_credit_against_idx on onp_crm_invoice creditnote  (cost=0.29..278.22 rows=11715 width=8) (actual time=0.042..2.082 rows=11692 loops=1)                                             Heap Fetches: 1151
                           ->  Index Only Scan using onp_crm_calendarentry_invoice_invoice_line_id_calendar_entr_key on onp_crm_calendarentry_invoice_membership cem  (cost=0.29..0.45 rows=9 width=8) (actual time=0.001..0.001 rows=1 loops=76281)                                 Index Cond: (invoice_line_id = il.id)                                 Heap Fetches: 4753
                     ->  Index Only Scan using onp_crm_invoice_pkey on onp_crm_invoice inv  (cost=0.29..0.30 rows=1 width=8) (actual time=0.001..0.001 rows=1 loops=88511)                           Index Cond: (entity_id = il.invoice_id)                           Heap Fetches: 12084
 Planning time: 5.824 ms Execution time: 1386686.664 ms
(35 rows)
 
With set enable_nestloop to off;
 
                                                                                                            QUERY PLAN                                                                                                            
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 HashAggregate  (cost=13975.70..13993.46 rows=1421 width=12) (actual time=338.185..338.341 rows=720 loops=1)   Group Key: log.relation_id   ->  Hash Anti Join  (cost=4265.19..13965.66 rows=2007 width=12) (actual time=147.696..318.314 rows=93511 loops=1)         Hash Cond: (log.id = cem.calendar_entry_id)         ->  Hash Join  (cost=86.56..9761.69 rows=2007 width=20) (actual time=0.166..127.604 rows=181915 loops=1)               Hash Cond: (log.logged_for = logfor.onp_user_id)               ->  Seq Scan on onp_crm_activity_log log  (cost=0.00..8961.23 rows=185023 width=24) (actual time=0.006..84.093 rows=182649 loops=1)               ->  Hash  (cost=39.46..39.46 rows=3768 width=8) (actual time=0.123..0.123 rows=36 loops=1)                     Buckets: 4096  Batches: 1  Memory Usage: 34kB                     ->  Bitmap Heap Scan on onp_crm_person logfor  (cost=3.69..39.46 rows=3768 width=8) (actual time=0.038..0.102 rows=36 loops=1)                           Recheck Cond: (onp_user_id IS NOT NULL)                           Filter: (NOT is_resource)                           Rows Removed by Filter: 5
                           Heap Blocks: exact=10
                           ->  Bitmap Index Scan on onp_crm_person_onp_id_idx  (cost=0.00..2.75 rows=41 width=0) (actual time=0.019..0.019 rows=41 loops=1)         ->  Hash  (cost=4178.62..4178.62 rows=1 width=4) (actual time=147.497..147.497 rows=88523 loops=1)               Buckets: 131072 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 4137kB               ->  Hash Join  (cost=2553.20..4178.62 rows=1 width=4) (actual time=98.512..133.017 rows=88523 loops=1)                     Hash Cond: (cem.invoice_line_id = il.id)                     ->  Seq Scan on onp_crm_calendarentry_invoice_membership cem  (cost=0.00..1290.66 rows=89266 width=8) (actual time=0.006..11.151 rows=89175 loops=1)                     ->  Hash  (cost=2553.19..2553.19 rows=1 width=4) (actual time=98.481..98.481 rows=76286 loops=1)                           Buckets: 131072 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 3706kB                           ->  Merge Join  (cost=2252.01..2553.19 rows=1 width=4) (actual time=50.922..87.641 rows=76286 loops=1)                                 Merge Cond: (il.invoice_id = inv.entity_id)                                 ->  Sort  (cost=2251.73..2251.73 rows=1 width=8) (actual time=50.872..55.552 rows=76286 loops=1)                                       Sort Key: il.invoice_id                                       Sort Method: quicksort  Memory: 6648kB                                       ->  Hash Anti Join  (cost=425.91..2251.72 rows=1 width=8) (actual time=5.904..35.979 rows=76286 loops=1)                                             Hash Cond: (il.invoice_id = creditnote.credit_against)                                             ->  Index Only Scan using origo_invoice_line_id_invoice_idx on onp_crm_invoice_line il  (cost=0.29..1530.95 rows=78707 width=8) (actual time=0.028..16.124 rows=78745 loops=1)                                                   Heap Fetches: 2972
                                             ->  Hash  (cost=278.74..278.74 rows=11750 width=8) (actual time=5.792..5.792 rows=376 loops=1)                                                   Buckets: 16384  Batches: 1  Memory Usage: 143kB                                                   ->  Index Only Scan using origo_invoice_credit_against_idx on onp_crm_invoice creditnote  (cost=0.29..278.74 rows=11750 width=8) (actual time=0.067..4.466 rows=11694 loops=1)                                                         Heap Fetches: 1155
                                 ->  Index Only Scan using onp_crm_invoice_pkey on onp_crm_invoice inv  (cost=0.29..272.09 rows=11750 width=8) (actual time=0.040..10.755 rows=76661 loops=1)                                       Heap Fetches: 3840
 Planning time: 3.762 ms Execution time: 339.634 ms
(39 rows)
 
 
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Attachment

pgsql-performance by date:

Previous
From: Om Prakash Jaiswal
Date:
Subject: Create language plperlu Error
Next
From: jonescam
Date:
Subject: Need Beta Users for New Database Monitoring Solution!