Thread: BUG #14948: cost overflow

BUG #14948: cost overflow

From
jasc@gmx.net
Date:
The following bug has been logged on the website:

Bug reference:      14948
Logged by:          Jan Schulz
Email address:      jasc@gmx.net
PostgreSQL version: 10.1
Operating system:   Linux (Ubuntu 16.04 LTS + HWE)
Description:

Hello,

I was asked on twitter to post this issue to the bugs list. 

We have a server which reports negative costs in a query plan:

   GroupAggregate
(cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
rows=1600000000 width=254)

On one system it looks funny, on other systems it looks normal (see query
plans for the same query on different systems/environments).

The background for this is a server getting killed by OOM killer and we are
currently debugging this: we have work_mem set to 2GB and since a few days
we have crashes with these settings which we haven't yet pin to a root cause
-> this might be related to these crashes or it might be a result of this
crashes, but it looks funny in any case.

Thanks,

Jan

Full EXPLAIN (ANALYSE BUFFERS) on the funny looking system:

GroupAggregate
(cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
rows=1600000000 width=254) (actual time=1.640..1.640 rows=0 loops=1)
  Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
('-1'::smallint), (('-1'::smallint)::integer)
  Buffers: shared hit=11
  ->  Sort
(cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
rows=16812456500 width=120) (actual time=1.638..1.638 rows=0 loops=1)
        Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
('-1'::smallint), (('-1'::smallint)::integer)
        Sort Method: quicksort  Memory: 25kB
        Buffers: shared hit=11
        ->  Append
(cost=329985.38..-25142102027752040829944596389028691968.00 rows=16812456500
width=120) (actual time=1.621..1.621 rows=0 loops=1)
              Buffers: shared hit=11
              ->  Subquery Scan on "*SELECT* 1"  (cost=329985.38..462708.80
rows=3537400 width=120) (actual time=0.077..0.077 rows=0 loops=1)
                    Buffers: shared hit=6
                    ->  Merge Right Join  (cost=329985.38..418491.30
rows=3537400 width=118) (actual time=0.076..0.076 rows=0 loops=1)
                          Merge Cond: (cm.campaign_name =
ev.campaign_name)
                          Buffers: shared hit=6
                          ->  Index Scan using campaign__campaign_name on
campaign cm  (cost=0.15..67.80 rows=1310 width=34) (actual time=0.015..0.015
rows=1 loops=1)
                                Buffers: shared hit=2
                          ->  Sort  (cost=329985.23..331335.38 rows=540061
width=74) (actual time=0.058..0.058 rows=0 loops=1)
                                Sort Key: ev.campaign_name
                                Sort Method: quicksort  Memory: 25kB
                                Buffers: shared hit=4
                                ->  Merge Right Join
(cost=270456.49..278563.96 rows=540061 width=74) (actual time=0.049..0.049
rows=0 loops=1)
                                      Merge Cond: (et.event_type_name =
ev.event_type)
                                      Buffers: shared hit=4
                                      ->  Sort  (cost=90.93..94.20 rows=1310
width=32) (actual time=0.025..0.025 rows=1 loops=1)
                                            Sort Key: et.event_type_name
                                            Sort Method: quicksort  Memory:
25kB
                                            Buffers: shared hit=1
                                            ->  Seq Scan on event_type et
(cost=0.00..23.10 rows=1310 width=32) (actual time=0.006..0.007 rows=5
loops=1)
                                                  Buffers: shared hit=1
                                      ->  Sort  (cost=270365.57..270571.70
rows=82452 width=74) (actual time=0.023..0.023 rows=0 loops=1)
                                            Sort Key: ev.event_type
                                            Sort Method: quicksort  Memory:
25kB
                                            Buffers: shared hit=3
                                            ->  Hash Left Join
(cost=40461.31..263632.84 rows=82452 width=74) (actual time=0.014..0.014
rows=0 loops=1)
                                                  Hash Cond:
(ev.click_subscriber_key = csu.subscriber_name)
                                                  Buffers: shared hit=3
                                                  ->  Hash Left Join
(cost=21328.11..243458.68 rows=82452 width=102) (actual time=0.013..0.013
rows=0 loops=1)
                                                        Hash Cond:
(ev.open_subscriber_key = osu.subscriber_name)
                                                        Buffers: shared
hit=3
                                                        ->  Hash Left Join
(cost=2194.91..223284.53 rows=82452 width=130) (actual time=0.012..0.012
rows=0 loops=1)
                                                              Hash Cond:
(ev.sendid = si.sendid_name)
                                                              Buffers:
shared hit=3
                                                              ->  Bitmap
Heap Scan on event ev  (cost=2155.44..222204.65 rows=82452 width=160)
(actual time=0.012..0.012 rows=0 loops=1)
                                                                    Recheck
Cond: (event_day_fk = 20171120)
                                                                    Buffers:
shared hit=3
                                                                    ->
Bitmap Index Scan on event__event_day_fk  (cost=0.00..2134.82 rows=82452
width=0) (actual time=0.009..0.009 rows=0 loops=1)

Index Cond: (event_day_fk = 20171120)

Buffers: shared hit=3
                                                              ->  Hash
(cost=23.10..23.10 rows=1310 width=34) (never executed)
                                                                    ->  Seq
Scan on sendid si  (cost=0.00..23.10 rows=1310 width=34) (never executed)
                                                        ->  Hash
(cost=11259.20..11259.20 rows=629920 width=36) (never executed)
                                                              ->  Seq Scan
on subscriber osu  (cost=0.00..11259.20 rows=629920 width=36) (never
executed)
                                                  ->  Hash
(cost=11259.20..11259.20 rows=629920 width=36) (never executed)
                                                        ->  Seq Scan on
subscriber csu  (cost=0.00..11259.20 rows=629920 width=36) (never
executed)
              ->  Merge Left Join
(cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
rows=16808919100 width=120) (actual time=1.543..1.543 rows=0 loops=1)
                    Merge Cond: (t.customer_fk = cu.customer_id)
                    Buffers: shared hit=5
                    ->  Sort
(cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
rows=16808919100 width=82) (actual time=1.543..1.543 rows=0 loops=1)
                          Sort Key: t.customer_fk
                          Sort Method: quicksort  Memory: 25kB
                          Buffers: shared hit=5
                          ->  Nested Loop
(cost=-2021924571963149.75..-25142102027752040829944596389028691968.00
rows=16808919100 width=82) (actual time=1.537..1.537 rows=0 loops=1)
                                Join Filter: (si_1.sendid_name =
ev_1.sendid)
                                Buffers: shared hit=5
                                ->  HashAggregate
(cost=239661.55..242267.55 rows=260600 width=34) (actual time=1.536..1.536
rows=0 loops=1)
                                      Group Key: ev_1.sendid,
cm_1.campaign_id
                                      Buffers: shared hit=5
                                      ->  Merge Join
(cost=228900.82..236989.10 rows=534491 width=34) (actual time=0.016..0.016
rows=0 loops=1)
                                            Merge Cond: (cm_1.campaign_name
= ev_1.campaign_name)
                                            Buffers: shared hit=5
                                            ->  Index Scan using
campaign__campaign_name on campaign cm_1  (cost=0.15..67.80 rows=1303
width=34) (actual time=0.007..0.007 rows=1 loops=1)
                                                  Filter: (campaign_id IS
NOT NULL)
                                                  Buffers: shared hit=2
                                            ->  Sort
(cost=228900.67..229105.77 rows=82040 width=64) (actual time=0.008..0.008
rows=0 loops=1)
                                                  Sort Key:
ev_1.campaign_name
                                                  Sort Method: quicksort
Memory: 25kB
                                                  Buffers: shared hit=3
                                                  ->  Bitmap Heap Scan on
event ev_1  (cost=2155.34..222204.55 rows=82040 width=64) (actual
time=0.002..0.002 rows=0 loops=1)
                                                        Recheck Cond:
(event_day_fk = 20171120)
                                                        Filter: (sendid IS
NOT NULL)
                                                        Buffers: shared
hit=3
                                                        ->  Bitmap Index
Scan on event__event_day_fk  (cost=0.00..2134.82 rows=82452 width=0) (actual
time=0.002..0.002 rows=0 loops=1)
                                                              Index Cond:
(event_day_fk = 20171120)
                                                              Buffers:
shared hit=3
                                ->  Nested Loop
(cost=-2021924572202811.25..-96477751449547350295223172333568.00
rows=84496365 width=144) (never executed)
                                      Join Filter: (ad.ad_group_name =
si_1.sendid_name)
                                      ->  Seq Scan on sendid si_1
(cost=0.00..23.10 rows=1303 width=34) (never executed)
                                            Filter: (sendid_id IS NOT
NULL)
                                      ->  Nested Loop
(cost=-2021924572202811.25..-74042786991210548153116786688.00 rows=84929134
width=110) (never executed)
                                            Join Filter: (t.ad_fk =
ad.ad_id)
                                            ->  Index Scan using
ad__ad_group_name on ad  (cost=0.42..1415932.35 rows=2157 width=40) (never
executed)
                                                  Filter: (channel_name =
'Email'::text)
                                            ->  Nested Loop Left Join
(cost=-2021924572202811.75..-34326744084937666832891904.00 rows=16977262428
width=86) (never executed)
                                                  ->  Merge Left Join
(cost=767470.10..256025144.48 rows=16977262428 width=92) (never executed)
                                                        Merge Cond:
(t.touchpoint_id = rp.touchpoint_fk)
                                                        ->  Merge Left Join
(cost=595487.90..1108803.95 rows=34168763 width=68) (never executed)
                                                              Merge Cond:
(t.touchpoint_id = ap.touchpoint_fk)
                                                              ->  Sort
(cost=509005.69..509397.99 rows=156921 width=44) (never executed)
                                                                    Sort
Key: t.touchpoint_id
                                                                    ->
Append  (cost=0.00..495463.66 rows=156921 width=44) (never executed)
                                                                          ->
 Seq Scan on touchpoint t  (cost=0.00..0.00 rows=1 width=44) (never
executed)

   Filter: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_0 t_1  (cost=353.49..33026.70 rows=10460
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_0__day_fk  (cost=0.00..350.88
rows=10460 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_1 t_2  (cost=353.31..32951.98 rows=10436
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_1__day_fk  (cost=0.00..350.70
rows=10436 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_2 t_3  (cost=353.51..33033.17 rows=10462
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_2__day_fk  (cost=0.00..350.89
rows=10462 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_3 t_4  (cost=353.21..32910.10 rows=10423
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_3__day_fk  (cost=0.00..350.60
rows=10423 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_4 t_5  (cost=353.62..33075.49 rows=10476
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_4__day_fk  (cost=0.00..351.00
rows=10476 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_5 t_6  (cost=358.10..33273.80 rows=10538
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_5__day_fk  (cost=0.00..355.46
rows=10538 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_6 t_7  (cost=353.21..32913.40 rows=10424
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_6__day_fk  (cost=0.00..350.61
rows=10424 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_7 t_8  (cost=353.48..33023.38 rows=10459
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_7__day_fk  (cost=0.00..350.87
rows=10459 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_8 t_9  (cost=353.53..33042.77 rows=10465
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_8__day_fk  (cost=0.00..350.92
rows=10465 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_9 t_10  (cost=353.66..33094.94 rows=10482
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_9__day_fk  (cost=0.00..351.04
rows=10482 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_10 t_11  (cost=353.35..32968.57 rows=10442
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_10__day_fk  (cost=0.00..350.74
rows=10442 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_11 t_12  (cost=353.66..33094.59 rows=10482
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_11__day_fk  (cost=0.00..351.04
rows=10482 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_12 t_13  (cost=357.94..33211.98 rows=10518
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_12__day_fk  (cost=0.00..355.31
rows=10518 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_13 t_14  (cost=352.93..32799.96 rows=10388
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_13__day_fk  (cost=0.00..350.34
rows=10388 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                                          ->
 Bitmap Heap Scan on touchpoint_14 t_15  (cost=353.53..33042.83 rows=10465
width=44) (never executed)

   Recheck Cond: (day_fk = 20171120)

   ->  Bitmap Index Scan on touchpoint_14__day_fk  (cost=0.00..350.92
rows=10465 width=0) (never executed)

         Index Cond: (day_fk = 20171120)
                                                              ->  Sort
(cost=86482.21..86591.08 rows=43549 width=32) (never executed)
                                                                    Sort
Key: ap.touchpoint_fk
                                                                    ->
Append  (cost=0.00..83126.68 rows=43549 width=32) (never executed)
                                                                          ->
 Seq Scan on acquisition_performance ap  (cost=0.00..0.00 rows=1 width=32)
(never executed)

   Filter: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_0 ap_1  (cost=50.91..5537.62
rows=2901 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_0__performance_attribution_model_fk
(cost=0.00..50.18 rows=2901 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_1 ap_2  (cost=50.95..5548.79
rows=2907 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_1__performance_attribution_model_fk
(cost=0.00..50.23 rows=2907 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_2 ap_3  (cost=50.95..5547.08
rows=2906 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_2__performance_attribution_model_fk
(cost=0.00..50.22 rows=2906 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_3 ap_4  (cost=50.87..5528.44
rows=2896 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_3__performance_attribution_model_fk
(cost=0.00..50.14 rows=2896 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_4 ap_5  (cost=50.91..5537.32
rows=2901 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_4__performance_attribution_model_fk
(cost=0.00..50.18 rows=2901 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_5 ap_6  (cost=50.95..5546.79
rows=2906 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_5__performance_attribution_model_fk
(cost=0.00..50.22 rows=2906 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_6 ap_7  (cost=50.92..5539.61
rows=2902 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_6__performance_attribution_model_fk
(cost=0.00..50.19 rows=2902 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_7 ap_8  (cost=50.88..5530.15
rows=2897 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_7__performance_attribution_model_fk
(cost=0.00..50.15 rows=2897 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_8 ap_9  (cost=50.88..5530.15
rows=2897 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_8__performance_attribution_model_fk
(cost=0.00..50.15 rows=2897 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_9 ap_10  (cost=50.95..5547.08
rows=2906 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_9__performance_attribution_model_fk
(cost=0.00..50.22 rows=2906 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_10 ap_11  (cost=50.88..5530.73
rows=2897 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_10__performance_attribution_model_fk
(cost=0.00..50.15 rows=2897 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_11 ap_12  (cost=50.99..5555.96
rows=2911 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_11__performance_attribution_model_fk
(cost=0.00..50.26 rows=2911 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_12 ap_13  (cost=50.99..5558.25
rows=2912 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_12__performance_attribution_model_fk
(cost=0.00..50.27 rows=2912 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_13 ap_14  (cost=50.92..5539.91
rows=2902 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_13__performance_attribution_model_fk
(cost=0.00..50.19 rows=2902 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Heap Scan on acquisition_performance_14 ap_15  (cost=50.95..5548.79
rows=2907 width=32) (never executed)

   Recheck Cond: (performance_attribution_model_fk = 2)

   ->  Bitmap Index Scan on
acquisition_performance_14__performance_attribution_model_fk
(cost=0.00..50.23 rows=2907 width=0) (never executed)

         Index Cond: (performance_attribution_model_fk = 2)
                                                        ->  Sort
(cost=171982.20..172230.63 rows=99373 width=32) (never executed)
                                                              Sort Key:
rp.touchpoint_fk
                                                              ->  Append
(cost=0.00..163733.96 rows=99373 width=32) (never executed)
                                                                    ->  Seq
Scan on reactivation_performance rp  (cost=0.00..0.00 rows=1 width=32)
(never executed)

Filter: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_0 rp_1  (cost=111.52..10860.88
rows=6592 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_0__performance_attribution_model_fk
(cost=0.00..109.87 rows=6592 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_1 rp_2  (cost=111.11..10775.38
rows=6540 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_1__performance_attribution_model_fk
(cost=0.00..109.48 rows=6540 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_2 rp_3  (cost=111.49..10855.79
rows=6589 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_2__performance_attribution_model_fk
(cost=0.00..109.84 rows=6589 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_3 rp_4  (cost=111.30..10814.83
rows=6564 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_3__performance_attribution_model_fk
(cost=0.00..109.66 rows=6564 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_4 rp_5  (cost=111.62..10883.29
rows=6606 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_4__performance_attribution_model_fk
(cost=0.00..109.97 rows=6606 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_5 rp_6  (cost=119.65..11740.46
rows=7126 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_5__performance_attribution_model_fk
(cost=0.00..117.87 rows=7126 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_6 rp_7  (cost=110.50..10646.17
rows=6461 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_6__performance_attribution_model_fk
(cost=0.00..108.88 rows=6461 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_7 rp_8  (cost=111.42..10841.49
rows=6580 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_7__performance_attribution_model_fk
(cost=0.00..109.78 rows=6580 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_8 rp_9  (cost=111.04..10761.08
rows=6531 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_8__performance_attribution_model_fk
(cost=0.00..109.41 rows=6531 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_9 rp_10  (cost=116.87..11151.48
rows=6767 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_9__performance_attribution_model_fk
(cost=0.00..115.18 rows=6767 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_10 rp_11
(cost=110.49..10644.25 rows=6460 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_10__performance_attribution_model_fk
(cost=0.00..108.88 rows=6460 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_11 rp_12
(cost=117.28..11238.07 rows=6820 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_11__performance_attribution_model_fk
(cost=0.00..115.58 rows=6820 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_12 rp_13
(cost=117.45..11273.68 rows=6842 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_12__performance_attribution_model_fk
(cost=0.00..115.74 rows=6842 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_13 rp_14
(cost=109.69..10476.41 rows=6357 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_13__performance_attribution_model_fk
(cost=0.00..108.10 rows=6357 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                                    ->
Bitmap Heap Scan on reactivation_performance_14 rp_15
(cost=111.09..10770.70 rows=6537 width=32) (never executed)

Recheck Cond: (performance_attribution_model_fk = 2)
                                                                          ->
 Bitmap Index Scan on
reactivation_performance_14__performance_attribution_model_fk
(cost=0.00..109.45 rows=6537 width=0) (never executed)

   Index Cond: (performance_attribution_model_fk = 2)
                                                  ->  Bitmap Heap Scan on
converting_touchpoints_attribution conv
(cost=-2021924572970281.75..-2021924572970244.00 rows=1 width=10) (never
executed)
                                                        Recheck Cond:
((t.touchpoint_id = touchpoint_fk) AND (performance_attribution_model_fk =
2))
                                                        ->  BitmapAnd
(cost=-2021924572970281.75..-2021924572970281.75 rows=1850 width=0) (never
executed)
                                                              ->  Bitmap
Index Scan on converting_touchpoints_attribution__touchpoint_fk
(cost=0.00..-2021911688068401.00 rows=1850 width=0) (never executed)
                                                                    Index
Cond: (t.touchpoint_id = touchpoint_fk)
                                                              ->  Bitmap
Index Scan on
converting_touchpoints_attribution__performance_attribution_mod
(cost=0.00..-12884901880.97 rows=1850 width=0) (never executed)
                                                                    Index
Cond: (performance_attribution_model_fk = 2)
                    ->  Sort  (cost=696479.79..707910.70 rows=4572361
width=6) (never executed)
                          Sort Key: cu.customer_id
                          ->  Seq Scan on customer cu  (cost=0.00..190673.61
rows=4572361 width=6) (never executed)
Planning time: 7.154 ms
Execution time: 6.358 ms


##############
# query (which is usually run in parallel for multiple day_fks):
##############

EXPLAIN ( ANALYSE, BUFFERS )
SELECT
  -- dimensions
  20170920                              AS day_fk,

  sendID_fk                             AS sendID_fk,
  campaign_fk                           AS campaign_fk,

  region_fk                             AS region_fk,
  customer_segment_fk                   AS customer_segment_fk,

  -- touchpoint
  hll_union_agg(hll_add(hll_empty(14, 5), hll_hash_bigint(
      visitor_id))) :: hll(14, 5)       AS visitors,
  hll_union_agg(hll_add(hll_empty(14, 5), hll_hash_bigint(
      customer_fk))) :: hll(14, 5)      AS customers,

  sum(
      number_of_touchpoints)            AS number_of_touchpoints,
  sum(
      number_of_converting_touchpoints) AS
number_of_converting_touchpoints,

  sum(
      number_of_bounces)                AS number_of_bounces,
  sum(
      number_of_sent)                   AS number_of_sent,
  sum(
      number_of_open)                   AS number_of_opens,
  sum(
      number_of_clicks)                 AS number_of_clicks,
  -- this is https://github.com/citusdata/postgresql-hll
  hll_union_agg(hll_add(hll_empty(12, 5),
hll_hash_integer(click_subscriber_fk))) :: hll
  (12, 5)                               AS number_of_unique_clicks,
  hll_union_agg(hll_add(hll_empty(12, 5),
hll_hash_integer(open_subscriber_fk))) :: hll
  (12, 5)                               AS number_of_unique_opens,

  sum(
      number_of_net_orders)             AS number_of_net_orders,
  sum(
      number_of_first_net_orders)       AS number_of_first_net_orders,
  sum(
      number_of_recurring_net_orders)   AS number_of_recurring_net_orders,
  sum(
      contribution_margin_1)            AS contribution_margin_1,
  sum(
      net_revenue)                      AS net_revenue,
  sum(
      net_revenue_first_net_orders)     AS net_revenue_first_net_orders,
  sum(
      net_revenue_recurring_net_orders) AS
net_revenue_recurring_net_orders,
  sum(
      promotion_discount_value)         AS promotion_discount_value


FROM (
       -- data from salesforce
       SELECT
         si.sendID_id      AS sendID_fk,
         cm.campaign_id    AS campaign_fk,
         -1 :: SMALLINT    AS region_fk,
         -1 :: SMALLINT    AS customer_segment_fk,

         NULL              AS visitor_id,
         NULL              AS customer_fk,
         NULL              AS number_of_touchpoints,
         NULL              AS number_of_converting_touchpoints,

         CASE WHEN et.event_type_name = 'Bounce'
           THEN 1
         ELSE 0 END        AS number_of_bounces,
         CASE WHEN et.event_type_name = 'Sent'
           THEN 1
         ELSE 0 END        AS number_of_sent,
         CASE WHEN et.event_type_name = 'Open'
           THEN 1
         ELSE 0 END        AS number_of_open,
         CASE WHEN et.event_type_name = 'Click'
           THEN 1
         ELSE 0 END        AS number_of_clicks,

         csu.subscriber_id AS click_subscriber_fk,
         osu.subscriber_id AS open_subscriber_fk,

         NULL              AS number_of_net_orders,
         NULL              AS number_of_first_net_orders,
         NULL              AS number_of_recurring_net_orders,
         NULL              AS contribution_margin_1,
         NULL              AS net_revenue,
         NULL              AS net_revenue_first_net_orders,
         NULL              AS net_revenue_recurring_net_orders,
         NULL              AS promotion_discount_value

       FROM sf_tmp.event ev
         LEFT JOIN sf_dim.campaign cm
           ON ev.campaign_name = cm.campaign_name
         LEFT JOIN sf_dim.event_type et
           ON ev.event_type = et.event_type_name
         LEFT JOIN sf_dim.sendid si
           ON ev.Sendid = si.sendid_name
         LEFT JOIN sf_dim.subscriber osu
           ON ev.open_subscriber_key = osu.subscriber_name
         LEFT JOIN sf_dim.subscriber csu
           ON ev.click_subscriber_key = csu.subscriber_name

       WHERE ev.event_day_fk = 20171120

       -- data from ad performance
       UNION ALL
       SELECT
         si.sendid_id                                       AS sendID_fk,
         sendid_campaign_match.campaign_fk                  AS
campaign_fk,
         t.region_fk,
         COALESCE(cu.customer_segment_fk, -10),

         t.visitor_id,
         t.customer_fk,
         t.number_of_touchpoints,
         conv.converting_click :: INTEGER                   AS
number_of_converting_touchpoints,

         NULL                                               AS
number_of_bounces,
         NULL                                               AS
number_of_sent,
         NULL                                               AS
number_of_open,
         NULL                                               AS
number_of_clicks,

         NULL                                               AS
click_subscriber_fk,
         NULL                                               AS
open_subscriber_fk,

         coalesce(ap.number_of_first_net_orders, 0.0) +
         coalesce(rp.number_of_recurring_net_orders,
                  0.0)                                      AS
number_of_net_orders,
         ap.number_of_first_net_orders                      AS
number_of_first_net_orders,
         rp.number_of_recurring_net_orders                  AS
number_of_recurring_net_orders,
         coalesce(ap.contribution_margin_1_first_net_orders, 0.0) +
         coalesce(rp.contribution_margin_1_recurring_net_orders,
                  0.0)                                      AS
contribution_margin_1,
         coalesce(ap.net_revenue_first_net_orders, 0.0) +
         coalesce(rp.net_revenue_recurring_net_orders, 0.0) AS
net_revenue,
         ap.net_revenue_first_net_orders                    AS
net_revenue_first_net_orders,
         rp.net_revenue_recurring_net_orders                AS
net_revenue_recurring_net_orders,
         t.promotion_discount_value


       FROM m_dim.touchpoint t
         JOIN m_dim.ad
           ON ad.ad_id = t.ad_fk AND ad.channel_name = 'Email'
         LEFT JOIN sf_dim.sendid si
           ON si.Sendid_name = ad.ad_group_name
         LEFT JOIN (
                     SELECT DISTINCT
                       ev.sendid,
                       cm.campaign_id AS campaign_fk
                     FROM sf_tmp.event ev
                       LEFT JOIN sf_dim.campaign cm
                         ON ev.campaign_name = cm.campaign_name
                     WHERE EVENT_day_fk = 20171120 AND
                           ev.sendid IS NOT NULL) sendid_campaign_match
           ON sendid_campaign_match.sendid = si.sendid_name
         LEFT JOIN os_dim.customer cu
           ON cu.customer_id = t.customer_fk
         LEFT JOIN m_dim.acquisition_performance ap
           ON ap.touchpoint_fk = t.touchpoint_id
              AND ap.performance_attribution_model_fk = 2
         LEFT JOIN m_dim.reactivation_performance rp
           ON rp.touchpoint_fk = t.touchpoint_id
              AND rp.performance_attribution_model_fk = 2
         LEFT JOIN m_dim.converting_touchpoints_attribution conv
           ON t.touchpoint_id = conv.touchpoint_fk
              AND conv.performance_attribution_model_fk = 2
       WHERE t.day_fk = 20171120 AND si.sendid_id IS NOT NULL AND
campaign_fk IS NOT NULL
     ) q

GROUP BY
  sendID_fk,
  campaign_fk,
  region_fk,
  customer_segment_fk;


##############
# On a different system it looks like this (much more sane):
##############

GroupAggregate  (cost=384347.08..414413.11 rows=37349 width=254) (actual
time=5343.214..5417.604 rows=1194 loops=1)
  Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
('-1'::smallint), (('-1'::smallint)::integer)
  Buffers: shared hit=421793 read=26153, temp read=1561 written=1717
  ->  Sort  (cost=384347.08..385280.81 rows=373491 width=120) (actual
time=5343.180..5351.390 rows=55816 loops=1)
        Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
('-1'::smallint), (('-1'::smallint)::integer)
        Sort Method: external merge  Disk: 2880kB
        Buffers: shared hit=421793 read=26153, temp read=1561 written=1717
        ->  Append  (cost=140964.58..319780.25 rows=373491 width=120)
(actual time=2319.179..5287.481 rows=55816 loops=1)
              Buffers: shared hit=421787 read=26153, temp read=1201
written=1356
              ->  Subquery Scan on "*SELECT* 1"  (cost=140964.58..155107.05
rows=373160 width=120) (actual time=2319.178..2358.269 rows=51664 loops=1)
                    Buffers: shared hit=19573 read=20783, temp read=1201
written=1356
                    ->  Merge Right Join  (cost=140964.58..150442.55
rows=373160 width=118) (actual time=2319.176..2350.994 rows=51664 loops=1)
                          Merge Cond: (et.event_type_name = ev.event_type)
                          Buffers: shared hit=19573 read=20783, temp
read=1201 written=1356
                          ->  Sort  (cost=90.93..94.20 rows=1310 width=32)
(actual time=1.088..1.090 rows=5 loops=1)
                                Sort Key: et.event_type_name
                                Sort Method: quicksort  Memory: 25kB
                                Buffers: shared read=1
                                ->  Seq Scan on event_type et
(cost=0.00..23.10 rows=1310 width=32) (actual time=1.065..1.066 rows=5
loops=1)
                                      Buffers: shared read=1
                          ->  Materialize  (cost=140873.65..141158.50
rows=56971 width=17) (actual time=2318.081..2331.993 rows=51664 loops=1)
                                Buffers: shared hit=19573 read=20782, temp
read=1201 written=1356
                                ->  Sort  (cost=140873.65..141016.08
rows=56971 width=17) (actual time=2318.076..2323.706 rows=51664 loops=1)
                                      Sort Key: ev.event_type
                                      Sort Method: external merge  Disk:
1312kB
                                      Buffers: shared hit=19573 read=20782,
temp read=1201 written=1205
                                      ->  Hash Left Join
(cost=90861.89..135705.53 rows=56971 width=17) (actual
time=1897.747..2291.303 rows=51664 loops=1)
                                            Hash Cond: (ev.campaign_name =
cm.campaign_name)
                                            Buffers: shared hit=19573
read=20782, temp read=1037 written=1040
                                            ->  Hash Left Join
(cost=90858.44..134918.72 rows=56971 width=38) (actual
time=1897.709..2277.430 rows=51664 loops=1)
                                                  Hash Cond: (ev.sendid =
si.sendid_name)
                                                  Buffers: shared hit=19572
read=20782, temp read=1037 written=1040
                                                  ->  Merge Right Join
(cost=90848.40..134190.85 rows=56971 width=42) (actual
time=1897.586..2263.942 rows=51664 loops=1)
                                                        Merge Cond:
(osu.subscriber_name = ev.open_subscriber_key)
                                                        Buffers: shared
hit=19571 read=20781, temp read=1037 written=1040
                                                        ->  Index Scan using
subscriber__subscriber_name on subscriber osu  (cost=0.43..39487.86
rows=1323362 width=36) (actual time=0.010..216.141 rows=1323194 loops=1)
                                                              Buffers:
shared hit=19562
                                                        ->  Materialize
(cost=90847.98..91132.83 rows=56971 width=70) (actual
time=1897.564..1916.819 rows=51664 loops=1)
                                                              Buffers:
shared hit=9 read=20781, temp read=1037 written=1040
                                                              ->  Sort
(cost=90847.98..90990.40 rows=56971 width=70) (actual
time=1897.561..1911.064 rows=51664 loops=1)
                                                                    Sort
Key: ev.open_subscriber_key
                                                                    Sort
Method: external merge  Disk: 4112kB
                                                                    Buffers:
shared hit=9 read=20781, temp read=1037 written=1040
                                                                    ->
Merge Right Join  (cost=41811.94..85011.85 rows=56971 width=70) (actual
time=215.181..1861.479 rows=51664 loops=1)

Merge Cond: (csu.subscriber_name = ev.click_subscriber_key)

Buffers: shared hit=9 read=20781, temp read=523 written=523
                                                                          ->
 Index Scan using subscriber__subscriber_name on subscriber csu
(cost=0.43..39487.86 rows=1323362 width=36) (actual time=0.005..1510.625
rows=1323194 loops=1)

   Buffers: shared hit=9 read=19553
                                                                          ->
 Materialize  (cost=41778.04..42062.89 rows=56971 width=98) (actual
time=214.084..225.830 rows=51664 loops=1)

   Buffers: shared read=1228, temp read=523 written=523

   ->  Sort  (cost=41778.04..41920.47 rows=56971 width=98) (actual
time=214.080..220.578 rows=51664 loops=1)

         Sort Key: ev.click_subscriber_key

         Sort Method: external sort  Disk: 4184kB

         Buffers: shared read=1228, temp read=523 written=523

         ->  Index Scan using event__event_day_fk on event ev
(cost=0.56..35495.92 rows=56971 width=98) (actual time=3.877..192.397
rows=51664 loops=1)

               Index Cond: (event_day_fk = 20171120)

               Buffers: shared read=1228
                                                  ->  Hash  (cost=5.57..5.57
rows=357 width=7) (actual time=0.103..0.103 rows=357 loops=1)
                                                        Buckets: 1024
Batches: 1  Memory Usage: 22kB
                                                        Buffers: shared
hit=1 read=1
                                                        ->  Seq Scan on
sendid si  (cost=0.00..5.57 rows=357 width=7) (actual time=0.013..0.059
rows=357 loops=1)
                                                              Buffers:
shared hit=1 read=1
                                            ->  Hash  (cost=2.09..2.09
rows=109 width=26) (actual time=0.032..0.032 rows=109 loops=1)
                                                  Buckets: 1024  Batches: 1
Memory Usage: 15kB
                                                  Buffers: shared hit=1
                                                  ->  Seq Scan on campaign
cm  (cost=0.00..2.09 rows=109 width=26) (actual time=0.007..0.013 rows=109
loops=1)
                                                        Buffers: shared
hit=1
              ->  Hash Join  (cost=37715.40..164669.88 rows=331 width=120)
(actual time=264.900..2924.332 rows=4152 loops=1)
                    Hash Cond: (ad.ad_group_name = si_1.sendid_name)
                    Buffers: shared hit=402214 read=5370
                    ->  Nested Loop Left Join  (cost=37705.37..164655.25
rows=37 width=115) (actual time=264.770..2919.386 rows=4152 loops=1)
                          Buffers: shared hit=402212 read=5370
                          ->  Nested Loop Left Join
(cost=37705.37..163946.79 rows=16 width=99) (actual time=210.937..2437.542
rows=4152 loops=1)
                                Buffers: shared hit=214029 read=5076
                                ->  Nested Loop Left Join
(cost=37705.37..163597.96 rows=8 width=75) (actual time=173.726..2018.663
rows=4152 loops=1)
                                      Buffers: shared hit=27013 read=4975
                                      ->  Nested Loop Left Join
(cost=37704.81..163579.72 rows=8 width=73) (actual time=96.881..1909.229
rows=4152 loops=1)
                                            Buffers: shared hit=9961
read=4896
                                            ->  Hash Join
(cost=37704.38..163562.73 rows=8 width=71) (actual time=96.873..1877.702
rows=4152 loops=1)
                                                  Hash Cond:
(ad.ad_group_name = sendid_campaign_match.sendid)
                                                  Buffers: shared hit=1232
read=4896
                                                  ->  Hash Join
(cost=304.31..126040.79 rows=38 width=63) (actual time=60.935..1839.130
rows=4285 loops=1)
                                                        Hash Cond: (t.ad_fk
= ad.ad_id)
                                                        Buffers: shared
hit=3 read=4896
                                                        ->  Append
(cost=0.00..124157.87 rows=125038 width=44) (actual time=48.277..1810.384
rows=123117 loops=1)
                                                              Buffers:
shared read=4855
                                                              ->  Seq Scan
on touchpoint t  (cost=0.00..0.00 rows=1 width=44) (actual time=0.002..0.002
rows=0 loops=1)
                                                                    Filter:
(day_fk = 20171120)
                                                              ->  Index Scan
using touchpoint_0__day_fk on touchpoint_0 t_1  (cost=0.43..7616.82
rows=7654 width=44) (actual time=48.273..237.069 rows=8295 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=331
                                                              ->  Index Scan
using touchpoint_1__day_fk on touchpoint_1 t_2  (cost=0.43..8099.65
rows=8152 width=44) (actual time=3.415..83.608 rows=8257 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=341
                                                              ->  Index Scan
using touchpoint_2__day_fk on touchpoint_2 t_3  (cost=0.43..8207.70
rows=8262 width=44) (actual time=3.172..49.089 rows=8168 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=325
                                                              ->  Index Scan
using touchpoint_3__day_fk on touchpoint_3 t_4  (cost=0.43..8639.32
rows=8712 width=44) (actual time=3.443..141.148 rows=8377 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=330
                                                              ->  Index Scan
using touchpoint_4__day_fk on touchpoint_4 t_5  (cost=0.43..8540.24
rows=8608 width=44) (actual time=3.038..147.760 rows=8197 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=314
                                                              ->  Index Scan
using touchpoint_5__day_fk on touchpoint_5 t_6  (cost=0.43..7897.13
rows=7940 width=44) (actual time=3.870..42.844 rows=8264 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=324
                                                              ->  Index Scan
using touchpoint_6__day_fk on touchpoint_6 t_7  (cost=0.43..8156.89
rows=8211 width=44) (actual time=3.439..144.567 rows=8194 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=328
                                                              ->  Index Scan
using touchpoint_7__day_fk on touchpoint_7 t_8  (cost=0.43..8830.66
rows=8908 width=44) (actual time=4.362..143.630 rows=8068 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=322
                                                              ->  Index Scan
using touchpoint_8__day_fk on touchpoint_8 t_9  (cost=0.43..9695.86
rows=9808 width=44) (actual time=3.337..142.971 rows=8273 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=329
                                                              ->  Index Scan
using touchpoint_9__day_fk on touchpoint_9 t_10  (cost=0.43..8124.32
rows=8179 width=44) (actual time=3.349..150.226 rows=8167 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=325
                                                              ->  Index Scan
using touchpoint_10__day_fk on touchpoint_10 t_11  (cost=0.43..8356.39
rows=8418 width=44) (actual time=4.030..39.337 rows=8233 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=325
                                                              ->  Index Scan
using touchpoint_11__day_fk on touchpoint_11 t_12  (cost=0.43..7450.94
rows=7481 width=44) (actual time=3.529..45.088 rows=8140 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=319
                                                              ->  Index Scan
using touchpoint_12__day_fk on touchpoint_12 t_13  (cost=0.43..7753.21
rows=7790 width=44) (actual time=3.935..153.074 rows=8131 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=313
                                                              ->  Index Scan
using touchpoint_13__day_fk on touchpoint_13 t_14  (cost=0.43..8589.00
rows=8660 width=44) (actual time=3.264..145.147 rows=8255 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=318
                                                              ->  Index Scan
using touchpoint_14__day_fk on touchpoint_14 t_15  (cost=0.43..8199.74
rows=8254 width=44) (actual time=2.911..135.576 rows=8098 loops=1)
                                                                    Index
Cond: (day_fk = 20171120)
                                                                    Buffers:
shared read=311
                                                        ->  Hash
(cost=300.40..300.40 rows=313 width=35) (actual time=11.881..11.881
rows=1145 loops=1)
                                                              Buckets: 2048
(originally 1024)  Batches: 1 (originally 1)  Memory Usage: 109kB
                                                              Buffers:
shared read=41
                                                              ->  Index Scan
using ad__channel_name on ad  (cost=0.42..300.40 rows=313 width=35) (actual
time=2.580..11.649 rows=1145 loops=1)
                                                                    Index
Cond: (channel_name = 'Email'::text)
                                                                    Buffers:
shared read=41
                                                  ->  Hash
(cost=37079.88..37079.88 rows=25615 width=8) (actual time=35.890..35.890
rows=115 loops=1)
                                                        Buckets: 32768
Batches: 1  Memory Usage: 261kB
                                                        Buffers: shared
hit=1229
                                                        ->  Subquery Scan on
sendid_campaign_match  (cost=36567.58..37079.88 rows=25615 width=8) (actual
time=35.713..35.857 rows=115 loops=1)
                                                              Buffers:
shared hit=1229
                                                              ->
HashAggregate  (cost=36567.58..36823.73 rows=25615 width=8) (actual
time=35.712..35.843 rows=115 loops=1)
                                                                    Group
Key: ev_1.sendid, cm_1.campaign_id
                                                                    Buffers:
shared hit=1229
                                                                    ->  Hash
Join  (cost=4.02..36282.72 rows=56971 width=8) (actual time=0.071..25.206
rows=51664 loops=1)

Hash Cond: (ev_1.campaign_name = cm_1.campaign_name)

Buffers: shared hit=1229
                                                                          ->
 Index Scan using event__event_day_fk on event ev_1  (cost=0.56..35495.92
rows=56971 width=29) (actual time=0.024..12.175 rows=51664 loops=1)

   Index Cond: (event_day_fk = 20171120)

   Filter: (sendid IS NOT NULL)

   Buffers: shared hit=1228
                                                                          ->
 Hash  (cost=2.09..2.09 rows=109 width=26) (actual time=0.038..0.038
rows=109 loops=1)

   Buckets: 1024  Batches: 1  Memory Usage: 15kB

   Buffers: shared hit=1

   ->  Seq Scan on campaign cm_1  (cost=0.00..2.09 rows=109 width=26)
(actual time=0.010..0.020 rows=109 loops=1)

         Filter: (campaign_id IS NOT NULL)

         Buffers: shared hit=1
                                            ->  Index Scan using
customer_pkey on customer cu  (cost=0.43..2.12 rows=1 width=6) (actual
time=0.007..0.007 rows=1 loops=4152)
                                                  Index Cond: (customer_id =
t.customer_fk)
                                                  Buffers: shared hit=8729
                                      ->  Index Scan using
converting_touchpoints_attribution__touchpoint_fk_performance_a on
converting_touchpoints_attribution conv  (cost=0.56..2.27 rows=1 width=10)
(actual time=0.026..0.026 rows=0 loops=4152)
                                            Index Cond: ((t.touchpoint_id =
touchpoint_fk) AND (performance_attribution_model_fk = 2))
                                            Buffers: shared hit=17052
read=79
                                ->  Append  (cost=0.00..43.44 rows=16
width=32) (actual time=0.099..0.099 rows=0 loops=4152)
                                      Buffers: shared hit=187016 read=101
                                      ->  Seq Scan on
acquisition_performance ap  (cost=0.00..0.00 rows=1 width=32) (actual
time=0.000..0.000 rows=0 loops=4152)
                                            Filter:
((performance_attribution_model_fk = 2) AND (touchpoint_fk =
t.touchpoint_id))
                                      ->  Index Scan using
acquisition_performance_0__touchpoint_fk on acquisition_performance_0 ap_1
(cost=0.42..2.90 rows=1 width=32) (actual time=0.027..0.027 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12456
read=9
                                      ->  Index Scan using
acquisition_performance_1__touchpoint_fk on acquisition_performance_1 ap_2
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12500
read=7
                                      ->  Index Scan using
acquisition_performance_2__touchpoint_fk on acquisition_performance_2 ap_3
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12451
read=6
                                      ->  Index Scan using
acquisition_performance_3__touchpoint_fk on acquisition_performance_3 ap_4
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12455
read=8
                                      ->  Index Scan using
acquisition_performance_4__touchpoint_fk on acquisition_performance_4 ap_5
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12480
read=9
                                      ->  Index Scan using
acquisition_performance_5__touchpoint_fk on acquisition_performance_5 ap_6
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Buffers: shared hit=12451
read=5
                                      ->  Index Scan using
acquisition_performance_6__touchpoint_fk on acquisition_performance_6 ap_7
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Buffers: shared hit=12516
read=6
                                      ->  Index Scan using
acquisition_performance_7__touchpoint_fk on acquisition_performance_7 ap_8
(cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12451
read=6
                                      ->  Index Scan using
acquisition_performance_8__touchpoint_fk on acquisition_performance_8 ap_9
(cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12455
read=8
                                      ->  Index Scan using
acquisition_performance_9__touchpoint_fk on acquisition_performance_9 ap_10
(cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12473
read=7
                                      ->  Index Scan using
acquisition_performance_10__touchpoint_fk on acquisition_performance_10
ap_11  (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Buffers: shared hit=12451
read=5
                                      ->  Index Scan using
acquisition_performance_11__touchpoint_fk on acquisition_performance_11
ap_12  (cost=0.42..2.90 rows=1 width=32) (actual time=0.027..0.027 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12451
read=6
                                      ->  Index Scan using
acquisition_performance_12__touchpoint_fk on acquisition_performance_12
ap_13  (cost=0.42..2.89 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12523
read=8
                                      ->  Index Scan using
acquisition_performance_13__touchpoint_fk on acquisition_performance_13
ap_14  (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Buffers: shared hit=12451
read=5
                                      ->  Index Scan using
acquisition_performance_14__touchpoint_fk on acquisition_performance_14
ap_15  (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0
loops=4152)
                                            Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                            Filter:
(performance_attribution_model_fk = 2)
                                            Rows Removed by Filter: 0
                                            Buffers: shared hit=12452
read=6
                          ->  Append  (cost=0.00..44.12 rows=16 width=32)
(actual time=0.105..0.115 rows=0 loops=4152)
                                Buffers: shared hit=188183 read=294
                                ->  Seq Scan on reactivation_performance rp
(cost=0.00..0.00 rows=1 width=32) (actual time=0.000..0.000 rows=0
loops=4152)
                                      Filter:
((performance_attribution_model_fk = 2) AND (touchpoint_fk =
t.touchpoint_id))
                                ->  Index Scan using
reactivation_performance_0__touchpoint_fk on reactivation_performance_0 rp_1
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12549 read=20
                                ->  Index Scan using
reactivation_performance_1__touchpoint_fk on reactivation_performance_1 rp_2
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12574 read=21
                                ->  Index Scan using
reactivation_performance_2__touchpoint_fk on reactivation_performance_2 rp_3
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12544 read=20
                                ->  Index Scan using
reactivation_performance_3__touchpoint_fk on reactivation_performance_3 rp_4
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.005..0.005 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12552 read=19
                                ->  Index Scan using
reactivation_performance_4__touchpoint_fk on reactivation_performance_4 rp_5
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12516 read=18
                                ->  Index Scan using
reactivation_performance_5__touchpoint_fk on reactivation_performance_5 rp_6
 (cost=0.43..2.95 rows=1 width=32) (actual time=0.028..0.028 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12562 read=21
                                ->  Index Scan using
reactivation_performance_6__touchpoint_fk on reactivation_performance_6 rp_7
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12530 read=18
                                ->  Index Scan using
reactivation_performance_7__touchpoint_fk on reactivation_performance_7 rp_8
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12535 read=19
                                ->  Index Scan using
reactivation_performance_8__touchpoint_fk on reactivation_performance_8 rp_9
 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12529 read=20
                                ->  Index Scan using
reactivation_performance_9__touchpoint_fk on reactivation_performance_9
rp_10  (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12561 read=19
                                ->  Index Scan using
reactivation_performance_10__touchpoint_fk on reactivation_performance_10
rp_11  (cost=0.43..2.94 rows=1 width=32) (actual time=0.028..0.028 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12533 read=21
                                ->  Index Scan using
reactivation_performance_11__touchpoint_fk on reactivation_performance_11
rp_12  (cost=0.43..2.95 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12591 read=20
                                ->  Index Scan using
reactivation_performance_12__touchpoint_fk on reactivation_performance_12
rp_13  (cost=0.43..2.95 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12527 read=18
                                ->  Index Scan using
reactivation_performance_13__touchpoint_fk on reactivation_performance_13
rp_14  (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12531 read=20
                                ->  Index Scan using
reactivation_performance_14__touchpoint_fk on reactivation_performance_14
rp_15  (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0
loops=4152)
                                      Index Cond: (touchpoint_fk =
t.touchpoint_id)
                                      Filter:
(performance_attribution_model_fk = 2)
                                      Rows Removed by Filter: 0
                                      Buffers: shared hit=12549 read=20
                    ->  Hash  (cost=5.57..5.57 rows=357 width=7) (actual
time=0.105..0.105 rows=357 loops=1)
                          Buckets: 1024  Batches: 1  Memory Usage: 22kB
                          Buffers: shared hit=2
                          ->  Seq Scan on sendid si_1  (cost=0.00..5.57
rows=357 width=7) (actual time=0.014..0.057 rows=357 loops=1)
                                Filter: (sendid_id IS NOT NULL)
                                Buffers: shared hit=2
Planning time: 635.005 ms
Execution time: 5423.894 ms




Re: BUG #14948: cost overflow

From
Alvaro Herrera
Date:
Hi Jan,

it seems pretty likely that these are indeed overflows.  However,
because of the whitespace mangling these plans are pretty much
impossible to read.  Would you mind pasting these again, as text/plain
attachment instead?

Does the query actually take 6ms in the system with the funny explain,
and 5s on the other?

Thanks

jasc@gmx.net wrote:

> Full EXPLAIN (ANALYSE BUFFERS) on the funny looking system:
> 
> GroupAggregate 
> (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
> rows=1600000000 width=254) (actual time=1.640..1.640 rows=0 loops=1)
>   Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
> ('-1'::smallint), (('-1'::smallint)::integer)
>   Buffers: shared hit=11
>   ->  Sort 
> (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00
> rows=16812456500 width=120) (actual time=1.638..1.638 rows=0 loops=1)
>         Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
> ('-1'::smallint), (('-1'::smallint)::integer)
>         Sort Method: quicksort  Memory: 25kB
>         Buffers: shared hit=11
>         ->  Append 
> (cost=329985.38..-25142102027752040829944596389028691968.00 rows=16812456500
> width=120) (actual time=1.621..1.621 rows=0 loops=1)
>               Buffers: shared hit=11
[...]
> Planning time: 7.154 ms
> Execution time: 6.358 ms



> ##############
> # On a different system it looks like this (much more sane):
> ##############
> 
> GroupAggregate  (cost=384347.08..414413.11 rows=37349 width=254) (actual
> time=5343.214..5417.604 rows=1194 loops=1)
>   Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
> ('-1'::smallint), (('-1'::smallint)::integer)
>   Buffers: shared hit=421793 read=26153, temp read=1561 written=1717
>   ->  Sort  (cost=384347.08..385280.81 rows=373491 width=120) (actual
> time=5343.180..5351.390 rows=55816 loops=1)
>         Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk,
> ('-1'::smallint), (('-1'::smallint)::integer)
>         Sort Method: external merge  Disk: 2880kB
>         Buffers: shared hit=421793 read=26153, temp read=1561 written=1717
>         ->  Append  (cost=140964.58..319780.25 rows=373491 width=120)
> (actual time=2319.179..5287.481 rows=55816 loops=1)
>               Buffers: shared hit=421787 read=26153, temp read=1201
> written=1356


> Planning time: 635.005 ms
> Execution time: 5423.894 ms


-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: BUG #14948: cost overflow

From
Tom Lane
Date:
jasc@gmx.net writes:
> We have a server which reports negative costs in a query plan:

The root of the weirdness seems to be here:

>           ->  Bitmap Heap Scan on converting_touchpoints_attribution conv
(cost=-2021924572970281.75..-2021924572970244.00rows=1 width=10) (never executed) 
>                 Recheck Cond: ((t.touchpoint_id = touchpoint_fk) AND (performance_attribution_model_fk = 2))
>                 ->  BitmapAnd (cost=-2021924572970281.75..-2021924572970281.75 rows=1850 width=0) (never executed)
>                       ->  Bitmap Index Scan on converting_touchpoints_attribution__touchpoint_fk
(cost=0.00..-2021911688068401.00rows=1850 width=0) (never executed) 
>                             Index Cond: (t.touchpoint_id = touchpoint_fk)
>                       ->  Bitmap Index Scan on converting_touchpoints_attribution__performance_attribution_mod
(cost=0.00..-12884901880.97rows=1850 width=0) (never executed) 
>                             Index Cond: (performance_attribution_model_fk = 2)

I'm not sure what to make of that, but it seems to be broken in more ways
than just the wacko cost estimates.  For one thing, the planner is supposed
to set up index qual conditions in the form "indexedvar op something",
and the (t.touchpoint_id = touchpoint_fk) condition appears to be
backwards.  The row estimate at the intermediate BitmapAnd node seems
to be out of line as well, why doesn't it match the estimate for the
heapscan node?

Can you get a similarly broken plan if you try something involving just
this table, say "select * from converting_touchpoints_attribution where
touchpoint_fk = 42 and performance_attribution_model_fk = 2" ?

Could we see the pg_stats rows for those two columns?  What is in
pg_class.reltuples and relpages for this table and these two indexes?

            regards, tom lane


Re: BUG #14948: cost overflow

From
Jan Schulz
Date:
Hello!

On 5 December 2017 at 16:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not sure what to make of that, but it seems to be broken in more ways
> than just the wacko cost estimates. 

We run our server with basically no data security in mind as we do regenerate all tables from the source systems each time. So the OOM kill might have broken some tables/data? m_dim wasn't regenerated since then because the regeneration of that schema triggers the OOM.

relevant postgresql.conf entries:

wal_level = minimal
# doesn't start if wal_level=minimal and this is >0
max_wal_senders = 0
fsync = off
synchronous_commit = off
full_page_writes = off
wal_buffers = -1
autovacuum_max_workers = 1

> Can you get a similarly broken plan if you try something involving just
> this table, say "select * from converting_touchpoints_attribution where
> touchpoint_fk = 42 and performance_attribution_model_fk = 2" ?

EXPLAIN ( ANALYSE, BUFFERS )
select * from m_dim.converting_touchpoints_attribution where
touchpoint_fk = 42 and performance_attribution_model_fk = 2

Bitmap Heap Scan on converting_touchpoints_attribution  
     (cost=-25769803761.69..-25769803757.67 rows=1 width=16) (actual time=0.005..0.005 rows=0 loops=1)
  Recheck Cond: ((performance_attribution_model_fk = 2) AND (touchpoint_fk = 42))
  ->  BitmapAnd  (cost=-25769803761.69..-25769803761.69 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
        ->  Bitmap Index Scan on converting_touchpoints_attribution__performance_attribution_mod  
                (cost=0.00..-12884901880.97 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=1)
              Index Cond: (performance_attribution_model_fk = 2)
        ->  Bitmap Index Scan on converting_touchpoints_attribution__touchpoint_fk  
                (cost=0.00..-12884901880.97 rows=1 width=0) (never executed)
              Index Cond: (touchpoint_fk = 42)
Planning time: 0.122 ms
Execution time: 0.036 ms

This is how it looks on a working system (generated with 2MB work_mem...)

Index Scan using converting_touchpoints_attribution__touchpoint_fk_performance_a on converting_touchpoints_attribution  
    (cost=0.56..2.58 rows=1 width=16) (actual time=2.314..2.314 rows=0 loops=1)
  Index Cond: ((touchpoint_fk = 42) AND (performance_attribution_model_fk = 2))
  Buffers: shared hit=6 read=4
Planning time: 5.371 ms
Execution time: 2.342 ms

> Could we see the pg_stats rows for those two columns?  What is in
> pg_class.reltuples and relpages for this table and these two indexes?

SELECT
  nspname || '.' ||relname,
  relpages,
  reltuples
FROM pg_class
  JOIN pg_catalog.pg_namespace n
    ON n.oid = pg_class.relnamespace
WHERE relname ILIKE 'converting_touchpoints_attributio%';

+-----------------------------------------------------------------------+----------+-----------+
| nspname+relname                                                       | relpages | reltuples |
+-----------------------------------------------------------------------+----------+-----------+
| m_dim.converting_touchpoints_attribution__touchpoint_fk_performance_a | 84153    | 24282896  |
| m_dim.converting_touchpoints_attribution___day_id_performance_attribu | 84153    | 24282896  |
| m_dim.converting_touchpoints_attribution__touchpoint_fk               | 6        | 1026      |
| m_dim.converting_touchpoints_attribution__performance_attribution_mod | 5        | 1026      |
| m_dim.converting_touchpoints_attribution                              | 131259   | 24282892  |
+-----------------------------------------------------------------------+----------+-----------+
5 rows in set.

On a good system it looks like this:

+-----------------------------------------------------------------------+----------+-----------+
| nspname+relname                                                       | relpages | reltuples |
+-----------------------------------------------------------------------+----------+-----------+
| m_dim.converting_touchpoints_attribution___day_id_performance_attribu | 84656    | 24428338  |
| m_dim.converting_touchpoints_attribution                              | 132046   | 24428338  |
| m_dim.converting_touchpoints_attribution__touchpoint_fk_performance_a | 84656    | 24428338  |
| m_dim.converting_touchpoints_attribution__touchpoint_fk               | 6        | 1032      |
| m_dim.converting_touchpoints_attribution__performance_attribution_mod | 5        | 1032      |
| m_tmp.converting_touchpoints_attribution                              | 132046   | 24428338  |
+-----------------------------------------------------------------------+----------+-----------+
6 rows in set.

-> Seems like no real difference (the data has a few days differences now + m_tmp.converting_touchpoints_attribution is an UNLOGGED table so won't survive the crash)

Best regards,

Jan
--
Jan Schulz
mail: jasc@gmx.net


>
>                         regards, tom lane

Fwd: BUG #14948: cost overflow

From
Jan Schulz
Date:
[resent to the list, sorry Alvaro]

Hello,

On 5 December 2017 at 13:39, Alvaro Herrera <alvherre@2ndquadrant.com> wrote:
> it seems pretty likely that these are indeed overflows.  However,
> because of the whitespace mangling these plans are pretty much
> impossible to read.  Would you mind pasting these again, as text/plain
> attachment instead?

I attached the plans in plain text.

> Does the query actually take 6ms in the system with the funny explain,
> and 5s on the other?

I didn't really noticed the timings, could be. I was more surprised
that the broken plan actually resulted in zero rows because it should
have had some.

Kind regards,

Jan
--
Jan Schulz
mail: jasc@gmx.net

Attachment

Re: BUG #14948: cost overflow

From
Tom Lane
Date:
Jan Schulz <jasc@gmx.net> writes:
> On 5 December 2017 at 16:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Can you get a similarly broken plan if you try something involving just
>> this table, say "select * from converting_touchpoints_attribution where
>> touchpoint_fk = 42 and performance_attribution_model_fk = 2" ?

> EXPLAIN ( ANALYSE, BUFFERS )
> select * from m_dim.converting_touchpoints_attribution where
> touchpoint_fk = 42 and performance_attribution_model_fk = 2

> Bitmap Heap Scan on converting_touchpoints_attribution
>      (cost=-25769803761.69..-25769803757.67 rows=1 width=16) (actual
> time=0.005..0.005 rows=0 loops=1)
>   Recheck Cond: ((performance_attribution_model_fk = 2) AND (touchpoint_fk
> = 42))
>   ->  BitmapAnd  (cost=-25769803761.69..-25769803761.69 rows=1 width=0)
> (actual time=0.004..0.004 rows=0 loops=1)
>         ->  Bitmap Index Scan on
> converting_touchpoints_attribution__performance_attribution_mod
>                 (cost=0.00..-12884901880.97 rows=1 width=0) (actual
> time=0.003..0.003 rows=0 loops=1)
>               Index Cond: (performance_attribution_model_fk = 2)
>         ->  Bitmap Index Scan on
> converting_touchpoints_attribution__touchpoint_fk
>                 (cost=0.00..-12884901880.97 rows=1 width=0) (never executed)
>               Index Cond: (touchpoint_fk = 42)
> Planning time: 0.122 ms
> Execution time: 0.036 ms

Hmph.  That just raises even more questions --- for instance, why is the
condition (performance_attribution_model_fk = 2) now estimated to select
just 1 row, when previously it was estimated to select 1850 rows?
Still, the fact that you can get a silly answer with just one table
does eliminate some theories I'd been toying with.

The index cost estimation code does have some potential for
garbage-in-garbage-out results, but nothing very promising.
One question is whether these indexes are on a non-default
tablespace with a non-default random_page_cost.  PG should
prevent you from putting in a negative random_page_cost, but
it's worth checking that.

I looked at some other possibilities like a corrupted tree_height
value, but none of them seem to fit the available info.  For
instance, the tree_height is only an int, so even if it were the
max negative value it would not explain the cost value you're getting.

Don't suppose you'd like to step through btcostestimate() with
a debugger and see where it's going off the rails?

            regards, tom lane


Re: BUG #14948: cost overflow

From
Jan Schulz
Date:
Hello Tom,

On 5 December 2017 at 22:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> The index cost estimation code does have some potential for
> garbage-in-garbage-out results, but nothing very promising.
> One question is whether these indexes are on a non-default
> tablespace with a non-default random_page_cost.  PG should
> prevent you from putting in a negative random_page_cost, but
> it's worth checking that.

How do I check?

> I looked at some other possibilities like a corrupted tree_height
> value, but none of them seem to fit the available info.  For
> instance, the tree_height is only an int, so even if it were the
> max negative value it would not explain the cost value you're getting.

I must confess that I have no clue what you are talking about.

> Don't suppose you'd like to step through btcostestimate() with
> a debugger and see where it's going off the rails?

I've never done debugging of a C program. If you give me some hints how to
do that, I might be able to (Ubuntu, PG from the apt repository).

Is there any debug log I might need to enable to get some similar information?

Best regards,

Jan
--
Jan Schulz
mail: jasc@gmx.net


Re: BUG #14948: cost overflow

From
Jan Schulz
Date:
Hello,

just some update on "how" that happend. Our current hypothesis:

* We have one parallel job was growing so big that postgresql consumed
too much memory (we use 'work_mem = 2GB'). This job is part of a
process which creates a 'm_dim_next' schema which in the end would be
switched to 'm_dim'. (note that the old 'm_dim' schema is not written
to during the whole process which creates m_dim_next, it gets dropped
after the schema switch in the last step in that process: m_dim ->
m_dim_old, m_dim_next -> m_dim,  drop m_dim_old)
* The OOM killer killed postgresql (please note that we have
configured postgres with almost no data security)
* This in turn would result "somehow" in some funny
data/statistics/whatever on the table in m_dim
* This in turn would result in wrong plans which in turn would result
in OOM when processes run which touched the table in m_dim

So our current understanding is that when we fix the memory issue in
the first step we will also not anymore have problems in the other
processes. We are currently looking into that.

Best regards,

Jan
--
Jan Schulz
mail: jasc@gmx.net
web: http://www.katzien.de


Re: BUG #14948: cost overflow

From
Tom Lane
Date:
Jan Schulz <jasc@gmx.net> writes:
> just some update on "how" that happend. Our current hypothesis:

> * We have one parallel job was growing so big that postgresql consumed
> too much memory (we use 'work_mem = 2GB'). This job is part of a
> process which creates a 'm_dim_next' schema which in the end would be
> switched to 'm_dim'. (note that the old 'm_dim' schema is not written
> to during the whole process which creates m_dim_next, it gets dropped
> after the schema switch in the last step in that process: m_dim ->
> m_dim_old, m_dim_next -> m_dim,  drop m_dim_old)
> * The OOM killer killed postgresql (please note that we have
> configured postgres with almost no data security)
> * This in turn would result "somehow" in some funny
> data/statistics/whatever on the table in m_dim
> * This in turn would result in wrong plans which in turn would result
> in OOM when processes run which touched the table in m_dim

I'm not entirely convinced by this theory.  In the first place, an OOM
kill shouldn't result in data corruption, no matter how you have your
installation configured.  You can turn off things that might result in
corruption after a power outage or other operating-system-level crash,
but not a process-level crash.  (Or at least that's the theory; there
could always be bugs of course.  But we developers crash Postgres
pretty regularly ;-), and we don't see corruption from that.)

In the second place, even if there were something wrong in pg_statistic,
that should manifest as a bogus rowcount estimate, which we're not seeing
in this EXPLAIN output.  The calculation that produces a cost estimate
given a rowcount estimate just doesn't have all that many other inputs,
which is why I didn't have very many theories about what could be wrong.

It's puzzling ...

            regards, tom lane


Re: BUG #14948: cost overflow

From
Jan Schulz
Date:
Hello,

in the meantime we managed to not trigger the OOM kill anymore and now
also don't have this problem with the negative costs anymore. We still
have one staging environment where this error is present, but we would
like to reclaim it. Is there some guidelines I can follow to debug
this?

Jan

On 7 December 2017 at 10:54, Jan Schulz <jasc@gmx.net> wrote:
> On 5 December 2017 at 22:30, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Don't suppose you'd like to step through btcostestimate() with
>> a debugger and see where it's going off the rails?
> I've never done debugging of a C program. If you give me some hints how to
> do that, I might be able to (Ubuntu, PG from the apt repository).
> Is there any debug log I might need to enable to get some similar information?


Re: BUG #14948: cost overflow

From
Tomas Vondra
Date:

On 12/12/2017 02:54 PM, Jan Schulz wrote:
> Hello,
> 
> in the meantime we managed to not trigger the OOM kill anymore and now
> also don't have this problem with the negative costs anymore. We still
> have one staging environment where this error is present, but we would
> like to reclaim it. Is there some guidelines I can follow to debug
> this?
> 

I think you're probably looking for this:

https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

Once you have GDB attached to the process, you can set breakpoints on
interesting places.

I believe the most interesting function to inspect is btcostestimate,
which is what produces estimates for the bitmap index scans:

https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/selfuncs.c#L6880

In particular, we're interested in parameter values passed to the
function, when it produces negative indexTotalCost (..-12884901880.97).
The value is updated on multiple places in the function, so you'll need
to step through and see at which point it gets negative.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services