Thread: BUG #14948: cost overflow
The following bug has been logged on the website: Bug reference: 14948 Logged by: Jan Schulz Email address: jasc@gmx.net PostgreSQL version: 10.1 Operating system: Linux (Ubuntu 16.04 LTS + HWE) Description: Hello, I was asked on twitter to post this issue to the bugs list. We have a server which reports negative costs in a query plan: GroupAggregate (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 rows=1600000000 width=254) On one system it looks funny, on other systems it looks normal (see query plans for the same query on different systems/environments). The background for this is a server getting killed by OOM killer and we are currently debugging this: we have work_mem set to 2GB and since a few days we have crashes with these settings which we haven't yet pin to a root cause -> this might be related to these crashes or it might be a result of this crashes, but it looks funny in any case. Thanks, Jan Full EXPLAIN (ANALYSE BUFFERS) on the funny looking system: GroupAggregate (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 rows=1600000000 width=254) (actual time=1.640..1.640 rows=0 loops=1) Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, ('-1'::smallint), (('-1'::smallint)::integer) Buffers: shared hit=11 -> Sort (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 rows=16812456500 width=120) (actual time=1.638..1.638 rows=0 loops=1) Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, ('-1'::smallint), (('-1'::smallint)::integer) Sort Method: quicksort Memory: 25kB Buffers: shared hit=11 -> Append (cost=329985.38..-25142102027752040829944596389028691968.00 rows=16812456500 width=120) (actual time=1.621..1.621 rows=0 loops=1) Buffers: shared hit=11 -> Subquery Scan on "*SELECT* 1" (cost=329985.38..462708.80 rows=3537400 width=120) (actual time=0.077..0.077 rows=0 loops=1) Buffers: shared hit=6 -> Merge Right Join (cost=329985.38..418491.30 rows=3537400 width=118) (actual time=0.076..0.076 rows=0 loops=1) Merge Cond: (cm.campaign_name = ev.campaign_name) Buffers: shared hit=6 -> Index Scan using campaign__campaign_name on campaign cm (cost=0.15..67.80 rows=1310 width=34) (actual time=0.015..0.015 rows=1 loops=1) Buffers: shared hit=2 -> Sort (cost=329985.23..331335.38 rows=540061 width=74) (actual time=0.058..0.058 rows=0 loops=1) Sort Key: ev.campaign_name Sort Method: quicksort Memory: 25kB Buffers: shared hit=4 -> Merge Right Join (cost=270456.49..278563.96 rows=540061 width=74) (actual time=0.049..0.049 rows=0 loops=1) Merge Cond: (et.event_type_name = ev.event_type) Buffers: shared hit=4 -> Sort (cost=90.93..94.20 rows=1310 width=32) (actual time=0.025..0.025 rows=1 loops=1) Sort Key: et.event_type_name Sort Method: quicksort Memory: 25kB Buffers: shared hit=1 -> Seq Scan on event_type et (cost=0.00..23.10 rows=1310 width=32) (actual time=0.006..0.007 rows=5 loops=1) Buffers: shared hit=1 -> Sort (cost=270365.57..270571.70 rows=82452 width=74) (actual time=0.023..0.023 rows=0 loops=1) Sort Key: ev.event_type Sort Method: quicksort Memory: 25kB Buffers: shared hit=3 -> Hash Left Join (cost=40461.31..263632.84 rows=82452 width=74) (actual time=0.014..0.014 rows=0 loops=1) Hash Cond: (ev.click_subscriber_key = csu.subscriber_name) Buffers: shared hit=3 -> Hash Left Join (cost=21328.11..243458.68 rows=82452 width=102) (actual time=0.013..0.013 rows=0 loops=1) Hash Cond: (ev.open_subscriber_key = osu.subscriber_name) Buffers: shared hit=3 -> Hash Left Join (cost=2194.91..223284.53 rows=82452 width=130) (actual time=0.012..0.012 rows=0 loops=1) Hash Cond: (ev.sendid = si.sendid_name) Buffers: shared hit=3 -> Bitmap Heap Scan on event ev (cost=2155.44..222204.65 rows=82452 width=160) (actual time=0.012..0.012 rows=0 loops=1) Recheck Cond: (event_day_fk = 20171120) Buffers: shared hit=3 -> Bitmap Index Scan on event__event_day_fk (cost=0.00..2134.82 rows=82452 width=0) (actual time=0.009..0.009 rows=0 loops=1) Index Cond: (event_day_fk = 20171120) Buffers: shared hit=3 -> Hash (cost=23.10..23.10 rows=1310 width=34) (never executed) -> Seq Scan on sendid si (cost=0.00..23.10 rows=1310 width=34) (never executed) -> Hash (cost=11259.20..11259.20 rows=629920 width=36) (never executed) -> Seq Scan on subscriber osu (cost=0.00..11259.20 rows=629920 width=36) (never executed) -> Hash (cost=11259.20..11259.20 rows=629920 width=36) (never executed) -> Seq Scan on subscriber csu (cost=0.00..11259.20 rows=629920 width=36) (never executed) -> Merge Left Join (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 rows=16808919100 width=120) (actual time=1.543..1.543 rows=0 loops=1) Merge Cond: (t.customer_fk = cu.customer_id) Buffers: shared hit=5 -> Sort (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 rows=16808919100 width=82) (actual time=1.543..1.543 rows=0 loops=1) Sort Key: t.customer_fk Sort Method: quicksort Memory: 25kB Buffers: shared hit=5 -> Nested Loop (cost=-2021924571963149.75..-25142102027752040829944596389028691968.00 rows=16808919100 width=82) (actual time=1.537..1.537 rows=0 loops=1) Join Filter: (si_1.sendid_name = ev_1.sendid) Buffers: shared hit=5 -> HashAggregate (cost=239661.55..242267.55 rows=260600 width=34) (actual time=1.536..1.536 rows=0 loops=1) Group Key: ev_1.sendid, cm_1.campaign_id Buffers: shared hit=5 -> Merge Join (cost=228900.82..236989.10 rows=534491 width=34) (actual time=0.016..0.016 rows=0 loops=1) Merge Cond: (cm_1.campaign_name = ev_1.campaign_name) Buffers: shared hit=5 -> Index Scan using campaign__campaign_name on campaign cm_1 (cost=0.15..67.80 rows=1303 width=34) (actual time=0.007..0.007 rows=1 loops=1) Filter: (campaign_id IS NOT NULL) Buffers: shared hit=2 -> Sort (cost=228900.67..229105.77 rows=82040 width=64) (actual time=0.008..0.008 rows=0 loops=1) Sort Key: ev_1.campaign_name Sort Method: quicksort Memory: 25kB Buffers: shared hit=3 -> Bitmap Heap Scan on event ev_1 (cost=2155.34..222204.55 rows=82040 width=64) (actual time=0.002..0.002 rows=0 loops=1) Recheck Cond: (event_day_fk = 20171120) Filter: (sendid IS NOT NULL) Buffers: shared hit=3 -> Bitmap Index Scan on event__event_day_fk (cost=0.00..2134.82 rows=82452 width=0) (actual time=0.002..0.002 rows=0 loops=1) Index Cond: (event_day_fk = 20171120) Buffers: shared hit=3 -> Nested Loop (cost=-2021924572202811.25..-96477751449547350295223172333568.00 rows=84496365 width=144) (never executed) Join Filter: (ad.ad_group_name = si_1.sendid_name) -> Seq Scan on sendid si_1 (cost=0.00..23.10 rows=1303 width=34) (never executed) Filter: (sendid_id IS NOT NULL) -> Nested Loop (cost=-2021924572202811.25..-74042786991210548153116786688.00 rows=84929134 width=110) (never executed) Join Filter: (t.ad_fk = ad.ad_id) -> Index Scan using ad__ad_group_name on ad (cost=0.42..1415932.35 rows=2157 width=40) (never executed) Filter: (channel_name = 'Email'::text) -> Nested Loop Left Join (cost=-2021924572202811.75..-34326744084937666832891904.00 rows=16977262428 width=86) (never executed) -> Merge Left Join (cost=767470.10..256025144.48 rows=16977262428 width=92) (never executed) Merge Cond: (t.touchpoint_id = rp.touchpoint_fk) -> Merge Left Join (cost=595487.90..1108803.95 rows=34168763 width=68) (never executed) Merge Cond: (t.touchpoint_id = ap.touchpoint_fk) -> Sort (cost=509005.69..509397.99 rows=156921 width=44) (never executed) Sort Key: t.touchpoint_id -> Append (cost=0.00..495463.66 rows=156921 width=44) (never executed) -> Seq Scan on touchpoint t (cost=0.00..0.00 rows=1 width=44) (never executed) Filter: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_0 t_1 (cost=353.49..33026.70 rows=10460 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_0__day_fk (cost=0.00..350.88 rows=10460 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_1 t_2 (cost=353.31..32951.98 rows=10436 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_1__day_fk (cost=0.00..350.70 rows=10436 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_2 t_3 (cost=353.51..33033.17 rows=10462 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_2__day_fk (cost=0.00..350.89 rows=10462 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_3 t_4 (cost=353.21..32910.10 rows=10423 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_3__day_fk (cost=0.00..350.60 rows=10423 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_4 t_5 (cost=353.62..33075.49 rows=10476 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_4__day_fk (cost=0.00..351.00 rows=10476 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_5 t_6 (cost=358.10..33273.80 rows=10538 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_5__day_fk (cost=0.00..355.46 rows=10538 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_6 t_7 (cost=353.21..32913.40 rows=10424 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_6__day_fk (cost=0.00..350.61 rows=10424 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_7 t_8 (cost=353.48..33023.38 rows=10459 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_7__day_fk (cost=0.00..350.87 rows=10459 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_8 t_9 (cost=353.53..33042.77 rows=10465 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_8__day_fk (cost=0.00..350.92 rows=10465 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_9 t_10 (cost=353.66..33094.94 rows=10482 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_9__day_fk (cost=0.00..351.04 rows=10482 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_10 t_11 (cost=353.35..32968.57 rows=10442 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_10__day_fk (cost=0.00..350.74 rows=10442 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_11 t_12 (cost=353.66..33094.59 rows=10482 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_11__day_fk (cost=0.00..351.04 rows=10482 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_12 t_13 (cost=357.94..33211.98 rows=10518 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_12__day_fk (cost=0.00..355.31 rows=10518 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_13 t_14 (cost=352.93..32799.96 rows=10388 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_13__day_fk (cost=0.00..350.34 rows=10388 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Bitmap Heap Scan on touchpoint_14 t_15 (cost=353.53..33042.83 rows=10465 width=44) (never executed) Recheck Cond: (day_fk = 20171120) -> Bitmap Index Scan on touchpoint_14__day_fk (cost=0.00..350.92 rows=10465 width=0) (never executed) Index Cond: (day_fk = 20171120) -> Sort (cost=86482.21..86591.08 rows=43549 width=32) (never executed) Sort Key: ap.touchpoint_fk -> Append (cost=0.00..83126.68 rows=43549 width=32) (never executed) -> Seq Scan on acquisition_performance ap (cost=0.00..0.00 rows=1 width=32) (never executed) Filter: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_0 ap_1 (cost=50.91..5537.62 rows=2901 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_0__performance_attribution_model_fk (cost=0.00..50.18 rows=2901 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_1 ap_2 (cost=50.95..5548.79 rows=2907 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_1__performance_attribution_model_fk (cost=0.00..50.23 rows=2907 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_2 ap_3 (cost=50.95..5547.08 rows=2906 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_2__performance_attribution_model_fk (cost=0.00..50.22 rows=2906 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_3 ap_4 (cost=50.87..5528.44 rows=2896 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_3__performance_attribution_model_fk (cost=0.00..50.14 rows=2896 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_4 ap_5 (cost=50.91..5537.32 rows=2901 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_4__performance_attribution_model_fk (cost=0.00..50.18 rows=2901 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_5 ap_6 (cost=50.95..5546.79 rows=2906 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_5__performance_attribution_model_fk (cost=0.00..50.22 rows=2906 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_6 ap_7 (cost=50.92..5539.61 rows=2902 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_6__performance_attribution_model_fk (cost=0.00..50.19 rows=2902 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_7 ap_8 (cost=50.88..5530.15 rows=2897 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_7__performance_attribution_model_fk (cost=0.00..50.15 rows=2897 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_8 ap_9 (cost=50.88..5530.15 rows=2897 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_8__performance_attribution_model_fk (cost=0.00..50.15 rows=2897 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_9 ap_10 (cost=50.95..5547.08 rows=2906 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_9__performance_attribution_model_fk (cost=0.00..50.22 rows=2906 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_10 ap_11 (cost=50.88..5530.73 rows=2897 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_10__performance_attribution_model_fk (cost=0.00..50.15 rows=2897 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_11 ap_12 (cost=50.99..5555.96 rows=2911 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_11__performance_attribution_model_fk (cost=0.00..50.26 rows=2911 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_12 ap_13 (cost=50.99..5558.25 rows=2912 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_12__performance_attribution_model_fk (cost=0.00..50.27 rows=2912 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_13 ap_14 (cost=50.92..5539.91 rows=2902 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_13__performance_attribution_model_fk (cost=0.00..50.19 rows=2902 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on acquisition_performance_14 ap_15 (cost=50.95..5548.79 rows=2907 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on acquisition_performance_14__performance_attribution_model_fk (cost=0.00..50.23 rows=2907 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Sort (cost=171982.20..172230.63 rows=99373 width=32) (never executed) Sort Key: rp.touchpoint_fk -> Append (cost=0.00..163733.96 rows=99373 width=32) (never executed) -> Seq Scan on reactivation_performance rp (cost=0.00..0.00 rows=1 width=32) (never executed) Filter: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_0 rp_1 (cost=111.52..10860.88 rows=6592 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_0__performance_attribution_model_fk (cost=0.00..109.87 rows=6592 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_1 rp_2 (cost=111.11..10775.38 rows=6540 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_1__performance_attribution_model_fk (cost=0.00..109.48 rows=6540 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_2 rp_3 (cost=111.49..10855.79 rows=6589 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_2__performance_attribution_model_fk (cost=0.00..109.84 rows=6589 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_3 rp_4 (cost=111.30..10814.83 rows=6564 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_3__performance_attribution_model_fk (cost=0.00..109.66 rows=6564 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_4 rp_5 (cost=111.62..10883.29 rows=6606 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_4__performance_attribution_model_fk (cost=0.00..109.97 rows=6606 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_5 rp_6 (cost=119.65..11740.46 rows=7126 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_5__performance_attribution_model_fk (cost=0.00..117.87 rows=7126 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_6 rp_7 (cost=110.50..10646.17 rows=6461 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_6__performance_attribution_model_fk (cost=0.00..108.88 rows=6461 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_7 rp_8 (cost=111.42..10841.49 rows=6580 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_7__performance_attribution_model_fk (cost=0.00..109.78 rows=6580 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_8 rp_9 (cost=111.04..10761.08 rows=6531 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_8__performance_attribution_model_fk (cost=0.00..109.41 rows=6531 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_9 rp_10 (cost=116.87..11151.48 rows=6767 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_9__performance_attribution_model_fk (cost=0.00..115.18 rows=6767 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_10 rp_11 (cost=110.49..10644.25 rows=6460 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_10__performance_attribution_model_fk (cost=0.00..108.88 rows=6460 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_11 rp_12 (cost=117.28..11238.07 rows=6820 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_11__performance_attribution_model_fk (cost=0.00..115.58 rows=6820 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_12 rp_13 (cost=117.45..11273.68 rows=6842 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_12__performance_attribution_model_fk (cost=0.00..115.74 rows=6842 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_13 rp_14 (cost=109.69..10476.41 rows=6357 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_13__performance_attribution_model_fk (cost=0.00..108.10 rows=6357 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on reactivation_performance_14 rp_15 (cost=111.09..10770.70 rows=6537 width=32) (never executed) Recheck Cond: (performance_attribution_model_fk = 2) -> Bitmap Index Scan on reactivation_performance_14__performance_attribution_model_fk (cost=0.00..109.45 rows=6537 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Bitmap Heap Scan on converting_touchpoints_attribution conv (cost=-2021924572970281.75..-2021924572970244.00 rows=1 width=10) (never executed) Recheck Cond: ((t.touchpoint_id = touchpoint_fk) AND (performance_attribution_model_fk = 2)) -> BitmapAnd (cost=-2021924572970281.75..-2021924572970281.75 rows=1850 width=0) (never executed) -> Bitmap Index Scan on converting_touchpoints_attribution__touchpoint_fk (cost=0.00..-2021911688068401.00 rows=1850 width=0) (never executed) Index Cond: (t.touchpoint_id = touchpoint_fk) -> Bitmap Index Scan on converting_touchpoints_attribution__performance_attribution_mod (cost=0.00..-12884901880.97 rows=1850 width=0) (never executed) Index Cond: (performance_attribution_model_fk = 2) -> Sort (cost=696479.79..707910.70 rows=4572361 width=6) (never executed) Sort Key: cu.customer_id -> Seq Scan on customer cu (cost=0.00..190673.61 rows=4572361 width=6) (never executed) Planning time: 7.154 ms Execution time: 6.358 ms ############## # query (which is usually run in parallel for multiple day_fks): ############## EXPLAIN ( ANALYSE, BUFFERS ) SELECT -- dimensions 20170920 AS day_fk, sendID_fk AS sendID_fk, campaign_fk AS campaign_fk, region_fk AS region_fk, customer_segment_fk AS customer_segment_fk, -- touchpoint hll_union_agg(hll_add(hll_empty(14, 5), hll_hash_bigint( visitor_id))) :: hll(14, 5) AS visitors, hll_union_agg(hll_add(hll_empty(14, 5), hll_hash_bigint( customer_fk))) :: hll(14, 5) AS customers, sum( number_of_touchpoints) AS number_of_touchpoints, sum( number_of_converting_touchpoints) AS number_of_converting_touchpoints, sum( number_of_bounces) AS number_of_bounces, sum( number_of_sent) AS number_of_sent, sum( number_of_open) AS number_of_opens, sum( number_of_clicks) AS number_of_clicks, -- this is https://github.com/citusdata/postgresql-hll hll_union_agg(hll_add(hll_empty(12, 5), hll_hash_integer(click_subscriber_fk))) :: hll (12, 5) AS number_of_unique_clicks, hll_union_agg(hll_add(hll_empty(12, 5), hll_hash_integer(open_subscriber_fk))) :: hll (12, 5) AS number_of_unique_opens, sum( number_of_net_orders) AS number_of_net_orders, sum( number_of_first_net_orders) AS number_of_first_net_orders, sum( number_of_recurring_net_orders) AS number_of_recurring_net_orders, sum( contribution_margin_1) AS contribution_margin_1, sum( net_revenue) AS net_revenue, sum( net_revenue_first_net_orders) AS net_revenue_first_net_orders, sum( net_revenue_recurring_net_orders) AS net_revenue_recurring_net_orders, sum( promotion_discount_value) AS promotion_discount_value FROM ( -- data from salesforce SELECT si.sendID_id AS sendID_fk, cm.campaign_id AS campaign_fk, -1 :: SMALLINT AS region_fk, -1 :: SMALLINT AS customer_segment_fk, NULL AS visitor_id, NULL AS customer_fk, NULL AS number_of_touchpoints, NULL AS number_of_converting_touchpoints, CASE WHEN et.event_type_name = 'Bounce' THEN 1 ELSE 0 END AS number_of_bounces, CASE WHEN et.event_type_name = 'Sent' THEN 1 ELSE 0 END AS number_of_sent, CASE WHEN et.event_type_name = 'Open' THEN 1 ELSE 0 END AS number_of_open, CASE WHEN et.event_type_name = 'Click' THEN 1 ELSE 0 END AS number_of_clicks, csu.subscriber_id AS click_subscriber_fk, osu.subscriber_id AS open_subscriber_fk, NULL AS number_of_net_orders, NULL AS number_of_first_net_orders, NULL AS number_of_recurring_net_orders, NULL AS contribution_margin_1, NULL AS net_revenue, NULL AS net_revenue_first_net_orders, NULL AS net_revenue_recurring_net_orders, NULL AS promotion_discount_value FROM sf_tmp.event ev LEFT JOIN sf_dim.campaign cm ON ev.campaign_name = cm.campaign_name LEFT JOIN sf_dim.event_type et ON ev.event_type = et.event_type_name LEFT JOIN sf_dim.sendid si ON ev.Sendid = si.sendid_name LEFT JOIN sf_dim.subscriber osu ON ev.open_subscriber_key = osu.subscriber_name LEFT JOIN sf_dim.subscriber csu ON ev.click_subscriber_key = csu.subscriber_name WHERE ev.event_day_fk = 20171120 -- data from ad performance UNION ALL SELECT si.sendid_id AS sendID_fk, sendid_campaign_match.campaign_fk AS campaign_fk, t.region_fk, COALESCE(cu.customer_segment_fk, -10), t.visitor_id, t.customer_fk, t.number_of_touchpoints, conv.converting_click :: INTEGER AS number_of_converting_touchpoints, NULL AS number_of_bounces, NULL AS number_of_sent, NULL AS number_of_open, NULL AS number_of_clicks, NULL AS click_subscriber_fk, NULL AS open_subscriber_fk, coalesce(ap.number_of_first_net_orders, 0.0) + coalesce(rp.number_of_recurring_net_orders, 0.0) AS number_of_net_orders, ap.number_of_first_net_orders AS number_of_first_net_orders, rp.number_of_recurring_net_orders AS number_of_recurring_net_orders, coalesce(ap.contribution_margin_1_first_net_orders, 0.0) + coalesce(rp.contribution_margin_1_recurring_net_orders, 0.0) AS contribution_margin_1, coalesce(ap.net_revenue_first_net_orders, 0.0) + coalesce(rp.net_revenue_recurring_net_orders, 0.0) AS net_revenue, ap.net_revenue_first_net_orders AS net_revenue_first_net_orders, rp.net_revenue_recurring_net_orders AS net_revenue_recurring_net_orders, t.promotion_discount_value FROM m_dim.touchpoint t JOIN m_dim.ad ON ad.ad_id = t.ad_fk AND ad.channel_name = 'Email' LEFT JOIN sf_dim.sendid si ON si.Sendid_name = ad.ad_group_name LEFT JOIN ( SELECT DISTINCT ev.sendid, cm.campaign_id AS campaign_fk FROM sf_tmp.event ev LEFT JOIN sf_dim.campaign cm ON ev.campaign_name = cm.campaign_name WHERE EVENT_day_fk = 20171120 AND ev.sendid IS NOT NULL) sendid_campaign_match ON sendid_campaign_match.sendid = si.sendid_name LEFT JOIN os_dim.customer cu ON cu.customer_id = t.customer_fk LEFT JOIN m_dim.acquisition_performance ap ON ap.touchpoint_fk = t.touchpoint_id AND ap.performance_attribution_model_fk = 2 LEFT JOIN m_dim.reactivation_performance rp ON rp.touchpoint_fk = t.touchpoint_id AND rp.performance_attribution_model_fk = 2 LEFT JOIN m_dim.converting_touchpoints_attribution conv ON t.touchpoint_id = conv.touchpoint_fk AND conv.performance_attribution_model_fk = 2 WHERE t.day_fk = 20171120 AND si.sendid_id IS NOT NULL AND campaign_fk IS NOT NULL ) q GROUP BY sendID_fk, campaign_fk, region_fk, customer_segment_fk; ############## # On a different system it looks like this (much more sane): ############## GroupAggregate (cost=384347.08..414413.11 rows=37349 width=254) (actual time=5343.214..5417.604 rows=1194 loops=1) Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, ('-1'::smallint), (('-1'::smallint)::integer) Buffers: shared hit=421793 read=26153, temp read=1561 written=1717 -> Sort (cost=384347.08..385280.81 rows=373491 width=120) (actual time=5343.180..5351.390 rows=55816 loops=1) Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, ('-1'::smallint), (('-1'::smallint)::integer) Sort Method: external merge Disk: 2880kB Buffers: shared hit=421793 read=26153, temp read=1561 written=1717 -> Append (cost=140964.58..319780.25 rows=373491 width=120) (actual time=2319.179..5287.481 rows=55816 loops=1) Buffers: shared hit=421787 read=26153, temp read=1201 written=1356 -> Subquery Scan on "*SELECT* 1" (cost=140964.58..155107.05 rows=373160 width=120) (actual time=2319.178..2358.269 rows=51664 loops=1) Buffers: shared hit=19573 read=20783, temp read=1201 written=1356 -> Merge Right Join (cost=140964.58..150442.55 rows=373160 width=118) (actual time=2319.176..2350.994 rows=51664 loops=1) Merge Cond: (et.event_type_name = ev.event_type) Buffers: shared hit=19573 read=20783, temp read=1201 written=1356 -> Sort (cost=90.93..94.20 rows=1310 width=32) (actual time=1.088..1.090 rows=5 loops=1) Sort Key: et.event_type_name Sort Method: quicksort Memory: 25kB Buffers: shared read=1 -> Seq Scan on event_type et (cost=0.00..23.10 rows=1310 width=32) (actual time=1.065..1.066 rows=5 loops=1) Buffers: shared read=1 -> Materialize (cost=140873.65..141158.50 rows=56971 width=17) (actual time=2318.081..2331.993 rows=51664 loops=1) Buffers: shared hit=19573 read=20782, temp read=1201 written=1356 -> Sort (cost=140873.65..141016.08 rows=56971 width=17) (actual time=2318.076..2323.706 rows=51664 loops=1) Sort Key: ev.event_type Sort Method: external merge Disk: 1312kB Buffers: shared hit=19573 read=20782, temp read=1201 written=1205 -> Hash Left Join (cost=90861.89..135705.53 rows=56971 width=17) (actual time=1897.747..2291.303 rows=51664 loops=1) Hash Cond: (ev.campaign_name = cm.campaign_name) Buffers: shared hit=19573 read=20782, temp read=1037 written=1040 -> Hash Left Join (cost=90858.44..134918.72 rows=56971 width=38) (actual time=1897.709..2277.430 rows=51664 loops=1) Hash Cond: (ev.sendid = si.sendid_name) Buffers: shared hit=19572 read=20782, temp read=1037 written=1040 -> Merge Right Join (cost=90848.40..134190.85 rows=56971 width=42) (actual time=1897.586..2263.942 rows=51664 loops=1) Merge Cond: (osu.subscriber_name = ev.open_subscriber_key) Buffers: shared hit=19571 read=20781, temp read=1037 written=1040 -> Index Scan using subscriber__subscriber_name on subscriber osu (cost=0.43..39487.86 rows=1323362 width=36) (actual time=0.010..216.141 rows=1323194 loops=1) Buffers: shared hit=19562 -> Materialize (cost=90847.98..91132.83 rows=56971 width=70) (actual time=1897.564..1916.819 rows=51664 loops=1) Buffers: shared hit=9 read=20781, temp read=1037 written=1040 -> Sort (cost=90847.98..90990.40 rows=56971 width=70) (actual time=1897.561..1911.064 rows=51664 loops=1) Sort Key: ev.open_subscriber_key Sort Method: external merge Disk: 4112kB Buffers: shared hit=9 read=20781, temp read=1037 written=1040 -> Merge Right Join (cost=41811.94..85011.85 rows=56971 width=70) (actual time=215.181..1861.479 rows=51664 loops=1) Merge Cond: (csu.subscriber_name = ev.click_subscriber_key) Buffers: shared hit=9 read=20781, temp read=523 written=523 -> Index Scan using subscriber__subscriber_name on subscriber csu (cost=0.43..39487.86 rows=1323362 width=36) (actual time=0.005..1510.625 rows=1323194 loops=1) Buffers: shared hit=9 read=19553 -> Materialize (cost=41778.04..42062.89 rows=56971 width=98) (actual time=214.084..225.830 rows=51664 loops=1) Buffers: shared read=1228, temp read=523 written=523 -> Sort (cost=41778.04..41920.47 rows=56971 width=98) (actual time=214.080..220.578 rows=51664 loops=1) Sort Key: ev.click_subscriber_key Sort Method: external sort Disk: 4184kB Buffers: shared read=1228, temp read=523 written=523 -> Index Scan using event__event_day_fk on event ev (cost=0.56..35495.92 rows=56971 width=98) (actual time=3.877..192.397 rows=51664 loops=1) Index Cond: (event_day_fk = 20171120) Buffers: shared read=1228 -> Hash (cost=5.57..5.57 rows=357 width=7) (actual time=0.103..0.103 rows=357 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 22kB Buffers: shared hit=1 read=1 -> Seq Scan on sendid si (cost=0.00..5.57 rows=357 width=7) (actual time=0.013..0.059 rows=357 loops=1) Buffers: shared hit=1 read=1 -> Hash (cost=2.09..2.09 rows=109 width=26) (actual time=0.032..0.032 rows=109 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 15kB Buffers: shared hit=1 -> Seq Scan on campaign cm (cost=0.00..2.09 rows=109 width=26) (actual time=0.007..0.013 rows=109 loops=1) Buffers: shared hit=1 -> Hash Join (cost=37715.40..164669.88 rows=331 width=120) (actual time=264.900..2924.332 rows=4152 loops=1) Hash Cond: (ad.ad_group_name = si_1.sendid_name) Buffers: shared hit=402214 read=5370 -> Nested Loop Left Join (cost=37705.37..164655.25 rows=37 width=115) (actual time=264.770..2919.386 rows=4152 loops=1) Buffers: shared hit=402212 read=5370 -> Nested Loop Left Join (cost=37705.37..163946.79 rows=16 width=99) (actual time=210.937..2437.542 rows=4152 loops=1) Buffers: shared hit=214029 read=5076 -> Nested Loop Left Join (cost=37705.37..163597.96 rows=8 width=75) (actual time=173.726..2018.663 rows=4152 loops=1) Buffers: shared hit=27013 read=4975 -> Nested Loop Left Join (cost=37704.81..163579.72 rows=8 width=73) (actual time=96.881..1909.229 rows=4152 loops=1) Buffers: shared hit=9961 read=4896 -> Hash Join (cost=37704.38..163562.73 rows=8 width=71) (actual time=96.873..1877.702 rows=4152 loops=1) Hash Cond: (ad.ad_group_name = sendid_campaign_match.sendid) Buffers: shared hit=1232 read=4896 -> Hash Join (cost=304.31..126040.79 rows=38 width=63) (actual time=60.935..1839.130 rows=4285 loops=1) Hash Cond: (t.ad_fk = ad.ad_id) Buffers: shared hit=3 read=4896 -> Append (cost=0.00..124157.87 rows=125038 width=44) (actual time=48.277..1810.384 rows=123117 loops=1) Buffers: shared read=4855 -> Seq Scan on touchpoint t (cost=0.00..0.00 rows=1 width=44) (actual time=0.002..0.002 rows=0 loops=1) Filter: (day_fk = 20171120) -> Index Scan using touchpoint_0__day_fk on touchpoint_0 t_1 (cost=0.43..7616.82 rows=7654 width=44) (actual time=48.273..237.069 rows=8295 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=331 -> Index Scan using touchpoint_1__day_fk on touchpoint_1 t_2 (cost=0.43..8099.65 rows=8152 width=44) (actual time=3.415..83.608 rows=8257 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=341 -> Index Scan using touchpoint_2__day_fk on touchpoint_2 t_3 (cost=0.43..8207.70 rows=8262 width=44) (actual time=3.172..49.089 rows=8168 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=325 -> Index Scan using touchpoint_3__day_fk on touchpoint_3 t_4 (cost=0.43..8639.32 rows=8712 width=44) (actual time=3.443..141.148 rows=8377 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=330 -> Index Scan using touchpoint_4__day_fk on touchpoint_4 t_5 (cost=0.43..8540.24 rows=8608 width=44) (actual time=3.038..147.760 rows=8197 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=314 -> Index Scan using touchpoint_5__day_fk on touchpoint_5 t_6 (cost=0.43..7897.13 rows=7940 width=44) (actual time=3.870..42.844 rows=8264 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=324 -> Index Scan using touchpoint_6__day_fk on touchpoint_6 t_7 (cost=0.43..8156.89 rows=8211 width=44) (actual time=3.439..144.567 rows=8194 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=328 -> Index Scan using touchpoint_7__day_fk on touchpoint_7 t_8 (cost=0.43..8830.66 rows=8908 width=44) (actual time=4.362..143.630 rows=8068 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=322 -> Index Scan using touchpoint_8__day_fk on touchpoint_8 t_9 (cost=0.43..9695.86 rows=9808 width=44) (actual time=3.337..142.971 rows=8273 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=329 -> Index Scan using touchpoint_9__day_fk on touchpoint_9 t_10 (cost=0.43..8124.32 rows=8179 width=44) (actual time=3.349..150.226 rows=8167 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=325 -> Index Scan using touchpoint_10__day_fk on touchpoint_10 t_11 (cost=0.43..8356.39 rows=8418 width=44) (actual time=4.030..39.337 rows=8233 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=325 -> Index Scan using touchpoint_11__day_fk on touchpoint_11 t_12 (cost=0.43..7450.94 rows=7481 width=44) (actual time=3.529..45.088 rows=8140 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=319 -> Index Scan using touchpoint_12__day_fk on touchpoint_12 t_13 (cost=0.43..7753.21 rows=7790 width=44) (actual time=3.935..153.074 rows=8131 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=313 -> Index Scan using touchpoint_13__day_fk on touchpoint_13 t_14 (cost=0.43..8589.00 rows=8660 width=44) (actual time=3.264..145.147 rows=8255 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=318 -> Index Scan using touchpoint_14__day_fk on touchpoint_14 t_15 (cost=0.43..8199.74 rows=8254 width=44) (actual time=2.911..135.576 rows=8098 loops=1) Index Cond: (day_fk = 20171120) Buffers: shared read=311 -> Hash (cost=300.40..300.40 rows=313 width=35) (actual time=11.881..11.881 rows=1145 loops=1) Buckets: 2048 (originally 1024) Batches: 1 (originally 1) Memory Usage: 109kB Buffers: shared read=41 -> Index Scan using ad__channel_name on ad (cost=0.42..300.40 rows=313 width=35) (actual time=2.580..11.649 rows=1145 loops=1) Index Cond: (channel_name = 'Email'::text) Buffers: shared read=41 -> Hash (cost=37079.88..37079.88 rows=25615 width=8) (actual time=35.890..35.890 rows=115 loops=1) Buckets: 32768 Batches: 1 Memory Usage: 261kB Buffers: shared hit=1229 -> Subquery Scan on sendid_campaign_match (cost=36567.58..37079.88 rows=25615 width=8) (actual time=35.713..35.857 rows=115 loops=1) Buffers: shared hit=1229 -> HashAggregate (cost=36567.58..36823.73 rows=25615 width=8) (actual time=35.712..35.843 rows=115 loops=1) Group Key: ev_1.sendid, cm_1.campaign_id Buffers: shared hit=1229 -> Hash Join (cost=4.02..36282.72 rows=56971 width=8) (actual time=0.071..25.206 rows=51664 loops=1) Hash Cond: (ev_1.campaign_name = cm_1.campaign_name) Buffers: shared hit=1229 -> Index Scan using event__event_day_fk on event ev_1 (cost=0.56..35495.92 rows=56971 width=29) (actual time=0.024..12.175 rows=51664 loops=1) Index Cond: (event_day_fk = 20171120) Filter: (sendid IS NOT NULL) Buffers: shared hit=1228 -> Hash (cost=2.09..2.09 rows=109 width=26) (actual time=0.038..0.038 rows=109 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 15kB Buffers: shared hit=1 -> Seq Scan on campaign cm_1 (cost=0.00..2.09 rows=109 width=26) (actual time=0.010..0.020 rows=109 loops=1) Filter: (campaign_id IS NOT NULL) Buffers: shared hit=1 -> Index Scan using customer_pkey on customer cu (cost=0.43..2.12 rows=1 width=6) (actual time=0.007..0.007 rows=1 loops=4152) Index Cond: (customer_id = t.customer_fk) Buffers: shared hit=8729 -> Index Scan using converting_touchpoints_attribution__touchpoint_fk_performance_a on converting_touchpoints_attribution conv (cost=0.56..2.27 rows=1 width=10) (actual time=0.026..0.026 rows=0 loops=4152) Index Cond: ((t.touchpoint_id = touchpoint_fk) AND (performance_attribution_model_fk = 2)) Buffers: shared hit=17052 read=79 -> Append (cost=0.00..43.44 rows=16 width=32) (actual time=0.099..0.099 rows=0 loops=4152) Buffers: shared hit=187016 read=101 -> Seq Scan on acquisition_performance ap (cost=0.00..0.00 rows=1 width=32) (actual time=0.000..0.000 rows=0 loops=4152) Filter: ((performance_attribution_model_fk = 2) AND (touchpoint_fk = t.touchpoint_id)) -> Index Scan using acquisition_performance_0__touchpoint_fk on acquisition_performance_0 ap_1 (cost=0.42..2.90 rows=1 width=32) (actual time=0.027..0.027 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12456 read=9 -> Index Scan using acquisition_performance_1__touchpoint_fk on acquisition_performance_1 ap_2 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12500 read=7 -> Index Scan using acquisition_performance_2__touchpoint_fk on acquisition_performance_2 ap_3 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12451 read=6 -> Index Scan using acquisition_performance_3__touchpoint_fk on acquisition_performance_3 ap_4 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12455 read=8 -> Index Scan using acquisition_performance_4__touchpoint_fk on acquisition_performance_4 ap_5 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12480 read=9 -> Index Scan using acquisition_performance_5__touchpoint_fk on acquisition_performance_5 ap_6 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Buffers: shared hit=12451 read=5 -> Index Scan using acquisition_performance_6__touchpoint_fk on acquisition_performance_6 ap_7 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Buffers: shared hit=12516 read=6 -> Index Scan using acquisition_performance_7__touchpoint_fk on acquisition_performance_7 ap_8 (cost=0.42..2.90 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12451 read=6 -> Index Scan using acquisition_performance_8__touchpoint_fk on acquisition_performance_8 ap_9 (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12455 read=8 -> Index Scan using acquisition_performance_9__touchpoint_fk on acquisition_performance_9 ap_10 (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12473 read=7 -> Index Scan using acquisition_performance_10__touchpoint_fk on acquisition_performance_10 ap_11 (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Buffers: shared hit=12451 read=5 -> Index Scan using acquisition_performance_11__touchpoint_fk on acquisition_performance_11 ap_12 (cost=0.42..2.90 rows=1 width=32) (actual time=0.027..0.027 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12451 read=6 -> Index Scan using acquisition_performance_12__touchpoint_fk on acquisition_performance_12 ap_13 (cost=0.42..2.89 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12523 read=8 -> Index Scan using acquisition_performance_13__touchpoint_fk on acquisition_performance_13 ap_14 (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Buffers: shared hit=12451 read=5 -> Index Scan using acquisition_performance_14__touchpoint_fk on acquisition_performance_14 ap_15 (cost=0.42..2.89 rows=1 width=32) (actual time=0.003..0.003 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12452 read=6 -> Append (cost=0.00..44.12 rows=16 width=32) (actual time=0.105..0.115 rows=0 loops=4152) Buffers: shared hit=188183 read=294 -> Seq Scan on reactivation_performance rp (cost=0.00..0.00 rows=1 width=32) (actual time=0.000..0.000 rows=0 loops=4152) Filter: ((performance_attribution_model_fk = 2) AND (touchpoint_fk = t.touchpoint_id)) -> Index Scan using reactivation_performance_0__touchpoint_fk on reactivation_performance_0 rp_1 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12549 read=20 -> Index Scan using reactivation_performance_1__touchpoint_fk on reactivation_performance_1 rp_2 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12574 read=21 -> Index Scan using reactivation_performance_2__touchpoint_fk on reactivation_performance_2 rp_3 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12544 read=20 -> Index Scan using reactivation_performance_3__touchpoint_fk on reactivation_performance_3 rp_4 (cost=0.43..2.94 rows=1 width=32) (actual time=0.005..0.005 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12552 read=19 -> Index Scan using reactivation_performance_4__touchpoint_fk on reactivation_performance_4 rp_5 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12516 read=18 -> Index Scan using reactivation_performance_5__touchpoint_fk on reactivation_performance_5 rp_6 (cost=0.43..2.95 rows=1 width=32) (actual time=0.028..0.028 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12562 read=21 -> Index Scan using reactivation_performance_6__touchpoint_fk on reactivation_performance_6 rp_7 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12530 read=18 -> Index Scan using reactivation_performance_7__touchpoint_fk on reactivation_performance_7 rp_8 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12535 read=19 -> Index Scan using reactivation_performance_8__touchpoint_fk on reactivation_performance_8 rp_9 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12529 read=20 -> Index Scan using reactivation_performance_9__touchpoint_fk on reactivation_performance_9 rp_10 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12561 read=19 -> Index Scan using reactivation_performance_10__touchpoint_fk on reactivation_performance_10 rp_11 (cost=0.43..2.94 rows=1 width=32) (actual time=0.028..0.028 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12533 read=21 -> Index Scan using reactivation_performance_11__touchpoint_fk on reactivation_performance_11 rp_12 (cost=0.43..2.95 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12591 read=20 -> Index Scan using reactivation_performance_12__touchpoint_fk on reactivation_performance_12 rp_13 (cost=0.43..2.95 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12527 read=18 -> Index Scan using reactivation_performance_13__touchpoint_fk on reactivation_performance_13 rp_14 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12531 read=20 -> Index Scan using reactivation_performance_14__touchpoint_fk on reactivation_performance_14 rp_15 (cost=0.43..2.94 rows=1 width=32) (actual time=0.004..0.004 rows=0 loops=4152) Index Cond: (touchpoint_fk = t.touchpoint_id) Filter: (performance_attribution_model_fk = 2) Rows Removed by Filter: 0 Buffers: shared hit=12549 read=20 -> Hash (cost=5.57..5.57 rows=357 width=7) (actual time=0.105..0.105 rows=357 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 22kB Buffers: shared hit=2 -> Seq Scan on sendid si_1 (cost=0.00..5.57 rows=357 width=7) (actual time=0.014..0.057 rows=357 loops=1) Filter: (sendid_id IS NOT NULL) Buffers: shared hit=2 Planning time: 635.005 ms Execution time: 5423.894 ms
Hi Jan, it seems pretty likely that these are indeed overflows. However, because of the whitespace mangling these plans are pretty much impossible to read. Would you mind pasting these again, as text/plain attachment instead? Does the query actually take 6ms in the system with the funny explain, and 5s on the other? Thanks jasc@gmx.net wrote: > Full EXPLAIN (ANALYSE BUFFERS) on the funny looking system: > > GroupAggregate > (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 > rows=1600000000 width=254) (actual time=1.640..1.640 rows=0 loops=1) > Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, > ('-1'::smallint), (('-1'::smallint)::integer) > Buffers: shared hit=11 > -> Sort > (cost=-25142102027752040829944596389028691968.00..-25142102027752040829944596389028691968.00 > rows=16812456500 width=120) (actual time=1.638..1.638 rows=0 loops=1) > Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, > ('-1'::smallint), (('-1'::smallint)::integer) > Sort Method: quicksort Memory: 25kB > Buffers: shared hit=11 > -> Append > (cost=329985.38..-25142102027752040829944596389028691968.00 rows=16812456500 > width=120) (actual time=1.621..1.621 rows=0 loops=1) > Buffers: shared hit=11 [...] > Planning time: 7.154 ms > Execution time: 6.358 ms > ############## > # On a different system it looks like this (much more sane): > ############## > > GroupAggregate (cost=384347.08..414413.11 rows=37349 width=254) (actual > time=5343.214..5417.604 rows=1194 loops=1) > Group Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, > ('-1'::smallint), (('-1'::smallint)::integer) > Buffers: shared hit=421793 read=26153, temp read=1561 written=1717 > -> Sort (cost=384347.08..385280.81 rows=373491 width=120) (actual > time=5343.180..5351.390 rows=55816 loops=1) > Sort Key: "*SELECT* 1".sendid_fk, "*SELECT* 1".campaign_fk, > ('-1'::smallint), (('-1'::smallint)::integer) > Sort Method: external merge Disk: 2880kB > Buffers: shared hit=421793 read=26153, temp read=1561 written=1717 > -> Append (cost=140964.58..319780.25 rows=373491 width=120) > (actual time=2319.179..5287.481 rows=55816 loops=1) > Buffers: shared hit=421787 read=26153, temp read=1201 > written=1356 > Planning time: 635.005 ms > Execution time: 5423.894 ms -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
jasc@gmx.net writes: > We have a server which reports negative costs in a query plan: The root of the weirdness seems to be here: > -> Bitmap Heap Scan on converting_touchpoints_attribution conv (cost=-2021924572970281.75..-2021924572970244.00rows=1 width=10) (never executed) > Recheck Cond: ((t.touchpoint_id = touchpoint_fk) AND (performance_attribution_model_fk = 2)) > -> BitmapAnd (cost=-2021924572970281.75..-2021924572970281.75 rows=1850 width=0) (never executed) > -> Bitmap Index Scan on converting_touchpoints_attribution__touchpoint_fk (cost=0.00..-2021911688068401.00rows=1850 width=0) (never executed) > Index Cond: (t.touchpoint_id = touchpoint_fk) > -> Bitmap Index Scan on converting_touchpoints_attribution__performance_attribution_mod (cost=0.00..-12884901880.97rows=1850 width=0) (never executed) > Index Cond: (performance_attribution_model_fk = 2) I'm not sure what to make of that, but it seems to be broken in more ways than just the wacko cost estimates. For one thing, the planner is supposed to set up index qual conditions in the form "indexedvar op something", and the (t.touchpoint_id = touchpoint_fk) condition appears to be backwards. The row estimate at the intermediate BitmapAnd node seems to be out of line as well, why doesn't it match the estimate for the heapscan node? Can you get a similarly broken plan if you try something involving just this table, say "select * from converting_touchpoints_attribution where touchpoint_fk = 42 and performance_attribution_model_fk = 2" ? Could we see the pg_stats rows for those two columns? What is in pg_class.reltuples and relpages for this table and these two indexes? regards, tom lane
Hello!
On 5 December 2017 at 16:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not sure what to make of that, but it seems to be broken in more ways
> than just the wacko cost estimates.
> Could we see the pg_stats rows for those two columns? What is in
> pg_class.reltuples and relpages for this table and these two indexes?
On a good system it looks like this:
Best regards,
Jan
--
Jan Schulz
mail: jasc@gmx.net
>
> regards, tom lane
On 5 December 2017 at 16:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I'm not sure what to make of that, but it seems to be broken in more ways
> than just the wacko cost estimates.
We run our server with basically no data security in mind as we do regenerate all tables from the source systems each time. So the OOM kill might have broken some tables/data? m_dim wasn't regenerated since then because the regeneration of that schema triggers the OOM.
relevant postgresql.conf entries:
wal_level = minimal
# doesn't start if wal_level=minimal and this is >0
max_wal_senders = 0
fsync = off
synchronous_commit = off
full_page_writes = off
wal_buffers = -1
autovacuum_max_workers = 1
> Can you get a similarly broken plan if you try something involving just
> this table, say "select * from converting_touchpoints_attribution where
> touchpoint_fk = 42 and performance_attribution_model_fk = 2" ?
EXPLAIN ( ANALYSE, BUFFERS )
select * from m_dim.converting_touchpoints_attribution where
touchpoint_fk = 42 and performance_attribution_model_fk = 2
Bitmap Heap Scan on converting_touchpoints_attribution
> this table, say "select * from converting_touchpoints_attribution where
> touchpoint_fk = 42 and performance_attribution_model_fk = 2" ?
EXPLAIN ( ANALYSE, BUFFERS )
select * from m_dim.converting_touchpoints_attribution where
touchpoint_fk = 42 and performance_attribution_model_fk = 2
Bitmap Heap Scan on converting_touchpoints_attribution
(cost=-25769803761.69..-25769803757.67 rows=1 width=16) (actual time=0.005..0.005 rows=0 loops=1)
Recheck Cond: ((performance_attribution_model_fk = 2) AND (touchpoint_fk = 42))
-> BitmapAnd (cost=-25769803761.69..-25769803761.69 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
-> Bitmap Index Scan on converting_touchpoints_attribution__performance_attribution_mod
-> BitmapAnd (cost=-25769803761.69..-25769803761.69 rows=1 width=0) (actual time=0.004..0.004 rows=0 loops=1)
-> Bitmap Index Scan on converting_touchpoints_attribution__performance_attribution_mod
(cost=0.00..-12884901880.97 rows=1 width=0) (actual time=0.003..0.003 rows=0 loops=1)
Index Cond: (performance_attribution_model_fk = 2)
-> Bitmap Index Scan on converting_touchpoints_attribution__touchpoint_fk
-> Bitmap Index Scan on converting_touchpoints_attribution__touchpoint_fk
(cost=0.00..-12884901880.97 rows=1 width=0) (never executed)
Index Cond: (touchpoint_fk = 42)
Planning time: 0.122 ms
Execution time: 0.036 ms
This is how it looks on a working system (generated with 2MB work_mem...)
Planning time: 0.122 ms
Execution time: 0.036 ms
This is how it looks on a working system (generated with 2MB work_mem...)
Index Scan using converting_touchpoints_attribution__touchpoint_fk_performance_a on converting_touchpoints_attribution
(cost=0.56..2.58 rows=1 width=16) (actual time=2.314..2.314 rows=0 loops=1)
Index Cond: ((touchpoint_fk = 42) AND (performance_attribution_model_fk = 2))
Buffers: shared hit=6 read=4
Planning time: 5.371 ms
Execution time: 2.342 ms
> pg_class.reltuples and relpages for this table and these two indexes?
SELECT
nspname || '.' ||relname,
relpages,
reltuples
FROM pg_class
JOIN pg_catalog.pg_namespace n
ON n.oid = pg_class.relnamespace
WHERE relname ILIKE 'converting_touchpoints_attributio%';
+-----------------------------------------------------------------------+----------+-----------+
| nspname+relname | relpages | reltuples |
+-----------------------------------------------------------------------+----------+-----------+
| m_dim.converting_touchpoints_attribution__touchpoint_fk_performance_a | 84153 | 24282896 |
| m_dim.converting_touchpoints_attribution___day_id_performance_attribu | 84153 | 24282896 |
| m_dim.converting_touchpoints_attribution__touchpoint_fk | 6 | 1026 |
| m_dim.converting_touchpoints_attribution__performance_attribution_mod | 5 | 1026 |
| m_dim.converting_touchpoints_attribution | 131259 | 24282892 |
+-----------------------------------------------------------------------+----------+-----------+
5 rows in set.
On a good system it looks like this:
+-----------------------------------------------------------------------+----------+-----------+
| nspname+relname | relpages | reltuples |
+-----------------------------------------------------------------------+----------+-----------+
| m_dim.converting_touchpoints_attribution___day_id_performance_attribu | 84656 | 24428338 |
| m_dim.converting_touchpoints_attribution | 132046 | 24428338 |
| m_dim.converting_touchpoints_attribution__touchpoint_fk_performance_a | 84656 | 24428338 |
| m_dim.converting_touchpoints_attribution__touchpoint_fk | 6 | 1032 |
| m_dim.converting_touchpoints_attribution__performance_attribution_mod | 5 | 1032 |
| m_tmp.converting_touchpoints_attribution | 132046 | 24428338 |
+-----------------------------------------------------------------------+----------+-----------+
6 rows in set.
-> Seems like no real difference (the data has a few days differences now + m_tmp.converting_touchpoints_attribution is an UNLOGGED table so won't survive the crash)
Best regards,
Jan
--
Jan Schulz
mail: jasc@gmx.net
>
> regards, tom lane
[resent to the list, sorry Alvaro] Hello, On 5 December 2017 at 13:39, Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > it seems pretty likely that these are indeed overflows. However, > because of the whitespace mangling these plans are pretty much > impossible to read. Would you mind pasting these again, as text/plain > attachment instead? I attached the plans in plain text. > Does the query actually take 6ms in the system with the funny explain, > and 5s on the other? I didn't really noticed the timings, could be. I was more surprised that the broken plan actually resulted in zero rows because it should have had some. Kind regards, Jan -- Jan Schulz mail: jasc@gmx.net
Attachment
Jan Schulz <jasc@gmx.net> writes: > On 5 December 2017 at 16:10, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Can you get a similarly broken plan if you try something involving just >> this table, say "select * from converting_touchpoints_attribution where >> touchpoint_fk = 42 and performance_attribution_model_fk = 2" ? > EXPLAIN ( ANALYSE, BUFFERS ) > select * from m_dim.converting_touchpoints_attribution where > touchpoint_fk = 42 and performance_attribution_model_fk = 2 > Bitmap Heap Scan on converting_touchpoints_attribution > (cost=-25769803761.69..-25769803757.67 rows=1 width=16) (actual > time=0.005..0.005 rows=0 loops=1) > Recheck Cond: ((performance_attribution_model_fk = 2) AND (touchpoint_fk > = 42)) > -> BitmapAnd (cost=-25769803761.69..-25769803761.69 rows=1 width=0) > (actual time=0.004..0.004 rows=0 loops=1) > -> Bitmap Index Scan on > converting_touchpoints_attribution__performance_attribution_mod > (cost=0.00..-12884901880.97 rows=1 width=0) (actual > time=0.003..0.003 rows=0 loops=1) > Index Cond: (performance_attribution_model_fk = 2) > -> Bitmap Index Scan on > converting_touchpoints_attribution__touchpoint_fk > (cost=0.00..-12884901880.97 rows=1 width=0) (never executed) > Index Cond: (touchpoint_fk = 42) > Planning time: 0.122 ms > Execution time: 0.036 ms Hmph. That just raises even more questions --- for instance, why is the condition (performance_attribution_model_fk = 2) now estimated to select just 1 row, when previously it was estimated to select 1850 rows? Still, the fact that you can get a silly answer with just one table does eliminate some theories I'd been toying with. The index cost estimation code does have some potential for garbage-in-garbage-out results, but nothing very promising. One question is whether these indexes are on a non-default tablespace with a non-default random_page_cost. PG should prevent you from putting in a negative random_page_cost, but it's worth checking that. I looked at some other possibilities like a corrupted tree_height value, but none of them seem to fit the available info. For instance, the tree_height is only an int, so even if it were the max negative value it would not explain the cost value you're getting. Don't suppose you'd like to step through btcostestimate() with a debugger and see where it's going off the rails? regards, tom lane
Hello Tom, On 5 December 2017 at 22:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: > The index cost estimation code does have some potential for > garbage-in-garbage-out results, but nothing very promising. > One question is whether these indexes are on a non-default > tablespace with a non-default random_page_cost. PG should > prevent you from putting in a negative random_page_cost, but > it's worth checking that. How do I check? > I looked at some other possibilities like a corrupted tree_height > value, but none of them seem to fit the available info. For > instance, the tree_height is only an int, so even if it were the > max negative value it would not explain the cost value you're getting. I must confess that I have no clue what you are talking about. > Don't suppose you'd like to step through btcostestimate() with > a debugger and see where it's going off the rails? I've never done debugging of a C program. If you give me some hints how to do that, I might be able to (Ubuntu, PG from the apt repository). Is there any debug log I might need to enable to get some similar information? Best regards, Jan -- Jan Schulz mail: jasc@gmx.net
Hello, just some update on "how" that happend. Our current hypothesis: * We have one parallel job was growing so big that postgresql consumed too much memory (we use 'work_mem = 2GB'). This job is part of a process which creates a 'm_dim_next' schema which in the end would be switched to 'm_dim'. (note that the old 'm_dim' schema is not written to during the whole process which creates m_dim_next, it gets dropped after the schema switch in the last step in that process: m_dim -> m_dim_old, m_dim_next -> m_dim, drop m_dim_old) * The OOM killer killed postgresql (please note that we have configured postgres with almost no data security) * This in turn would result "somehow" in some funny data/statistics/whatever on the table in m_dim * This in turn would result in wrong plans which in turn would result in OOM when processes run which touched the table in m_dim So our current understanding is that when we fix the memory issue in the first step we will also not anymore have problems in the other processes. We are currently looking into that. Best regards, Jan -- Jan Schulz mail: jasc@gmx.net web: http://www.katzien.de
Jan Schulz <jasc@gmx.net> writes: > just some update on "how" that happend. Our current hypothesis: > * We have one parallel job was growing so big that postgresql consumed > too much memory (we use 'work_mem = 2GB'). This job is part of a > process which creates a 'm_dim_next' schema which in the end would be > switched to 'm_dim'. (note that the old 'm_dim' schema is not written > to during the whole process which creates m_dim_next, it gets dropped > after the schema switch in the last step in that process: m_dim -> > m_dim_old, m_dim_next -> m_dim, drop m_dim_old) > * The OOM killer killed postgresql (please note that we have > configured postgres with almost no data security) > * This in turn would result "somehow" in some funny > data/statistics/whatever on the table in m_dim > * This in turn would result in wrong plans which in turn would result > in OOM when processes run which touched the table in m_dim I'm not entirely convinced by this theory. In the first place, an OOM kill shouldn't result in data corruption, no matter how you have your installation configured. You can turn off things that might result in corruption after a power outage or other operating-system-level crash, but not a process-level crash. (Or at least that's the theory; there could always be bugs of course. But we developers crash Postgres pretty regularly ;-), and we don't see corruption from that.) In the second place, even if there were something wrong in pg_statistic, that should manifest as a bogus rowcount estimate, which we're not seeing in this EXPLAIN output. The calculation that produces a cost estimate given a rowcount estimate just doesn't have all that many other inputs, which is why I didn't have very many theories about what could be wrong. It's puzzling ... regards, tom lane
Hello, in the meantime we managed to not trigger the OOM kill anymore and now also don't have this problem with the negative costs anymore. We still have one staging environment where this error is present, but we would like to reclaim it. Is there some guidelines I can follow to debug this? Jan On 7 December 2017 at 10:54, Jan Schulz <jasc@gmx.net> wrote: > On 5 December 2017 at 22:30, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Don't suppose you'd like to step through btcostestimate() with >> a debugger and see where it's going off the rails? > I've never done debugging of a C program. If you give me some hints how to > do that, I might be able to (Ubuntu, PG from the apt repository). > Is there any debug log I might need to enable to get some similar information?
On 12/12/2017 02:54 PM, Jan Schulz wrote: > Hello, > > in the meantime we managed to not trigger the OOM kill anymore and now > also don't have this problem with the negative costs anymore. We still > have one staging environment where this error is present, but we would > like to reclaim it. Is there some guidelines I can follow to debug > this? > I think you're probably looking for this: https://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD Once you have GDB attached to the process, you can set breakpoints on interesting places. I believe the most interesting function to inspect is btcostestimate, which is what produces estimates for the bitmap index scans: https://github.com/postgres/postgres/blob/master/src/backend/utils/adt/selfuncs.c#L6880 In particular, we're interested in parameter values passed to the function, when it produces negative indexTotalCost (..-12884901880.97). The value is updated on multiple places in the function, so you'll need to step through and see at which point it gets negative. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services