[PERFORM] Regression from 9.4-9.6 - Mailing list pgsql-performance

From Jim Nasby
Subject [PERFORM] Regression from 9.4-9.6
Date
Msg-id 5d7c0f80-3767-1531-d911-197fcc147173@nasby.net
Whole thread Raw
Responses Re: [PERFORM] Regression from 9.4-9.6  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
I've got a query that's regressed from 9.4 to 9.6. I suspect it has 
something to do with the work done around bad plans from single-row 
estimates. There's a SeqScan happening even though the join is to the PK 
of bd_ident. Full plans are at [1,2,3], but here's the relevant bits...

9.4:
>    ->  Nested Loop Left Join  (cost=1.00..50816.55 rows=1 width=27) (actual time=979.406..3213.286 rows=508 loops=1)
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..50807.51 rows=1 width=36) (actual
time=979.381..3207.777rows=508 loops=1)
 
>                Index Cond: ((filed_departuretime >= '2017-07-20 05:00:00'::timestamp without time zone) AND
(filed_departuretime<= '2017-07-30 04:59:59'::timestamp without time zone))
 
...
>          ->  Index Scan using bd_ident_pkey on bd_ident i  (cost=0.43..4.45 rows=1 width=11) (actual
time=0.006..0.006rows=1 loops=508)
 
>                Index Cond: (bdata_forks.ident_id = id)
>          SubPlan 1
>            ->  Index Scan using bd_airport_pkey on bd_airport  (cost=0.56..4.58 rows=1 width=20) (actual
time=0.003..0.003rows=1 loops=508)
 
>                  Index Cond: (id = bdata_forks.destination_id)

9.6:
>    ->  Nested Loop Left Join  (cost=0.57..14994960.40 rows=1 width=71) (actual time=931.479..327972.891 rows=508
loops=1)
>          Join Filter: (bdata_forks.ident_id = i.id)
>          Rows Removed by Join Filter: 1713127892
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14894236.06 rows=1 width=36)
(actualtime=892.664..3025.653 rows=508 loops=1)
 
...
>          ->  Seq Scan on bd_ident i  (cost=0.00..58566.00 rows=3372300 width=11) (actual time=0.002..280.966
rows=3372300loops=508)               ^^^^^^^^
 
>          SubPlan 1
>            ->  Index Scan using bd_airport_pkey on bd_airport  (cost=0.56..4.58 rows=1 width=20) (actual
time=0.009..0.009rows=1 loops=508)
 
>                  Index Cond: (id = bdata_forks.destination_id)

Altering the predicates somewhat (removing one of the timestamp 
conditions) results in the input to the outer part of the nested loop 
estimating at 326 rows instead of 1, which generates a good plan:

>    ->  Nested Loop Left Join  (cost=1.00..14535906.91 rows=326 width=71) (actual time=23.670..4558.273 rows=3543
loops=1)
>          ->  Index Scan using bdata_filed_departuretime on bdata_forks  (cost=0.57..14532973.05 rows=326 width=36)
(actualtime=23.647..4522.428 rows=3543 loops=1)                          ^^^^^^^^
 
...
>          ->  Index Scan using bd_ident_pkey on bd_ident i  (cost=0.43..4.40 rows=1 width=11) (actual
time=0.005..0.006rows=1 loops=3543)
 
>                Index Cond: (bdata_forks.ident_id = id)
>          SubPlan 1
>            ->  Index Scan using bd_airport_pkey on bd_airport  (cost=0.56..4.58 rows=1 width=20) (actual
time=0.003..0.003rows=1 loops=3543)
 
>                  Index Cond: (id = bdata_forks.destination_id)

1: https://explain.depesz.com/s/2A90
2: https://explain.depesz.com/s/jKdr
3: https://explain.depesz.com/s/nFh
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


-- 
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

pgsql-performance by date:

Previous
From: David Rowley
Date:
Subject: Re: [PERFORM] How does max_parallel_workers_per_gather change load averages?
Next
From: Tom Lane
Date:
Subject: Re: [PERFORM] Regression from 9.4-9.6