analyze causes query planner to choose suboptimal plan for a selectquery in separate transaction - Mailing list pgsql-general

Hi group,

I have a Jenkins build server which runs integration tests of a Java application against PostgreSQL 11.2. The tests run in serial fashion and are reproducible because each test runs in separate DB transaction which is rolled back at the end of each test. Each full build (with 1400+ tests) takes approx 1h:45m.

My problem is that sometimes the build gets stuck (for several hours, often 10+) in random test waiting for random select query with postgres process consuming 100% CPU (1 core).
Well, at least I thought it was random, then I figured out it was triggered by autovacuum daemon. The problem is not the vacuum process itself, but in the analyze part that runs together with autovacuum.

To illustrate my situation let's consider my tests look like this:

BEGIN;

-- A: insert data for the test

-- B: i'll refer to this point later

-- C: select(s)

ROLLBACK;

Everything is fine, until autovacuum (analyze) runs when the test is at point B. After that the query planner at point C chooses wrong plan and the query takes a long time to complete, blocking one CPU core for a long time. It seems like the planner statistics inside running transaction are affected by analyze task running outside of the transaction. In this case after running analyze (outside the transaction) when the transaction is at B, causes query planner to think there are no rows (because the inserts at point A were not yet committed).

I did prepare a simple test case to reproduce this behavior:

First you need to create a table:

create table a (id bigint primary key);

Then run this transaction:

begin;
insert into a
select * from generate_series(1, 1000);

-- during sleep execute analyze on this db in separate connection
select pg_sleep(10);

explain analyze select count(*) from (
select distinct a1, a2, a3, a4 from a a1
left join a a2 on a1.id > 900
left join a a3 on a2.id = a3.id
left join a a4 on a3.id = a4.id
left join a a5 on a5.id = a4.id
left join a a6 on a6.id = a5.id
left join a a7 on a7.id = a6.id
left join a a8 on a8.id = a7.id) temp;

rollback;


The query plan for the select is as follows and the query takes 865ms to finish.

                                                                                QUERY PLAN                                                                                
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=1987304286213.68..1987304286213.69 rows=1 width=8) (actual time=862.406..862.407 rows=1 loops=1)
   ->  Unique  (cost=1942976891422.08..1987284286213.68 rows=1600000000 width=128) (actual time=783.281..858.261 rows=100900 loops=1)
         ->  Sort  (cost=1942976891422.08..1951838370380.40 rows=3544591583328 width=128) (actual time=783.280..825.055 rows=100900 loops=1)
               Sort Key: a1.*, a2.*, a3.*, a4.*
               Sort Method: external merge  Disk: 12368kB
               ->  Merge Right Join  (cost=5497387.28..53175003562.02 rows=3544591583328 width=128) (actual time=171.612..196.342 rows=100900 loops=1)
                     Merge Cond: (a4.id = a3.id)
                     ->  Merge Left Join  (cost=449798.05..1010691.54 rows=36848656 width=40) (actual time=8.257..8.815 rows=1000 loops=1)
                           Merge Cond: (a4.id = a5.id)
                           ->  Sort  (cost=158.51..164.16 rows=2260 width=40) (actual time=1.174..1.241 rows=1000 loops=1)
                                 Sort Key: a4.id
                                 Sort Method: quicksort  Memory: 103kB
                                 ->  Seq Scan on a a4  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.040..0.639 rows=1000 loops=1)
                           ->  Materialize  (cost=449639.54..465944.26 rows=3260943 width=8) (actual time=7.077..7.254 rows=1000 loops=1)
                                 ->  Sort  (cost=449639.54..457791.90 rows=3260943 width=8) (actual time=7.066..7.148 rows=1000 loops=1)
                                       Sort Key: a5.id
                                       Sort Method: quicksort  Memory: 71kB
                                       ->  Merge Right Join  (cost=2897.82..52270.24 rows=3260943 width=8) (actual time=4.352..6.630 rows=1000 loops=1)
                                             Merge Cond: (a7.id = a6.id)
                                             ->  Merge Left Join  (cost=317.01..711.38 rows=25538 width=8) (actual time=1.236..2.482 rows=1000 loops=1)
                                                   Merge Cond: (a7.id = a8.id)
                                                   ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.618..0.792 rows=1000 loops=1)
                                                         Sort Key: a7.id
                                                         Sort Method: quicksort  Memory: 71kB
                                                         ->  Seq Scan on a a7  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.017..0.265 rows=1000 loops=1)
                                                   ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.605..0.773 rows=1000 loops=1)
                                                         Sort Key: a8.id
                                                         Sort Method: quicksort  Memory: 71kB
                                                         ->  Seq Scan on a a8  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.024..0.261 rows=1000 loops=1)
                                             ->  Sort  (cost=2580.81..2644.65 rows=25538 width=16) (actual time=3.112..3.278 rows=1000 loops=1)
                                                   Sort Key: a6.id
                                                   Sort Method: quicksort  Memory: 71kB
                                                   ->  Merge Left Join  (cost=317.01..711.38 rows=25538 width=16) (actual time=1.386..2.653 rows=1000 loops=1)
                                                         Merge Cond: (a5.id = a6.id)
                                                         ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.697..0.870 rows=1000 loops=1)
                                                               Sort Key: a5.id
                                                               Sort Method: quicksort  Memory: 71kB
                                                               ->  Seq Scan on a a5  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.026..0.303 rows=1000 loops=1)
                                                         ->  Sort  (cost=158.51..164.16 rows=2260 width=8) (actual time=0.681..0.851 rows=1000 loops=1)
                                                               Sort Key: a6.id
                                                               Sort Method: quicksort  Memory: 71kB
                                                               ->  Seq Scan on a a6  (cost=0.00..32.60 rows=2260 width=8) (actual time=0.018..0.286 rows=1000 loops=1)
                     ->  Materialize  (cost=5047589.24..5143782.35 rows=19238623 width=104) (actual time=163.351..175.058 rows=100900 loops=1)
                           ->  Sort  (cost=5047589.24..5095685.80 rows=19238623 width=104) (actual time=163.343..167.508 rows=100900 loops=1)
                                 Sort Key: a3.id
                                 Sort Method: quicksort  Memory: 16766kB
                                 ->  Merge Right Join  (cost=322881.15..615728.13 rows=19238623 width=104) (actual time=119.338..144.270 rows=100900 loops=1)
                                       Merge Cond: (a3.id = a2.id)
                                       ->  Sort  (cost=158.51..164.16 rows=2260 width=40) (actual time=1.153..1.230 rows=1000 loops=1)
                                             Sort Key: a3.id
                                             Sort Method: quicksort  Memory: 103kB
                                             ->  Seq Scan on a a3  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.039..0.629 rows=1000 loops=1)
                                       ->  Materialize  (cost=322722.64..331235.31 rows=1702533 width=72) (actual time=118.179..131.501 rows=100900 loops=1)
                                             ->  Sort  (cost=322722.64..326978.97 rows=1702533 width=72) (actual time=118.173..122.691 rows=100900 loops=1)
                                                   Sort Key: a2.id
                                                   Sort Method: quicksort  Memory: 16780kB
                                                   ->  Nested Loop Left Join  (cost=0.00..76684.85 rows=1702533 width=72) (actual time=1.331..100.252 rows=100900 loops=1)
                                                         Join Filter: (a1.id > 900)
                                                         Rows Removed by Join Filter: 900000
                                                         ->  Seq Scan on a a1  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.019..0.292 rows=1000 loops=1)
                                                         ->  Materialize  (cost=0.00..43.90 rows=2260 width=40) (actual time=0.000..0.047 rows=1000 loops=1000)
                                                               ->  Seq Scan on a a2  (cost=0.00..32.60 rows=2260 width=40) (actual time=0.016..0.606 rows=1000 loops=1)
 Planning Time: 2.234 ms
 Execution Time: 865.635 ms
(64 rows)



Now run the transaction again, but this time, run analyze in separate DB connection when the transactions sleeps at select pg_sleep(10).
The query plan is as follows and the select query takes 81s to complete. As you can see the planner greatly underestimated number of rows (1 vs 100900) and chosen nested loop plan which took a long time to finish.

                                                                            QUERY PLAN                                                                           
------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=40.12..40.13 rows=1 width=8) (actual time=81482.190..81482.190 rows=1 loops=1)
   ->  Unique  (cost=40.10..40.11 rows=1 width=128) (actual time=81418.552..81477.211 rows=100900 loops=1)
         ->  Sort  (cost=40.10..40.10 rows=1 width=128) (actual time=81418.551..81440.453 rows=100900 loops=1)
               Sort Key: a1.*, a2.*, a3.*, a4.*
               Sort Method: external merge  Disk: 12376kB
               ->  Nested Loop Left Join  (cost=0.00..40.09 rows=1 width=128) (actual time=1.424..81324.707 rows=100900 loops=1)
                     Join Filter: (a8.id = a7.id)
                     Rows Removed by Join Filter: 100800000
                     ->  Nested Loop Left Join  (cost=0.00..35.08 rows=1 width=136) (actual time=1.271..70436.483 rows=100900 loops=1)
                           Join Filter: (a7.id = a6.id)
                           Rows Removed by Join Filter: 100800000
                           ->  Nested Loop Left Join  (cost=0.00..30.06 rows=1 width=136) (actual time=1.117..59388.674 rows=100900 loops=1)
                                 Join Filter: (a6.id = a5.id)
                                 Rows Removed by Join Filter: 100800000
                                 ->  Nested Loop Left Join  (cost=0.00..25.05 rows=1 width=136) (actual time=0.962..48361.409 rows=100900 loops=1)
                                       Join Filter: (a5.id = a4.id)
                                       Rows Removed by Join Filter: 100800000
                                       ->  Nested Loop Left Join  (cost=0.00..20.04 rows=1 width=136) (actual time=0.807..37348.411 rows=100900 loops=1)
                                             Join Filter: (a3.id = a4.id)
                                             Rows Removed by Join Filter: 100800000
                                             ->  Nested Loop Left Join  (cost=0.00..15.02 rows=1 width=104) (actual time=0.544..18687.670 rows=100900 loops=1)
                                                   Join Filter: (a2.id = a3.id)
                                                   Rows Removed by Join Filter: 100800000
                                                   ->  Nested Loop Left Join  (cost=0.00..10.01 rows=1 width=72) (actual time=0.283..190.095 rows=100900 loops=1)
                                                         Join Filter: (a1.id > 900)
                                                         Rows Removed by Join Filter: 900000
                                                         ->  Seq Scan on a a1  (cost=0.00..5.00 rows=1 width=40) (actual time=0.013..0.352 rows=1000 loops=1)
                                                         ->  Seq Scan on a a2  (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.138 rows=1000 loops=1000)
                                                   ->  Seq Scan on a a3  (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.140 rows=1000 loops=100900)
                                             ->  Seq Scan on a a4  (cost=0.00..5.00 rows=1 width=40) (actual time=0.003..0.141 rows=1000 loops=100900)
                                       ->  Seq Scan on a a5  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
                                 ->  Seq Scan on a a6  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
                           ->  Seq Scan on a a7  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
                     ->  Seq Scan on a a8  (cost=0.00..5.00 rows=1 width=8) (actual time=0.003..0.057 rows=1000 loops=100900)
 Planning Time: 0.522 ms
 Execution Time: 81483.797 ms
(36 rows)



What do you think, is this intended behavior or a potential bug ?

Regards, Martin
Martin Kováčik
CEO
redByte, s.r.o.
+421 904 236 791
kovacik@redbyte.eu, www.redbyte.eu

pgsql-general by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: Does "ON UPDATE" for foreign keys require index?
Next
From: Michael Lewis
Date:
Subject: Re: analyze causes query planner to choose suboptimal plan for aselect query in separate transaction