Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE - Mailing list pgsql-bugs

From: Jan Kort
Subject: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE
Date:
Msg-id: AM0PR0502MB3620D231418BE0B7B8A9E3E09B2C9@AM0PR0502MB3620.eurprd05.prod.outlook.com
Responses: Re: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE (Pavel Stehule <pavel.stehule@gmail.com>)
  Re: Query with straightforward plan changes and becomes 6520 times slower after VACUUM ANALYZE (David Rowley <dgrowleyml@gmail.com>)
List: pgsql-bugs

Hi,

 

I tried this on both PostgreSQL 12.7 and 13.3, with both VACUUM ANALYZE and VACUUM FULL ANALYZE; either one changes a straightforward plan into a reversed, complex plan that makes the query slow. The only thing that helps is a pg_dump + pg_restore, but that is not feasible.

 

It seems to depend on the specific data: I tried generating other variations of around 1M records, but so far the attached DB (also generated) is the only one where I could reproduce it. I can't figure out what is so special about this data.

 

Given the following record counts:

 

GFO_ZAKEN: 1048587

GFO_ZAKEN_KOSTEN: 3

GFO_ZAKEN_TYPECODE: 4

GFO_ZAKEN.ZAAKTYPECODE_ID is non-NULL in 9 rows; the rest are NULL

 

When I run this query:

 

EXPLAIN ANALYZE
SELECT GFO_ZAKEN_KOSTEN.ID AS GFO_ZAKEN_KOSTEN_ID,
       GFO_ZAKEN.ID AS GFO_ZAKEN_ID,
       GFO_ZAKEN_TYPECODE.OMSCHRIJVING AS ZAAKTYPECODE_ID
  FROM TRIAL.GFO_ZAKEN_KOSTEN
  JOIN TRIAL.GFO_ZAKEN ON GFO_ZAKEN_KOSTEN.GFO_ZAKEN_ID = GFO_ZAKEN.ID
  JOIN TRIAL.GFO_ZAKEN_TYPECODE ON GFO_ZAKEN.ZAAKTYPECODE_ID = GFO_ZAKEN_TYPECODE.ID

 

It takes 500 ms or so, which I associate with a full table scan, even though there are just simple referential constraints and corresponding indexes:

 

CREATE TABLE TRIAL.GFO_ZAKEN_TYPECODE (ID INTEGER PRIMARY KEY, OMSCHRIJVING CHARACTER VARYING(4000));
CREATE TABLE TRIAL.GFO_ZAKEN (ID INTEGER PRIMARY KEY, ZAAKTYPECODE_ID INTEGER,
  CONSTRAINT ZAAKTYPECODE_IDC1 FOREIGN KEY (ZAAKTYPECODE_ID) REFERENCES TRIAL.GFO_ZAKEN_TYPECODE (ID));
CREATE INDEX GFO_ZAKENO18 ON TRIAL.GFO_ZAKEN USING BTREE (ZAAKTYPECODE_ID);
CREATE TABLE TRIAL.GFO_ZAKEN_KOSTEN (ID INTEGER PRIMARY KEY, GFO_ZAKEN_ID INTEGER,
  CONSTRAINT GFO_ZAKEN_IDC1 FOREIGN KEY (GFO_ZAKEN_ID) REFERENCES TRIAL.GFO_ZAKEN (ID));
CREATE INDEX GFO_ZAKEN_KOSTENO14 ON TRIAL.GFO_ZAKEN_KOSTEN USING BTREE (GFO_ZAKEN_ID ASC NULLS LAST);
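For reference, data matching the reported counts can be approximated with generate_series. This is only a sketch; the attached DB was generated differently, and the specifics below (which 9 rows get a type code, which values they get, the kosten IDs) are assumptions:

```sql
-- Sketch: load data matching the reported counts:
-- 4 type codes, 1048587 zaken (9 with a non-NULL ZAAKTYPECODE_ID),
-- and 3 kosten rows (IDs 11..13, so "id = 13" exists).
INSERT INTO TRIAL.GFO_ZAKEN_TYPECODE (ID, OMSCHRIJVING)
SELECT g, 'type ' || g FROM generate_series(1, 4) AS g;

INSERT INTO TRIAL.GFO_ZAKEN (ID, ZAAKTYPECODE_ID)
SELECT g, CASE WHEN g <= 9 THEN 1 + (g % 4) ELSE NULL END
  FROM generate_series(1, 1048587) AS g;

INSERT INTO TRIAL.GFO_ZAKEN_KOSTEN (ID, GFO_ZAKEN_ID)
SELECT 10 + g, g FROM generate_series(1, 3) AS g;
```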

 

After pg_restore it gives a straightforward plan, starting with the gfo_zaken_kosten primary key and continuing with the join on gfo_zaken_kosten.gfo_zaken_id:

 

Nested Loop  (cost=0.56..3.64 rows=1 width=524) (actual time=0.036..0.037 rows=1 loops=1)
  ->  Nested Loop  (cost=0.43..3.48 rows=1 width=12) (actual time=0.030..0.030 rows=1 loops=1)
        ->  Seq Scan on gfo_zaken_kosten  (cost=0.00..1.04 rows=1 width=8) (actual time=0.016..0.016 rows=1 loops=1)
              Filter: (id = 13)
              Rows Removed by Filter: 2
        ->  Index Scan using gfo_zakenp on gfo_zaken  (cost=0.43..2.45 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1)
              Index Cond: (id = gfo_zaken_kosten.gfo_zaken_id)
  ->  Index Scan using gfo_zaken_typecodep on gfo_zaken_typecode  (cost=0.13..0.15 rows=1 width=520) (actual time=0.005..0.005 rows=1 loops=1)
        Index Cond: (id = gfo_zaken.zaaktypecode_id)
Planning Time: 1.538 ms
Execution Time: 0.095 ms

 

After VACUUM ANALYZE the plan becomes inefficient again: it no longer starts with the gfo_zaken_kosten primary key, but at the wrong end, with an index scan over the 1M rows:

 

Merge Join  (cost=1.48..1.59 rows=1 width=159) (actual time=619.374..619.376 rows=1 loops=1)
  Merge Cond: (gfo_zaken.id = gfo_zaken_kosten.gfo_zaken_id)
  ->  Nested Loop  (cost=0.43..96503.47 rows=1048587 width=155) (actual time=0.022..619.359 rows=9 loops=1)
        Join Filter: (gfo_zaken.zaaktypecode_id = gfo_zaken_typecode.id)
        Rows Removed by Join Filter: 4194316
        ->  Index Scan using gfo_zakenp on gfo_zaken  (cost=0.43..33587.23 rows=1048587 width=8) (actual time=0.006..141.167 rows=1048587 loops=1)
        ->  Materialize  (cost=0.00..1.06 rows=4 width=155) (actual time=0.000..0.000 rows=4 loops=1048587)
              ->  Seq Scan on gfo_zaken_typecode  (cost=0.00..1.04 rows=4 width=155) (actual time=0.011..0.012 rows=4 loops=1)
  ->  Sort  (cost=1.05..1.05 rows=1 width=8) (actual time=0.012..0.012 rows=1 loops=1)
        Sort Key: gfo_zaken_kosten.gfo_zaken_id
        Sort Method: quicksort  Memory: 25kB
        ->  Seq Scan on gfo_zaken_kosten  (cost=0.00..1.04 rows=1 width=8) (actual time=0.008..0.008 rows=1 loops=1)
              Filter: (id = 13)
              Rows Removed by Filter: 2
Planning Time: 69.151 ms
Execution Time: 619.410 ms

 

So it is roughly 6520 times slower after VACUUM ANALYZE than before (619.410 ms vs. 0.095 ms).

 

Running VACUUM ANALYZE again doesn't make it better; only pg_dump + pg_restore brings back the original plan, and then it breaks again on the first VACUUM ANALYZE.
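As a session-level diagnostic (not a fix), one can check whether the planner picks a better plan with the merge join disabled, and inspect what ANALYZE recorded for the join column. Both use standard PostgreSQL facilities; the actual output for this DB is of course unknown:

```sql
-- Diagnostic sketch only: steer the planner away from the merge join
SET enable_mergejoin = off;
EXPLAIN ANALYZE SELECT ...;  -- re-run the query from above

-- See the statistics ANALYZE gathered for the join column
SELECT null_frac, n_distinct, most_common_vals
  FROM pg_stats
 WHERE schemaname = 'trial'
   AND tablename = 'gfo_zaken'
   AND attname = 'zaaktypecode_id';
```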

 

I attached an 800K test DB with all sensitive data removed.

 

I tried both the default config without changes and the default config with our settings appended:

 

max_connections = 1000
shared_buffers = 512MB
effective_cache_size = 6GB
work_mem = 10485kB
maintenance_work_mem = 512MB
min_wal_size = 1GB
max_wal_size = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1
wal_sync_method = open_datasync
fsync = on
synchronous_commit = off

 

Doing a VACUUM ANALYZE shouldn’t change a straightforward plan.

 

Regards,

 

Jan Kort

 

 
