Postgres chooses slow query plan from time to time - Mailing list pgsql-performance

From Kristjan Mustkivi
Subject Postgres chooses slow query plan from time to time
Date
Msg-id CAOQPKavhdTrgQo8Uss0M+K0Du+_WYmdoGyMpC0hJQkKM+-X25Q@mail.gmail.com
Whole thread Raw
Responses Re: Postgres chooses slow query plan from time to time  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Re: Postgres chooses slow query plan from time to time  (Jeff Janes <jeff.janes@gmail.com>)
Re: Postgres chooses slow query plan from time to time  (Jeff Janes <jeff.janes@gmail.com>)
List pgsql-performance
Dear community,

I have a query that most of the time gets executed in a few
milliseconds yet occasionally takes ~20+ seconds. The difference, as
far as I am able to tell, comes whether it uses the table Primary Key
(fast) or an additional index with smaller size. The table in question
is INSERT ONLY - no updates or deletes done there.

Pg 11.12, total OS mem 124G

shared_buffers: 31GB
work_mem: 27MB
effective_cache_size: 93GB

The query:

SELECT
    *
FROM
    myschema.mytable pbh
WHERE
    pbh.product_code = $1
    AND pbh.cage_player_id = $2
    AND pbh.cage_code = $3
    AND balance_type = $4
    AND pbh.modified_time < $5
ORDER BY
    pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY;

\d myschema.mytable
                    Table "myschema.mytable"
     Column     │            Type             │ Collation │ Nullable │ Default
────────────────┼─────────────────────────────┼───────────┼──────────┼─────────
 cage_code      │ integer                     │           │ not null │
 cage_player_id │ bigint                      │           │ not null │
 product_code   │ character varying(30)       │           │ not null │
 balance_type   │ character varying(30)       │           │ not null │
 version        │ bigint                      │           │ not null │
 modified_time  │ timestamp(3) with time zone │           │ not null │
 amount         │ numeric(38,8)               │           │ not null │
 change         │ numeric(38,8)               │           │ not null │
 transaction_id │ bigint                      │           │ not null │
Indexes:
    "mytable_pk" PRIMARY KEY, btree (cage_code, cage_player_id,
product_code, balance_type, version)
    "mytable_idx1" btree (modified_time)
    "mytable_idx2" btree (cage_code, cage_player_id, modified_time)

SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts,
relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE
relname='mytable';
─[ RECORD 1 ]──┬───────────────────────
relname        │ mytable
relpages       │ 18630554
reltuples      │ 1.45045e+09
relallvisible  │ 18629741
relkind        │ r
relnatts       │ 9
relhassubclass │ f
reloptions     │ ¤
pg_table_size  │ 152695029760 (142 GB)

I have caught this with AUTOEXPLAIN:

Query Text: SELECT *   FROM myschema.mytable pbh WHERE
pbh.product_code = $1   AND pbh.cage_player_id = $2   AND
pbh.cage_code = $3   AND balance_type = $4   AND pbh.modified_time <
$5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY
  Limit  (cost=0.70..6.27 rows=1 width=66)
    ->  Index Scan Backward using mytable_idx2 on mytable pbh
(cost=0.70..21552.55 rows=3869 width=66)
          Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND
(modified_time < $5))
          Filter: (((product_code)::text = ($1)::text) AND
((balance_type)::text = ($4)::text))

And when I run EXPLAIN ANALYZE on the same query with the same
parameters manually:

 Limit  (cost=177.75..177.75 rows=1 width=66) (actual
time=8.635..8.635 rows=1 loops=1)
   ->  Sort  (cost=177.75..178.21 rows=186 width=66) (actual
time=8.634..8.634 rows=1 loops=1)
         Sort Key: modified_time DESC
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using mytable_pk on mytable pbh
(cost=0.70..176.82 rows=186 width=66) (actual time=1.001..8.610
rows=25 loops=1)
               Index Cond: ((cage_code = 123) AND (cage_player_id =
'12345'::bigint) AND ((product_code)::text = 'PRODUCT'::text) AND
((balance_type)::text = 'TOTAL'::text))
               Filter: (modified_time < '2021-09-13
04:00:00+00'::timestamp with time zone)
 Planning Time: 2.117 ms
 Execution Time: 8.658 ms

I have played around with SET STATISTICS, work_mem and even tried
CREATE STATISTICS although there is no functional dependency on the
table columns in questions, but nothing seems to work.

Any ideas, hints are very much appreciated!


With best regards,
--
Kristjan Mustkivi

Email: kristjan.mustkivi@gmail.com



pgsql-performance by date:

Previous
From: Julien Rouhaud
Date:
Subject: Re: sql execution time in pg_stat_statements
Next
From: Tomas Vondra
Date:
Subject: Re: Postgres chooses slow query plan from time to time