Thread: Postgres chooses slow query plan from time to time
Dear community, I have a query that most of the time gets executed in a few milliseconds yet occasionally takes ~20+ seconds. The difference, as far as I am able to tell, comes whether it uses the table Primary Key (fast) or an additional index with smaller size. The table in question is INSERT ONLY - no updates or deletes done there. Pg 11.12, total OS mem 124G shared_buffers: 31GB work_mem: 27MB effective_cache_size: 93GB The query: SELECT * FROM myschema.mytable pbh WHERE pbh.product_code = $1 AND pbh.cage_player_id = $2 AND pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time < $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY; \d myschema.mytable Table "myschema.mytable" Column │ Type │ Collation │ Nullable │ Default ────────────────┼─────────────────────────────┼───────────┼──────────┼───────── cage_code │ integer │ │ not null │ cage_player_id │ bigint │ │ not null │ product_code │ character varying(30) │ │ not null │ balance_type │ character varying(30) │ │ not null │ version │ bigint │ │ not null │ modified_time │ timestamp(3) with time zone │ │ not null │ amount │ numeric(38,8) │ │ not null │ change │ numeric(38,8) │ │ not null │ transaction_id │ bigint │ │ not null │ Indexes: "mytable_pk" PRIMARY KEY, btree (cage_code, cage_player_id, product_code, balance_type, version) "mytable_idx1" btree (modified_time) "mytable_idx2" btree (cage_code, cage_player_id, modified_time) SELECT relname, relpages, reltuples, relallvisible, relkind, relnatts, relhassubclass, reloptions, pg_table_size(oid) FROM pg_class WHERE relname='mytable'; ─[ RECORD 1 ]──┬─────────────────────── relname │ mytable relpages │ 18630554 reltuples │ 1.45045e+09 relallvisible │ 18629741 relkind │ r relnatts │ 9 relhassubclass │ f reloptions │ ¤ pg_table_size │ 152695029760 (142 GB) I have caught this with AUTOEXPLAIN: Query Text: SELECT * FROM myschema.mytable pbh WHERE pbh.product_code = $1 AND pbh.cage_player_id = $2 AND pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time < $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY Limit (cost=0.70..6.27 rows=1 width=66) -> Index Scan Backward using mytable_idx2 on mytable pbh (cost=0.70..21552.55 rows=3869 width=66) Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND (modified_time < $5)) Filter: (((product_code)::text = ($1)::text) AND ((balance_type)::text = ($4)::text)) And when I run EXPLAIN ANALYZE on the same query with the same parameters manually: Limit (cost=177.75..177.75 rows=1 width=66) (actual time=8.635..8.635 rows=1 loops=1) -> Sort (cost=177.75..178.21 rows=186 width=66) (actual time=8.634..8.634 rows=1 loops=1) Sort Key: modified_time DESC Sort Method: top-N heapsort Memory: 25kB -> Index Scan using mytable_pk on mytable pbh (cost=0.70..176.82 rows=186 width=66) (actual time=1.001..8.610 rows=25 loops=1) Index Cond: ((cage_code = 123) AND (cage_player_id = '12345'::bigint) AND ((product_code)::text = 'PRODUCT'::text) AND ((balance_type)::text = 'TOTAL'::text)) Filter: (modified_time < '2021-09-13 04:00:00+00'::timestamp with time zone) Planning Time: 2.117 ms Execution Time: 8.658 ms I have played around with SET STATISTICS, work_mem and even tried CREATE STATISTICS although there is no functional dependency on the table columns in questions, but nothing seems to work. Any ideas, hints are very much appreciated! With best regards, -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com
On 9/13/21 3:24 PM, Kristjan Mustkivi wrote: > Dear community, > > I have a query that most of the time gets executed in a few > milliseconds yet occasionally takes ~20+ seconds. The difference, as > far as I am able to tell, comes whether it uses the table Primary Key > (fast) or an additional index with smaller size. The table in question > is INSERT ONLY - no updates or deletes done there. > It'd be really useful to have explain analyze for the slow execution. My guess is there's a poor estimate, affecting some of the parameter values, and it probably resolves itself after autoanalyze run. I see you mentioned SET STATISTICS, so you tried increasing the statistics target for some of the columns? Have you tried lowering autovacuum_analyze_scale_factor to make autoanalyze more frequent? It's also possible most values are independent, but some values have a rather strong dependency, skewing the estimates. The MCV would help with that, but those are in PG12 :-( regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Autovacuum will only run for freezing, right? Insert only tables don't get autovacuumed/analyzed until PG13 if I remember right.
On Mon, Sep 13, 2021 at 9:25 AM Kristjan Mustkivi <sonicmonkey@gmail.com> wrote:
SELECT
*
FROM
myschema.mytable pbh
WHERE
pbh.product_code = $1
AND pbh.cage_player_id = $2
AND pbh.cage_code = $3
AND balance_type = $4
AND pbh.modified_time < $5
ORDER BY
pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY;
"mytable_idx2" btree (cage_code, cage_player_id, modified_time)
Why does this index exist? It seems rather specialized, but what is it specialized for?
If you are into specialized indexes, the ideal index for this query would be:
btree (cage_code, cage_player_id, product_code, balance_type, modified_time)
But the first 4 columns can appear in any order if that helps you combine indexes. If this index existed, then it wouldn't have to choose between two other suboptimal indexes, and so would be unlikely to choose incorrectly between them.
Cheers,
Jeff
On Mon, Sep 13, 2021 at 9:25 AM Kristjan Mustkivi <sonicmonkey@gmail.com> wrote:
I have caught this with AUTOEXPLAIN:
Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND
(modified_time < $5))
Filter: (((product_code)::text = ($1)::text) AND
((balance_type)::text = ($4)::text))
Is it always the case that autoexplain shows plans with $1 etc, rather than real values, for the slow queries?
If so, then it could be that the switch from custom to generic plans is causing the problem.
Cheers,
Jeff
Hello Tomas, The auto explain analyze caught this: 2021-09-14 06:55:33 UTC, pid=12345 db=mydb, usr=myuser, client=ip, app=PostgreSQL JDBC Driver, line=55 LOG: duration: 5934.165 ms plan: Query Text: SELECT * FROM myschema.mytable pbh WHERE pbh.product_code = $1 AND pbh.cage_player_id = $2 AND pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time < $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY Limit (cost=0.70..6.27 rows=1 width=66) (actual time=5934.154..5934.155 rows=1 loops=1) Buffers: shared hit=7623 read=18217 -> Index Scan Backward using mytable_idx2 on mytable pbh (cost=0.70..21639.94 rows=3885 width=66) (actual time=5934.153..5934.153 rows=1 loops=1) Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND (modified_time < $5)) So it expected to get 3885 rows, but got just 1. So this is the statistics issue, right? For testing, I set autovacuum_vacuum_scale_factor = 0.0 and autovacuum_vacuum_threshold = 10000 for the table and am now monitoring the behavior. Best regards, Kristjan On Mon, Sep 13, 2021 at 4:50 PM Tomas Vondra <tomas.vondra@enterprisedb.com> wrote: > > On 9/13/21 3:24 PM, Kristjan Mustkivi wrote: > > Dear community, > > > > I have a query that most of the time gets executed in a few > > milliseconds yet occasionally takes ~20+ seconds. The difference, as > > far as I am able to tell, comes whether it uses the table Primary Key > > (fast) or an additional index with smaller size. The table in question > > is INSERT ONLY - no updates or deletes done there. > > > > It'd be really useful to have explain analyze for the slow execution. > > My guess is there's a poor estimate, affecting some of the parameter > values, and it probably resolves itself after autoanalyze run. > > I see you mentioned SET STATISTICS, so you tried increasing the > statistics target for some of the columns? Have you tried lowering > autovacuum_analyze_scale_factor to make autoanalyze more frequent? > > It's also possible most values are independent, but some values have a > rather strong dependency, skewing the estimates. The MCV would help with > that, but those are in PG12 :-( > > > regards > > -- > Tomas Vondra > EnterpriseDB: http://www.enterprisedb.com > The Enterprise PostgreSQL Company -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com
Hi Jeff, The specialized index is present due to some other queries and the index is used frequently (according to the statistics). I do agree that in this particular case the index btree (cage_code, cage_player_id, product_code, balance_type, modified_time) would solve the problem but at the moment it is not possible to change that without unexpected consequences (this odd behavior manifests only in one of our sites). I will try if more aggressive autovacuum analyze will alleviate the case as Tomas Vondra suggested. Thank you for the help! Kristjan On Mon, Sep 13, 2021 at 10:21 PM Jeff Janes <jeff.janes@gmail.com> wrote: > > On Mon, Sep 13, 2021 at 9:25 AM Kristjan Mustkivi <sonicmonkey@gmail.com> wrote: > >> >> SELECT >> * >> FROM >> myschema.mytable pbh >> WHERE >> pbh.product_code = $1 >> AND pbh.cage_player_id = $2 >> AND pbh.cage_code = $3 >> AND balance_type = $4 >> AND pbh.modified_time < $5 >> ORDER BY >> pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY; > > >> >> "mytable_idx2" btree (cage_code, cage_player_id, modified_time) > > > Why does this index exist? It seems rather specialized, but what is it specialized for? > > If you are into specialized indexes, the ideal index for this query would be: > > btree (cage_code, cage_player_id, product_code, balance_type, modified_time) > > But the first 4 columns can appear in any order if that helps you combine indexes. If this index existed, then it wouldn'thave to choose between two other suboptimal indexes, and so would be unlikely to choose incorrectly between them. > > Cheers, > > Jeff -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com
On Tue, 2021-09-14 at 10:55 +0300, Kristjan Mustkivi wrote: > 2021-09-14 06:55:33 UTC, pid=12345 db=mydb, usr=myuser, client=ip, > app=PostgreSQL JDBC Driver, line=55 LOG: duration: 5934.165 ms plan: > Query Text: SELECT * FROM myschema.mytable pbh WHERE > pbh.product_code = $1 AND pbh.cage_player_id = $2 AND > pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time < > $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY > Limit (cost=0.70..6.27 rows=1 width=66) (actual time=5934.154..5934.155 rows=1 loops=1) > Buffers: shared hit=7623 read=18217 > -> Index Scan Backward using mytable_idx2 on mytable pbh (cost=0.70..21639.94 rows=3885 width=66) (actual time=5934.153..5934.153rows=1 loops=1) > Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND (modified_time < $5)) If it scanned the index for 6 seconds before finding the first result, I'd suspect one of the following: - the index is terribly bloated - there were lots of deleted rows, and the index entries were marked as "dead" - something locked the table for a long time Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On Tue, Sep 14, 2021 at 3:55 AM Kristjan Mustkivi <sonicmonkey@gmail.com> wrote:
Hello Tomas,
The auto explain analyze caught this:
2021-09-14 06:55:33 UTC, pid=12345 db=mydb, usr=myuser, client=ip,
app=PostgreSQL JDBC Driver, line=55 LOG: duration: 5934.165 ms plan:
Query Text: SELECT * FROM myschema.mytable pbh WHERE
pbh.product_code = $1 AND pbh.cage_player_id = $2 AND
pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time <
$5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY
Limit (cost=0.70..6.27 rows=1 width=66) (actual
time=5934.154..5934.155 rows=1 loops=1)
Buffers: shared hit=7623 read=18217
-> Index Scan Backward using mytable_idx2 on mytable pbh
(cost=0.70..21639.94 rows=3885 width=66) (actual
time=5934.153..5934.153 rows=1 loops=1)
Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND
(modified_time < $5))
So it expected to get 3885 rows, but got just 1. So this is the
statistics issue, right?
That would be true if there were no LIMIT. But with the LIMIT, all this means is that it stopped actually scanning after it found one row, but it estimates that if it didn't stop it would have found 3885. So it is not very informative. But the above plan appears incomplete, there should be a line for "Rows Removed by Filter", and I think that that is what we really want to see in this case.
Cheers,
Jeff
Cheers,
Jeff
I am very sorry, I indeed copy-pasted an incomplete plan. Here it is in full: 2021-09-14 06:55:33 UTC, pid=27576 db=mydb, usr=myuser, client=ip, app=PostgreSQL JDBC Driver, line=55 LOG: duration: 5934.165 ms plan: Query Text: SELECT * FROM myschema.mytable pbh WHERE pbh.product_code = $1 AND pbh.cage_player_id = $2 AND pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time < $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY Limit (cost=0.70..6.27 rows=1 width=66) (actual time=5934.154..5934.155 rows=1 loops=1) Buffers: shared hit=7623 read=18217 -> Index Scan Backward using player_balance_history_idx2 on mytable pbh (cost=0.70..21639.94 rows=3885 width=66) (actual time=5934.153..5934.153 rows=1 loops=1) Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND (modified_time < $5)) Filter: (((product_code)::text = ($1)::text) AND ((balance_type)::text = ($4)::text)) Rows Removed by Filter: 95589 Buffers: shared hit=7623 read=18217 Also, I have made incrementally the following changes: set autovacuum more aggressive, then added column based stats targets and then created a statistics object. Yet there is no change in the plan behavior. Table as it is now: \d+ myschema.mytable Table "myschema.mytable" Column │ Type │ Collation │ Nullable │ Default │ Storage │ Stats target │ Description ────────────────┼─────────────────────────────┼───────────┼──────────┼─────────┼──────────┼──────────────┼───────────── cage_code │ integer │ │ not null │ │ plain │ 500 │ cage_player_id │ bigint │ │ not null │ │ plain │ 500 │ product_code │ character varying(30) │ │ not null │ │ extended │ 500 │ balance_type │ character varying(30) │ │ not null │ │ extended │ 500 │ version │ bigint │ │ not null │ │ plain │ │ modified_time │ timestamp(3) with time zone │ │ not null │ │ plain │ 500 │ amount │ numeric(38,8) │ │ not null │ │ main │ │ change │ numeric(38,8) │ │ not null │ │ main │ │ transaction_id │ bigint │ │ not null │ │ plain │ │ Indexes: "mytable_pk" PRIMARY KEY, btree (cage_code, cage_player_id, product_code, balance_type, version) "mytable_idx1" btree (modified_time) "mytable_idx2" btree (cage_code, cage_player_id, modified_time) Statistics objects: "myschema"."statistics_pbh_1" (ndistinct, dependencies) ON cage_code, cage_player_id, product_code, balance_type FROM myschema.mytable Options: autovacuum_vacuum_scale_factor=0.0, autovacuum_vacuum_threshold=1000 I will investigate the index bloat and if something is blocking the query as suggested by Laurenz Albe. Best, Kristjan On Tue, Sep 14, 2021 at 3:26 PM Jeff Janes <jeff.janes@gmail.com> wrote: > > On Tue, Sep 14, 2021 at 3:55 AM Kristjan Mustkivi <sonicmonkey@gmail.com> wrote: >> >> Hello Tomas, >> >> The auto explain analyze caught this: >> >> 2021-09-14 06:55:33 UTC, pid=12345 db=mydb, usr=myuser, client=ip, >> app=PostgreSQL JDBC Driver, line=55 LOG: duration: 5934.165 ms plan: >> Query Text: SELECT * FROM myschema.mytable pbh WHERE >> pbh.product_code = $1 AND pbh.cage_player_id = $2 AND >> pbh.cage_code = $3 AND balance_type = $4 AND pbh.modified_time < >> $5 ORDER BY pbh.modified_time DESC FETCH FIRST 1 ROWS ONLY >> Limit (cost=0.70..6.27 rows=1 width=66) (actual >> time=5934.154..5934.155 rows=1 loops=1) >> Buffers: shared hit=7623 read=18217 >> -> Index Scan Backward using mytable_idx2 on mytable pbh >> (cost=0.70..21639.94 rows=3885 width=66) (actual >> time=5934.153..5934.153 rows=1 loops=1) >> Index Cond: ((cage_code = $3) AND (cage_player_id = $2) AND >> (modified_time < $5)) >> >> So it expected to get 3885 rows, but got just 1. So this is the >> statistics issue, right? > > > That would be true if there were no LIMIT. But with the LIMIT, all this means is that it stopped actually scanning afterit found one row, but it estimates that if it didn't stop it would have found 3885. So it is not very informative. But the above plan appears incomplete, there should be a line for "Rows Removed by Filter", and I think thatthat is what we really want to see in this case. > > Cheers, > > Jeff > Cheers, > > Jeff -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com
Kristjan Mustkivi <sonicmonkey@gmail.com> writes: > -> Index Scan Backward using player_balance_history_idx2 on > mytable pbh (cost=0.70..21639.94 rows=3885 width=66) (actual > time=5934.153..5934.153 rows=1 loops=1) > Index Cond: ((cage_code = $3) AND (cage_player_id = > $2) AND (modified_time < $5)) > Filter: (((product_code)::text = ($1)::text) AND > ((balance_type)::text = ($4)::text)) > Rows Removed by Filter: 95589 > Buffers: shared hit=7623 read=18217 So indeed, the core issue is that that filter condition is very selective, and applying it after the index scan is expensive. Perhaps it would help to create an index that includes those columns along with cage_code and cage_player_id. (It's not clear whether to bother with modified_time in this specialized index, but if you do include it, it needs to be the last column since you're putting a non-equality condition on it.) regards, tom lane
On Tue, Sep 14, 2021 at 5:15 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Kristjan Mustkivi <sonicmonkey@gmail.com> writes: > > -> Index Scan Backward using player_balance_history_idx2 on > > mytable pbh (cost=0.70..21639.94 rows=3885 width=66) (actual > > time=5934.153..5934.153 rows=1 loops=1) > > Index Cond: ((cage_code = $3) AND (cage_player_id = > > $2) AND (modified_time < $5)) > > Filter: (((product_code)::text = ($1)::text) AND > > ((balance_type)::text = ($4)::text)) > > Rows Removed by Filter: 95589 > > Buffers: shared hit=7623 read=18217 > > So indeed, the core issue is that that filter condition is very selective, > and applying it after the index scan is expensive. Perhaps it would help > to create an index that includes those columns along with cage_code and > cage_player_id. (It's not clear whether to bother with modified_time in > this specialized index, but if you do include it, it needs to be the last > column since you're putting a non-equality condition on it.) > > regards, tom lane But the Primary Key is defined as btree (cage_code, cage_player_id, product_code, balance_type, version) so this should be exactly that (apart from the extra "version" column). And the majority of the query plans are using the PK with only a small number of cases going for the IDX2 that is btree (cage_code, cage_player_id, modified_time). So I am wondering how to make them not do that. But perhaps the index bloat is indeed playing a part here as both the PK and IDX2 have ~50% bloat ratio. I will try REINDEX-ing the table although finding a good window to do it might require some time. Best regards, Kristjan
Kristjan Mustkivi <sonicmonkey@gmail.com> writes: >>> Filter: (((product_code)::text = ($1)::text) AND >>> ((balance_type)::text = ($4)::text)) > But the Primary Key is defined as btree (cage_code, cage_player_id, > product_code, balance_type, version) so this should be exactly that > (apart from the extra "version" column). Oh, interesting. So this is really a datatype mismatch problem. I'd wondered idly why you were getting the explicit casts to text in these conditions, but now it seems that that's key to the problem: the casts prevent these clauses from being matched to the index. What are the declared data types of product_code and balance_type? And of the parameters they're compared to? regards, tom lane
Hello! Both are of type varchar(30). So is this something odd: Filter: (((product_code)::text = ($1)::text) AND ((balance_type)::text = ($4)::text)) ? But why does it do the type-cast if both product_code and balance_type are of type text (although with constraint 30) and the values are also of type text? Best regards, Kristjan On Tue, Sep 14, 2021 at 6:47 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Kristjan Mustkivi <sonicmonkey@gmail.com> writes: > >>> Filter: (((product_code)::text = ($1)::text) AND > >>> ((balance_type)::text = ($4)::text)) > > > But the Primary Key is defined as btree (cage_code, cage_player_id, > > product_code, balance_type, version) so this should be exactly that > > (apart from the extra "version" column). > > Oh, interesting. So this is really a datatype mismatch problem. > I'd wondered idly why you were getting the explicit casts to text > in these conditions, but now it seems that that's key to the > problem: the casts prevent these clauses from being matched to > the index. What are the declared data types of product_code > and balance_type? And of the parameters they're compared to? > > regards, tom lane -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com
Kristjan Mustkivi <sonicmonkey@gmail.com> writes: > Both are of type varchar(30). Ah, right, you showed that back at the top of the thread. > So is this something odd: Filter: (((product_code)::text = ($1)::text) > AND ((balance_type)::text = ($4)::text)) ? Yes, that is very darn odd. When I try this I get: regression=# create table foo(f1 varchar(30), f2 int, primary key (f2,f1)); CREATE TABLE regression=# explain select * from foo where f2 = 11 and f1 = 'bar'; QUERY PLAN -------------------------------------------------------------------------- Index Only Scan using foo_pkey on foo (cost=0.15..8.17 rows=1 width=37) Index Cond: ((f2 = 11) AND (f1 = 'bar'::text)) (2 rows) regression=# explain select * from foo where f2 = 11 and f1::text = 'bar'; QUERY PLAN -------------------------------------------------------------------------- Index Only Scan using foo_pkey on foo (cost=0.15..8.17 rows=1 width=37) Index Cond: ((f2 = 11) AND (f1 = 'bar'::text)) (2 rows) regression=# prepare p as select * from foo where f2 = $1 and f1 = $2; PREPARE regression=# explain execute p(11,'bar'); QUERY PLAN -------------------------------------------------------------------------- Index Only Scan using foo_pkey on foo (cost=0.15..8.17 rows=1 width=37) Index Cond: ((f2 = 11) AND (f1 = 'bar'::text)) (2 rows) -- repeat a few times till it switches to a generic plan ... regression=# explain execute p(11,'bar'); QUERY PLAN -------------------------------------------------------------------------- Index Only Scan using foo_pkey on foo (cost=0.15..8.17 rows=1 width=37) Index Cond: ((f2 = $1) AND (f1 = $2)) (2 rows) Note the lack of any visible cast on the varchar column, in each one of these queries, even where I tried to force one to appear. There is something happening in your database that is not happening in mine. My mind is now running to the possibility that you've got some extension that creates an "=" operator that is capturing the syntax. regards, tom lane
On Wed, Sep 15, 2021 at 3:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Note the lack of any visible cast on the varchar column, in each one of > these queries, even where I tried to force one to appear. There is > something happening in your database that is not happening in mine. > > My mind is now running to the possibility that you've got some extension > that creates an "=" operator that is capturing the syntax. > > regards, tom lane The following extensions have been installed: ─[ RECORD 1 ]────────────────────────────────────────────────────────── Name │ btree_gist Version │ 1.5 Schema │ public Description │ support for indexing common datatypes in GiST ─[ RECORD 2 ]────────────────────────────────────────────────────────── Name │ pg_stat_statements Version │ 1.6 Schema │ public Description │ track execution statistics of all SQL statements executed ─[ RECORD 3 ]────────────────────────────────────────────────────────── Name │ pgcrypto Version │ 1.3 Schema │ public Description │ cryptographic functions ─[ RECORD 4 ]────────────────────────────────────────────────────────── Name │ plpgsql Version │ 1.0 Schema │ pg_catalog Description │ PL/pgSQL procedural language Plus the some libraries preloaded: shared_preload_libraries = 'pg_stat_statements,pg_cron,auto_explain' Thank you so much for looking into this! Best regards, -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com
Kristjan Mustkivi <sonicmonkey@gmail.com> writes: > On Wed, Sep 15, 2021 at 3:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Note the lack of any visible cast on the varchar column, in each one of >> these queries, even where I tried to force one to appear. There is >> something happening in your database that is not happening in mine. > The following extensions have been installed: > [ nothing very exciting ] I still get the same results after installing those extensions. I realized that the reason I don't see a cast is that fix_indexqual_operand removes the cast from an index qualifier expression's index-column side. In any other context, there would be a cast there, since the operator is =(text,text) not =(varchar,varchar). So that seems like a red herring ... or is it? Now I'm confused by your original report, in which you show >>> -> Index Scan using mytable_pk on mytable pbh (cost=0.70..176.82 rows=186 width=66) (actual time=1.001..8.610rows=25 loops=1) >>> Index Cond: ((cage_code = 123) AND (cage_player_id = '12345'::bigint) AND ((product_code)::text = 'PRODUCT'::text)AND ((balance_type)::text = 'TOTAL'::text)) >>> Filter: (modified_time < '2021-09-13 04:00:00+00'::timestamp with time zone) According to the code I just looked at, there should absolutely not be casts on the product_code and balance_type index columns here. So I'm not quite sure what's up ... -ENOCAFFEINE perhaps. Nonetheless, this point is probably just a sideshow. The above EXPLAIN output proves that the planner *can* match this index, which destroys my idea that you had a datatype mismatch preventing it from doing so. After looking again at the original problem, I think you are getting bit by an issue we've seen before. The planner is coming out with a decently accurate cost estimate for the query when specific values are inserted for the parameters. However, when it considers a generic version of the query with no known parameter values, the cost estimates are not so good, and by chance it comes out estimating a very low cost for the alternative plan that uses the other index. That cost is not right, but the planner doesn't know that, so it seizes on that plan. This is a hard problem to fix, and we don't have a good answer for it. In v12 and up, you can use the big hammer of disabling generic plans by setting plan_cache_mode to "force_custom_plan", but v11 doesn't have that parameter. You might need to avoid using a prepared statement for this query. regards, tom lane
Understood. Thank you so much for looking into this! Best regards, Kristjan On Wed, Sep 15, 2021 at 5:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Kristjan Mustkivi <sonicmonkey@gmail.com> writes: > > On Wed, Sep 15, 2021 at 3:16 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Note the lack of any visible cast on the varchar column, in each one of > >> these queries, even where I tried to force one to appear. There is > >> something happening in your database that is not happening in mine. > > > The following extensions have been installed: > > [ nothing very exciting ] > > I still get the same results after installing those extensions. > > I realized that the reason I don't see a cast is that > fix_indexqual_operand removes the cast from an index qualifier > expression's index-column side. In any other context, there would > be a cast there, since the operator is =(text,text) not > =(varchar,varchar). So that seems like a red herring ... or is it? > Now I'm confused by your original report, in which you show > > >>> -> Index Scan using mytable_pk on mytable pbh (cost=0.70..176.82 rows=186 width=66) (actual time=1.001..8.610rows=25 loops=1) > >>> Index Cond: ((cage_code = 123) AND (cage_player_id = '12345'::bigint) AND ((product_code)::text = 'PRODUCT'::text)AND ((balance_type)::text = 'TOTAL'::text)) > >>> Filter: (modified_time < '2021-09-13 04:00:00+00'::timestamp with time zone) > > According to the code I just looked at, there should absolutely not > be casts on the product_code and balance_type index columns here. > So I'm not quite sure what's up ... -ENOCAFFEINE perhaps. > > Nonetheless, this point is probably just a sideshow. The above > EXPLAIN output proves that the planner *can* match this index, > which destroys my idea that you had a datatype mismatch preventing > it from doing so. > > After looking again at the original problem, I think you are getting > bit by an issue we've seen before. The planner is coming out with > a decently accurate cost estimate for the query when specific values > are inserted for the parameters. However, when it considers a generic > version of the query with no known parameter values, the cost estimates > are not so good, and by chance it comes out estimating a very low cost > for the alternative plan that uses the other index. That cost is not > right, but the planner doesn't know that, so it seizes on that plan. > > This is a hard problem to fix, and we don't have a good answer for it. > In v12 and up, you can use the big hammer of disabling generic plans by > setting plan_cache_mode to "force_custom_plan", but v11 doesn't have > that parameter. You might need to avoid using a prepared statement for > this query. > > regards, tom lane -- Kristjan Mustkivi Email: kristjan.mustkivi@gmail.com