Postgres undeterministically uses a bad plan, how to convince it otherwise? - Mailing list pgsql-general

From cen
Subject Postgres undeterministically uses a bad plan, how to convince it otherwise?
Date
Msg-id 9c98978d-fc72-8655-bb29-5aaa941bee36@gmail.com
Whole thread Raw
Responses Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?  ("David G. Johnston" <david.g.johnston@gmail.com>)
Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?  (Ron <ronljohnsonjr@gmail.com>)
List pgsql-general
Hi,

I am running the same application (identical codebase) as two separate 
instances to index (save) different sets of data. Both run PostgreSQL 13.

The queries are the same but the content in actual databases is 
different. One database is around 1TB and the other around 300GB.


There is a problem with a paginated select query with a join and an 
order. Depending on which column you order by (primary or FK) the query 
is either instant or takes minutes.

So on one database, ordering by primary is instant but on the other it 
is slow and vice-versa. Switching the columns around on the slow case 
fixes the issue.

All relavant colums are indexed.


Simplified:

Slow: SELECT * from table1 AS t1 LEFT JOIN table2 AS t2 ON 
t2.t1_id=t1.id ORDER BY t1.id ASC LIMIT 0, 10

Fast: SELECT * from table1 AS t1 LEFT JOIN table2 AS t2 ON 
t2.t1_id=t1.id ORDER BY t2.t1_id ASC LIMIT 0, 10

(and the opposite, on the other instance the first one is fast and 
second one is slow).


I have run all the statistic recalculations but that doesn't help. As 
far as I could read the docs, there is no real way to affect the plan

other than reworking the query (I've read about fencing?) which can't be 
done because it seems to be unpredictable and depends on actual data and 
data quantity.

I haven't tried reindexing.


Before providing and diving into the specific query plans:

- does the planner take previous runs of the same query and it's 
execution time into account? If not, why?

- assuming the query to be immutable, would it be possible for the 
planner to microbenchmark a few different plans instead of trying to 
estimate the cost?
As in, actually executing the query with different plans and caching the 
best one.



PS: good job to the FOSDEM Devroom organisers, one of the best rooms, 
was a blast.

Best regards, Cen




pgsql-general by date:

Previous
From: Adrian Klaver
Date:
Subject: Re: Automatic aggressive vacuum on almost frozen table takes too long
Next
From: "David G. Johnston"
Date:
Subject: Re: Postgres undeterministically uses a bad plan, how to convince it otherwise?