Re: unstable query plan on pg 16,17,18 - Mailing list pgsql-performance

From Attila Soki
Subject Re: unstable query plan on pg 16,17,18
Date
Msg-id A92040E8-9AFC-4B41-AB19-88E09ACB71AF@gmx.net
Whole thread Raw
In response to Re: unstable query plan on pg 16,17,18  (Laurenz Albe <laurenz.albe@cybertec.at>)
Responses Re: unstable query plan on pg 16,17,18
List pgsql-performance
On 23 Feb 2026, at 20:59, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Mon, 2026-02-23 at 16:10 +0100, Attila Soki wrote:
On 23 Feb 2026, at 10:41, Laurenz Albe <laurenz.albe@cybertec.at> wrote:

On Mon, 2026-02-23 at 10:37 +0100, Attila Soki wrote:
When upgrading from PostgreSQL 14.4, I noticed that one of my somewhat complex
analytical queries sometimes gets an inefficient plan under PostgreSQL 16, 17, and 18.
Under 14.4, the query runs with a stable plan and completes in 19 to 22 seconds.
In newer versions, the plan seems to be unstable, sometimes the query completes
in 17 to 20 seconds, sometimes it runs for 5 to 18 minutes with the inefficient plan.
This also happens even if the data is not significantly changed.

This is very likely owing to a bad estimate.

Could you turn on "track_io_timing" and send us the EXPLAIN (ANALYZE, BUFFERS) output
for both the good and the bad plan?

Thank you for your reply. Here are the two explains.
In order to be able to publish the plans here, I have obfuscated the table and field names, but this is reversible, so I can provide more info if needed.

plan-ok:
https://explain.depesz.com/s/hQvM

plan-wrong:
https://explain.depesz.com/s/uLvl

Thanks.

The difference in the plans is under the "Subquery Scan on odg", starting with
plan node 50 (everything under the "Sort").  I suspect that the mis-estimate
that is at the root of the problem is here:

->  Index Scan using table_k_late_spec_dp_end_dat_key on schema1.table_k kal  (... rows=196053 ...) (... rows=471.00 ...)
     Index Cond: (kal.dp_end_dat < ('now'::cstring)::date)
     Index Searches: 1
     Buffers: shared hit=230 read=49
     I/O Timings: shared read=0.142

PostgreSQL overestimates the row count by a factor of over 400.
Try to fix that estimate and see if that gets PostgreSQL to do the right thing.

Perhaps a simple ANALYZE on the table can do the trick.

In the examples I used table_k to flip the plan with
vacuumed -Upostgres -vZ -t schema1.tbl_used_in_query db1
in the explain output schema1.tbl_used_in_query is table_k

The right side of the comparison looks awkward, as if you wrote 'now'::text::date
My experiments show that PostgreSQL v18 estimates well even with such a weird
condition, but perhaps if you write "current_date" instead, you'd get better results.

I didn't realize that made a difference. I will replace all occurrences. It also looks more clean with current_date.


I'd play just with a query like

 EXPLAIN (ANALYZE)
 SELECT * FROM schema1.table_k AS kal
 WHERE dp_end_dat < current_date;

until I get a good estimate.

I will try to set custom statistics for dp_end_dat and the fields used by the table_k_late_spec_dp_end_dat_key index.
Let’s see if that helps.

I am on UTC+1. I will try all of this tomorrow and get back to you with the results later.

Thank you

regards,
Attila


pgsql-performance by date:

Previous
From: Andrei Lepikhov
Date:
Subject: Re: unstable query plan on pg 16,17,18
Next
From: Laurenz Albe
Date:
Subject: Re: unstable query plan on pg 16,17,18