Re: Postgres: Queries are too slow after upgrading to PG17 from PG15 - Mailing list pgsql-bugs

From Sajith Prabhakar Shetty
Subject Re: Postgres: Queries are too slow after upgrading to PG17 from PG15
Date
Msg-id DM4PR19MB6486E73FF04BADFD120A0568B525A@DM4PR19MB6486.namprd19.prod.outlook.com
Whole thread Raw
In response to Re: Postgres: Queries are too slow after upgrading to PG17 from PG15  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-bugs

Thanks Peter for detailed response. We really appreciate your time and effort in this regard.

Please help me on the next step of actions required from our end. I understand you have found out the root cause, can you confirm if any fix would be approved for this case in upcoming patches and any ETA is really helpful.

 

Also note that this has affected many of our SQL queries, and we cannot afford to modify our code to accommodate the PG17 planner changes.

 

Looping in the email, my colleague Todd, who helped me get the reproducer steps.

 

 

 

Sajith P Shetty

Principal Engineer

Black Duck 

+91 9448389989ssajith@blackduck.com

signature_778616162

 

 

From: Peter Geoghegan <pg@bowt.ie>
Date: Tuesday, 29 July 2025 at 2:12
AM
To: Sajith Prabhakar Shetty <ssajith@blackduck.com>
Cc: Andrei Lepikhov <lepihov@gmail.com>, pgsql-bugs@lists.postgresql.org <pgsql-bugs@lists.postgresql.org>, Tom Lane <tgl@sss.pgh.pa.us>
Subject: Re: Postgres: Queries are too slow after upgrading to PG17 from PG15

On Mon, Jul 28, 2025 at 2:20AM Sajith Prabhakar Shetty
<ssajith@blackduck.com> wrote:
> We are able to get you a self-contained reproducer, please find attached dump, sql script and read me files.

I find that your test case spends a great deal of time on nbtree
preprocessing, which happens once per execution of the inner index
scan on "zsf". According to "perf top", most cycles on spent on these:

 32.02%  postgres                    [.] FunctionCall2Coll
  22.01%  postgres                    [.] qsort_arg
  18.64%  postgres                    [.] _bt_compare_array_elements
   8.20%  postgres                    [.] btint8cmp
   3.97%  postgres                    [.] _bt_preprocess_keys
  ...

The query takes ~1550ms on my local workstation. If I just comment out
the relevant qsort, it'll take only ~190 ms. That qsort might not be
the only problem here, but it is the immediate problem. Note that
commenting out the qsort should produce the same answer, at least for
this one query, since the constants that appear in the query are
already sorted (the EXPLAIN row counts match what they show with the
qsort in place).

In principle, we could limit the use of the qsort to the first inner
index scan, and safely skip each subsequent qsort -- at least in cases
where the array was a constant (which includes this case). Obviously,
we *do* need a qsort (what if the constants aren't exactly in sorted
order?), but we generally don't need to do it once per inner index
scan.

There is a separate question as to whether or not the planner should
pick this plan in the first place. I find that I can get a faster plan
(without commenting out anything) by tricking the planner into using
the single column "zsf_pkey", rather than the multi-column
"zsf_id_fpi_cid_key". Even then, I can only get a merge join with the
"zsf_pkey" index on the inner side -- not a nested loop join with the
"zsf_pkey" index on the inner side, as expected. The merge join plan
brings the execution time down to ~90ms, which is better, but
certainly still less than ideal.

--
Peter Geoghegan

Attachment

pgsql-bugs by date:

Previous
From: Fujii Masao
Date:
Subject: Re: Unexpected Standby Shutdown on sync_replication_slots change
Next
From: Laurenz Albe
Date:
Subject: Re: Unexpected Standby Shutdown on sync_replication_slots change