Thread: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 18950 Logged by: Lowell Hought Email address: lowell.hought@gmail.com PostgreSQL version: 17.5 Operating system: Red Hat Enterprise Linux release 8.10 (Ootpa) Description: A pgsql function that returns Table is relatively simple. It accepts two arguments; a date and an interval. It uses these to arguments in a query that returns a result set. This function worked in Postgresql 16.1 and Postgresql 16.9. However, it does not work in versions 17.0 nor in 17.5 Here is my build sequence for version 17.5, the last version I tested. ./configure --prefix=/usr/local/pgsql/17/ --with-openssl --with-ldap --with-systemd --with-libxml --with-libxslt gmake world gmake check : # All 222 tests passed. sudo gmake install-world su root mkdir /usr/local/pgsql/17/data chown postgres /usr/local/pgsql/17/data su postgres /usr/local/pgsql/17/bin/initdb -D /usr/local/pgsql/17/data Success. You can now start the database server using: /usr/local/pgsql/17/bin/pg_ctl -D /usr/local/pgsql/17/data -l logfile start /usr/local/pgsql/17/bin/pg_ctl -D /usr/local/pgsql/17/data -l /usr/local/pgsql/17/data/logfile start /usr/local/pgsql/17/bin/createdb test /usr/local/pgsql/17/bin/psql test psql (17.5) Type "help" for help. Restore from backup: /usr/local/pgsql/17/bin/psql -d test -Upostgres -f /bak/db/db_backup.sql Then I run psql and execute this command: SELECT * FROM report.GetReportPoolTrainees(CURRENT_DATE, '1 year'); The result is that even after 12 hours the function does not return. There is no response whatsoever. I tested this both on Redhat 8.10 and on Red Hat Enterprise Linux release 8.10 (Ootpa) and on Rocky Linux release 8.7 (Green Obsidian) with identical results. On both systems the function call works as expected in version 16, but in version 17 it never returns. Another interesting fact: I run the query contained in within the function, it executes as expected and returns the expected results. It is only when called via the function that it fails. Here is the function definition: CREATE OR REPLACE FUNCTION report.GetReportPoolTrainees( venddate DATE, vtimespan INTERVAL ) RETURNS TABLE ( facility_key INT, facility_code TEXT, traineeid INT, ndc_emp_id INT, facility_eod DATE, hire_status TEXT, trainee_start_date DATE, devstatus_date DATE, status TEXT, status_days INT )AS $$ DECLARE plenddate DATE; plinterval INTERVAL; BEGIN IF venddate IS NULL THEN plenddate = CURRENT_DATE; ELSE plenddate = venddate; END IF; IF vtimespan IS NULL THEN SELECT value::INTERVAL INTO plinterval FROM ntd.site_settings WHERE variable = 'PPT_DATA_YEARS'; IF NOT FOUND THEN plinterval = '10 years'; END IF; ELSE plinterval = vtimespan; END IF; RETURN QUERY WITH t AS ( SELECT DISTINCT td.facility_key, td.traineeid, td.ndc_emp_id, td.facility_eod, td.tia_code AS hire_status, td.trainee_start_date, td.devstatus_date, td.status, td.status_days FROM report.vw_training_details td ), f AS ( SELECT * FROM ntd.facility_dim fd JOIN ( SELECT hft.facility_key, MAX(facility_type_start_date) AS facility_type_start_date FROM ntd.history_facility_type hft GROUP BY hft.facility_key )DT1 USING(facility_key) ) SELECT DT1.facility_key, DT1.facility_code, t.traineeid, t.ndc_emp_id, t.facility_eod, t.hire_status, t.trainee_start_date, t.devstatus_date, t.status, t.status_days FROM t JOIN (SELECT l.facility_key, l.facility_code, CASE WHEN facility_type_start_date IS NOT NULL AND facility_type_start_date > l.last_devstatus_date - plinterval THEN facility_type_start_date ELSE l.last_devstatus_date - plinterval END AS last_devstatus_date FROM (SELECT t.facility_key, f.facility_code, MAX(t.devstatus_date) AS last_devstatus_date FROM t LEFT JOIN f USING(facility_key) WHERE (t.status = 'Completed' OR t.status = 'Did Not Complete') AND t.devstatus_date <= plenddate GROUP BY t.facility_key, f.facility_code )l LEFT JOIN (SELECT pef.facility_key, MAX(facility_type_start_date) AS facility_type_start_date FROM ntd.ppt_exception_facilities pef WHERE facility_type_start_date < CURRENT_DATE GROUP BY pef.facility_key ) e USING(facility_key) )DT1 USING(facility_key) WHERE t.devstatus_date <= plenddate AND t.devstatus_date >= DT1.last_devstatus_date ; END; $$ LANGUAGE plpgsql; This is the first time I have ever encountered a bug and so I have no experience with reporting it. I understand I need to send the underlying table structures but not sure if I do it here or some other way. I do not see an option to attach files. I love Postgres! It is a terrific database and has served me well for about 20 years.
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Laurenz Albe
Date:
On Sat, 2025-06-07 at 16:08 +0000, PG Bug reporting form wrote: > PostgreSQL version: 17.5 > Operating system: Red Hat Enterprise Linux release 8.10 (Ootpa) > > A pgsql function that returns Table is relatively simple. It accepts two > arguments; a date and an interval. It uses these to arguments in a query > that returns a result set. This function worked in Postgresql 16.1 and > Postgresql 16.9. However, it does not work in versions 17.0 nor in 17.5 > > [function definition] > > This is the first time I have ever encountered a bug and so I have no > experience with reporting it. Your function essentially is running a query. If you run that query outside of the function, does it finish on time? If not, you have a simpler problem to tackle. It need not necessarily be a bug if some queries perform worse in a later PostgreSQL version. There are too many factors involved in the execution of a complicated query. To give help, we'd need at least the EXPLAIN output from the query and the EXPLAIN (ANALYZE, BUFFERS) output from the query on the old version. Yours, Laurenz Albe
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
If I run the query outside of the function it works as expected. But the function never returns. Lowell > On Jun 7, 2025, at 7:27 PM, Laurenz Albe <laurenz.albe@cybertec.at> wrote: > > On Sat, 2025-06-07 at 16:08 +0000, PG Bug reporting form wrote: >> PostgreSQL version: 17.5 >> Operating system: Red Hat Enterprise Linux release 8.10 (Ootpa) >> >> A pgsql function that returns Table is relatively simple. It accepts two >> arguments; a date and an interval. It uses these to arguments in a query >> that returns a result set. This function worked in Postgresql 16.1 and >> Postgresql 16.9. However, it does not work in versions 17.0 nor in 17.5 >> >> [function definition] >> >> This is the first time I have ever encountered a bug and so I have no >> experience with reporting it. > > Your function essentially is running a query. > If you run that query outside of the function, does it finish on time? > If not, you have a simpler problem to tackle. > > It need not necessarily be a bug if some queries perform worse in a later > PostgreSQL version. There are too many factors involved in the execution > of a complicated query. To give help, we'd need at least the EXPLAIN > output from the query and the EXPLAIN (ANALYZE, BUFFERS) output from the > query on the old version. > > Yours, > Laurenz Albe
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Tom Lane
Date:
Lowell Hought <lowell.hought@gmail.com> writes: > If I run the query outside of the function it works as expected. But the function never returns. This isn't that surprising either: a query inside a function is often translated into a "generic" plan that doesn't depend on specific parameter values, typically sacrificing runtime to avoid repeated planning. Does it get any better if you do "set plan_cache_mode = force_custom_plan" before running the function? regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
Yes! I just changed the parameter as you suggested and the query returned as expected.
So I guess something changed between version 16 and version 17? Perhaps the default for that setting?
Lowell
On Sat, Jun 7, 2025 at 9:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Lowell Hought <lowell.hought@gmail.com> writes:
> If I run the query outside of the function it works as expected. But the function never returns.
This isn't that surprising either: a query inside a function is
often translated into a "generic" plan that doesn't depend on
specific parameter values, typically sacrificing runtime to
avoid repeated planning. Does it get any better if you do
"set plan_cache_mode = force_custom_plan" before running the
function?
regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
Wait, I spoke too soon. When I just ran the query I was on version 16. Give me a moment to shut down 16 and fire up 17 and I will try it again.
Lowell
On Sat, Jun 7, 2025 at 9:57 PM Lowell Hought <lowell.hought@gmail.com> wrote:
Yes! I just changed the parameter as you suggested and the query returned as expected.So I guess something changed between version 16 and version 17? Perhaps the default for that setting?LowellOn Sat, Jun 7, 2025 at 9:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:Lowell Hought <lowell.hought@gmail.com> writes:
> If I run the query outside of the function it works as expected. But the function never returns.
This isn't that surprising either: a query inside a function is
often translated into a "generic" plan that doesn't depend on
specific parameter values, typically sacrificing runtime to
avoid repeated planning. Does it get any better if you do
"set plan_cache_mode = force_custom_plan" before running the
function?
regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Tom Lane
Date:
Lowell Hought <lowell.hought@gmail.com> writes: > Yes! I just changed the parameter as you suggested and the query returned > as expected. > So I guess something changed between version 16 and version 17? Perhaps > the default for that setting? No, that default has not changed. What apparently happened is that cost estimates or something changed enough to persuade the planner to use a generic plan that's considerably inferior to what it was choosing before. There's nowhere near enough info in your report to pin it down more closely than that. Just to eliminate the obvious --- you did run ANALYZE after the upgrade, right? If you had any nondefault settings of default_statistics_target or similar planner parameters, did you make sure they got transferred to the new installation? regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
Changing that parameter had no effect on the version 17 instance. The query itself produces results in a matter of a second or so, but the function still hangs and does not return.
Lowell
On Sat, Jun 7, 2025 at 9:59 PM Lowell Hought <lowell.hought@gmail.com> wrote:
Wait, I spoke too soon. When I just ran the query I was on version 16. Give me a moment to shut down 16 and fire up 17 and I will try it again.LowellOn Sat, Jun 7, 2025 at 9:57 PM Lowell Hought <lowell.hought@gmail.com> wrote:Yes! I just changed the parameter as you suggested and the query returned as expected.So I guess something changed between version 16 and version 17? Perhaps the default for that setting?LowellOn Sat, Jun 7, 2025 at 9:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:Lowell Hought <lowell.hought@gmail.com> writes:
> If I run the query outside of the function it works as expected. But the function never returns.
This isn't that surprising either: a query inside a function is
often translated into a "generic" plan that doesn't depend on
specific parameter values, typically sacrificing runtime to
avoid repeated planning. Does it get any better if you do
"set plan_cache_mode = force_custom_plan" before running the
function?
regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
My initial response that changing the parameter worked was mistaken. I forgot to shut down version 16 so the function call happened on that instance. I now shut down 16, fired up 17 and set the parameter, and the behavior did not change. The query itself returns results in about 1 second. The function never returns at all, it just sits there.
Lowell
On Sat, Jun 7, 2025 at 10:02 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Lowell Hought <lowell.hought@gmail.com> writes:
> Yes! I just changed the parameter as you suggested and the query returned
> as expected.
> So I guess something changed between version 16 and version 17? Perhaps
> the default for that setting?
No, that default has not changed. What apparently happened is that
cost estimates or something changed enough to persuade the planner
to use a generic plan that's considerably inferior to what it was
choosing before. There's nowhere near enough info in your report
to pin it down more closely than that.
Just to eliminate the obvious --- you did run ANALYZE after the
upgrade, right? If you had any nondefault settings of
default_statistics_target or similar planner parameters,
did you make sure they got transferred to the new installation?
regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Tom Lane
Date:
Lowell Hought <lowell.hought@gmail.com> writes: > Changing that parameter had no effect on the version 17 instance. The > query itself produces results in a matter of a second or so, but the > function still hangs and does not return. Darn. But I'm still pretty sure that the problem is an undesirable change of plan, and that we don't have enough information to say more than that. Any chance you could extract a self-contained test case? regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
I can try. I am not sure how to go about that. I did not see on the bug report page where I could upload files, and I am afraid the file size of the tables needed might be too large for email. The entire database when written to an sql dump file is about 20 GB, so not terribly large. I could attempt to dump the schema definition in one file and then the underlying tables in another. Would that work? Or would you also need the files for the function and any views the query relies upon?
Lowell
On Sat, Jun 7, 2025 at 10:11 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Lowell Hought <lowell.hought@gmail.com> writes:
> Changing that parameter had no effect on the version 17 instance. The
> query itself produces results in a matter of a second or so, but the
> function still hangs and does not return.
Darn. But I'm still pretty sure that the problem is an undesirable
change of plan, and that we don't have enough information to say
more than that. Any chance you could extract a self-contained
test case?
regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Tom Lane
Date:
Lowell Hought <lowell.hought@gmail.com> writes: > I can try. I am not sure how to go about that. I did not see on the bug > report page where I could upload files, and I am afraid the file size of > the tables needed might be too large for email. No, uploading stuff to that webform doesn't work. But at this point we're just conversing on the pgsql-bugs mailing list, so anything you can squeeze into email is fine. Having said that, nobody likes multi-gigabyte emails. > The entire database when > written to an sql dump file is about 20 GB, so not terribly large. I could > attempt to dump the schema definition in one file and then the underlying > tables in another. Would that work? Or would you also need the files for > the function and any views the query relies upon? Yeah, we'd need all the moving parts. Usually people with this kind of problem don't want to expose their data anyway, for privacy and/or legal reasons. So what I'd suggest is trying to create some little script that generates fake data that's close enough to trigger the problem. Then you just need to provide that script and the DDL and function definitions. regards, tom lane
Re: BUG #18950: pgsql function that worked in Postgresql 16 does not return in Postgresql 17
From
Lowell Hought
Date:
I will work on that over the next few days. I really appreciate you and your team!
Lowell
On Sat, Jun 7, 2025 at 10:26 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Lowell Hought <lowell.hought@gmail.com> writes:
> I can try. I am not sure how to go about that. I did not see on the bug
> report page where I could upload files, and I am afraid the file size of
> the tables needed might be too large for email.
No, uploading stuff to that webform doesn't work. But at this point
we're just conversing on the pgsql-bugs mailing list, so anything you
can squeeze into email is fine. Having said that, nobody likes
multi-gigabyte emails.
> The entire database when
> written to an sql dump file is about 20 GB, so not terribly large. I could
> attempt to dump the schema definition in one file and then the underlying
> tables in another. Would that work? Or would you also need the files for
> the function and any views the query relies upon?
Yeah, we'd need all the moving parts.
Usually people with this kind of problem don't want to expose their
data anyway, for privacy and/or legal reasons. So what I'd suggest
is trying to create some little script that generates fake data
that's close enough to trigger the problem. Then you just need to
provide that script and the DDL and function definitions.
regards, tom lane