Re: [BUG] temporary file usage report with extended protocol and unnamed portals - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Date
Msg-id CAA5RZ0s-JLjD4E7shD9otcqJTgy-1K7FLrs9F=0QCC5qn_bMrQ@mail.gmail.com
Whole thread Raw
In response to Re: [BUG] temporary file usage report with extended protocol and unnamed portals  (Michael Paquier <michael@paquier.xyz>)
List pgsql-hackers
> > About the main issue, seems to me the consensus is that dropping unnamed
> > portals more aggressively is the way to go?
>
> At the cost of one extra hash table lookup, that unlikely does not
> matter for performance anyway.  I guess so.

AFAICT, we don’t need to introduce an extra hash lookup. What we
need to do is ensure that the unnamed portal was run to completion
inside exec_execute_message; if it was, we can drop it. Dropping
relies on the fact that we already have a valid pointer to the portal.
v16 is based on [0], with updates to the documentation and tests.

Note that in the tests, I opted to keep SELECT 'pipelined query' and
SELECT 'named portal', since they provide another execution after
the temporary logging inside the transaction. However, we are no
longer looking for their statement text in the logfile.

I wanted to add test cases in which we require multiple fetches
from the same portal, but we don't have the ability to change
the fetch size (i.e. max_rows) in psql. So here is the test I ran
using jdbc, but it's not helpful, since JDBC created a named
portal :( so I just hacked the patch to exercise the code path
of dropping the unnamed portal during _execute_ with:

"
-                       if (portal->name[0] == '\0')
+                       if (portal->name[0] == 'C')
                                PortalDrop(portal, false);
                }
"

and the logging is

"""
2025-10-28 17:33:48.606 UTC [260207] LOG:  duration: 0.107 ms  execute
<unnamed>: BEGIN
2025-10-28 17:33:48.607 UTC [260207] LOG:  duration: 0.816 ms  parse
<unnamed>: SELECT a FROM foo ORDER BY a OFFSET 4990
2025-10-28 17:33:48.607 UTC [260207] LOG:  duration: 0.455 ms  bind
<unnamed>/C_1: SELECT a FROM foo ORDER BY a OFFSET 4990
2025-10-28 17:33:48.608 UTC [260207] LOG:  duration: 1.307 ms  execute
<unnamed>/C_1: SELECT a FROM foo ORDER BY a OFFSET 4990
2025-10-28 17:33:48.619 UTC [260207] LOG:  duration: 0.018 ms  execute
fetch from <unnamed>/C_1: SELECT a FROM foo ORDER BY a OFFSET 4990
2025-10-28 17:33:48.619 UTC [260207] LOG:  duration: 0.005 ms  execute
fetch from <unnamed>/C_1: SELECT a FROM foo ORDER BY a OFFSET 4990
2025-10-28 17:33:48.619 UTC [260207] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp260207.0", size 73728
2025-10-28 17:33:48.619 UTC [260207] STATEMENT:  SELECT a FROM foo
ORDER BY a OFFSET 4990
2025-10-28 17:33:48.619 UTC [260207] LOG:  duration: 0.125 ms  execute
fetch from <unnamed>/C_1: SELECT a FROM foo ORDER BY a OFFSET 4990
2025-10-28 17:33:48.620 UTC [260207] LOG:  duration: 0.020 ms  parse S_2: COMMIT
"""

which looks correct to me as the temp file is logged at the last
execution. It will be good to probably increase the capability
of psql to perform such tests, but for now I thought this was
a good verification, maybe.

> It's not something that can be backpatched.

I agree, as this is a change to the protocol.

[0] https://www.postgresql.org/message-id/CAA5RZ0vGPa%3DUjiNiS0gK1zHVhysSBPMSaGU5Qc%3D1PrVKJ6ODCw%40mail.gmail.com

--
Sami Imseih
Amazon Web Services (AWS)

Attachment

pgsql-hackers by date:

Previous
From: Jacob Champion
Date:
Subject: Re: Channel binding for post-quantum cryptography
Next
From: Sami Imseih
Date:
Subject: Re: Bug in pg_stat_statements