Thread: [BUG] temporary file usage report with extended protocol and unnamed portals

Hi,

It seems there's a bug in the logging of temporary file usage when the 
extended protocol is used with unnamed portals.

For example, with the attached Java / pgJDBC  programs, we get the 
following logs:

[...] LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp525566.0", 
size 2416640
[..] STATEMENT:  SELECT 1

but it should be:

[...] LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp538230.0", 
size 2416640
[...] STATEMENT:  SELECT * FROM foo ORDER BY a OFFSET $1 LIMIT 2

It has been tested with HEAD and REL_17_STABLE.

My guess is that there's a race somewhere, probably when the global 
variable "debug_query_string" is set.
The unnamed portal is closed when the BIND of the next query arrives 
(here: SELECT 1), and I suspect that the variable is set before the 
temporary file is deleted (and logged).

pgJDBC uses unnamed portals, but I don't think this is specific to JDBC. 
I see the same problem with the attached Python / psycopg3 program.
I think it would be better if the drivers used named portals all the 
time (and an explicit close message), but this seems to be a postgres bug.

What do you think?

Best regards,
Frédéric


PS : the dataset is created like this on the server:

CREATE UNLOGGED TABLE foo(a int);
INSERT INTO foo SELECT * FROM generate_series(1, 200000);
ALTER SYSTEM SET log_temp_files = 0;
ALTER SYSTEM SET log_min_duration_statement = -1
SELECT pg_reload_conf();
Attachment

On 4/18/25 10:49, Frédéric Yhuel wrote:
> Hi,
> 
> It seems there's a bug in the logging of temporary file usage when the 
> extended protocol is used with unnamed portals.
> 

FWIW, the attached patch seems to fix the problem.

Best regards,
Frédéric
Attachment
> [...] LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp525566.0",
> size 2416640
> [..] STATEMENT:  SELECT 1

> but it should be:

> [...] LOG:  temporary file: path "base/pgsql_tmp/pgsql_tmp538230.0",
> size 2416640
> [...] STATEMENT:  SELECT * FROM foo ORDER BY a OFFSET $1 LIMIT 2


hmm, looking at the repro and your patch, I was really confused by
why the STATEMENT: ends up being SELECT 1 even though the
query with temp being tracked is the ORDER BY query.

So, I ran a slowed down test where there is a 15 second thread sleep between
the ORDER BY and SELECT 1 ( see attached ) and I also ran it with the
following GUCs enabled

```
log_temp_files = '0'
work_mem = '64kB'
max_parallel_workers_per_gather = '0'
log_min_duration_statement = '0'
```

2025-04-18 13:31:45.572 CDT [95918] LOG:  duration: 0.046 ms  bind
<unnamed>: select pg_sleep(0)
2025-04-18 13:31:45.572 CDT [95918] LOG:  duration: 0.009 ms  execute
<unnamed>: select pg_sleep(0)
2025-04-18 13:31:45.574 CDT [95918] LOG:  duration: 0.454 ms  parse
<unnamed>: SELECT * FROM foo ORDER BY a
2025-04-18 13:31:45.575 CDT [95918] LOG:  duration: 0.278 ms  bind
<unnamed>: SELECT * FROM foo ORDER BY a
2025-04-18 13:31:46.396 CDT [95918] LOG:  duration: 821.076 ms
execute <unnamed>: SELECT * FROM foo ORDER BY a

2025-04-18 13:32:01.403 CDT [95918] LOG:  duration: 0.438 ms  parse
<unnamed>: SELECT 1
2025-04-18 13:32:01.454 CDT [95918] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp95918.0", size 202039296
2025-04-18 13:32:01.454 CDT [95918] STATEMENT:  SELECT 1
2025-04-18 13:32:01.454 CDT [95918] LOG:  duration: 50.818 ms  bind
<unnamed>: SELECT 1
2025-04-18 13:32:01.454 CDT [95918] LOG:  duration: 0.022 ms  execute
<unnamed>: SELECT 1
2025-04-18 13:32:01.457 CDT [95918] LOG:  duration: 0.064 ms  parse S_1: COMMIT
2025-04-18 13:32:01.457 CDT [95918] LOG:  duration: 0.017 ms  bind S_1: COMMIT
2025-04-18 13:32:01.457 CDT [95918] LOG:  duration: 0.105 ms  execute
S_1: COMMIT

In the log above, it shows that the temporary file logging does not
actually occur
until the next query is executed, so at that point the query string is
already "SELECT 1"

what is even more interesting is when running the commands directly from psql
using extended query protocol, the STATEMENT is not logged after the
temporary file message.


```
SELECT  FROM foo ORDER BY a \bind
;
SELECT 1 \bind
;
```

```
istance=0 kB, estimate=0 kB; lsn=0/E55A088, redo lsn=0/E55A030
2025-04-18 13:36:39.275 CDT [96237] LOG:  duration: 1.447 ms  parse
<unnamed>: SELECT  FROM foo ORDER BY a
    ;
2025-04-18 13:36:39.275 CDT [96237] LOG:  duration: 0.719 ms  bind
<unnamed>: SELECT  FROM foo ORDER BY a
    ;
2025-04-18 13:36:39.822 CDT [96237] LOG:  duration: 546.803 ms
execute <unnamed>: SELECT  FROM foo ORDER BY a
    ;
2025-04-18 13:36:39.852 CDT [96237] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp96237.0", size 202039296
2025-04-18 13:36:40.736 CDT [96237] LOG:  duration: 0.394 ms  parse
<unnamed>: SELECT 1
    ;
2025-04-18 13:36:40.736 CDT [96237] LOG:  duration: 0.133 ms  bind
<unnamed>: SELECT 1
    ;
2025-04-18 13:36:40.736 CDT [96237] LOG:  duration: 0.023 ms  execute
<unnamed>: SELECT 1
    ;
```

I think the DropPortal is happening later in the JDBC case, and only
after the next query is
executed, maybe? So your patch does take care of the problem because
it ensures that
the drop portal, when it occurs, is referencing the correct sql. I am
not yet sure if the patch
the right solution yet, it maybe the best answer.

I don't have a better answer, but wanted to share this research as well.

--
Sami Imseih
Amazon Web Services (AWS)

Attachment
I looked into this a bit more.

What occurs during exec_bind_message is that the debug_query_string of the
query is set early on. Later in that routine, a new portal is created
via CreatePortal,
which also drops the existing unnamed portal from the previous
execution and which
also logs the temp file information. So the logging is now using the
debug_query_string
of the current query and not the one referenced by the to-be-dropped portal.

This behavior means that there is a delay in log_temp_files logging,
since the portal
must be dropped, which can occur after the statement has completed
execution. We can look
into improving that, but I am not sure whether it is possible or can
be done safely.

I think the solution proposed by Frédéric seems reasonable: to switch
the debug_query_string
inside PortalDrop. However, I do not like the way the
debug_query_string changes values
after the CreatePortal call inside exec_bind_message; that seems incorrect.
So, I believe we should temporarily switch the debug_query_string
value only while
running PortalDrop. Attached is what I think could be safer to do.
What do you think?

--
Sami Imseih
Amazon Web Services (AWS)

Attachment
Sami Imseih <samimseih@gmail.com> writes:
> I think the solution proposed by Frédéric seems reasonable: to switch
> the debug_query_string
> inside PortalDrop. However, I do not like the way the
> debug_query_string changes values
> after the CreatePortal call inside exec_bind_message; that seems incorrect.
> So, I believe we should temporarily switch the debug_query_string
> value only while
> running PortalDrop. Attached is what I think could be safer to do.
> What do you think?

I don't think this is safe at all.  The portal's query string
is potentially shorter-lived than the portal, see in particular
exec_simple_query which just passes a pointer to the original
string in MessageContext.

I'm frankly inclined to do nothing, but if we must do something,
the way to fix it here would be to transiently set debug_query_string
to NULL so that the actions of PortalDrop aren't blamed on any
particular query.  (In my testing, the "temporary file:" message comes
out without any attached STATEMENT most of the time already, so this
isn't losing much as far as that's concerned.)

The whole thing is just a band-aid, though.  debug_query_string
is not the only indicator of what the backend is currently doing.
If somebody comes along and complains that the pg_stat_activity
entry doesn't reflect what's happening, are we going to take
that seriously?

Perhaps a cleaner answer is to rearrange things in postgres.c
so that if there's a pre-existing unnamed portal, we drop that
before we ever set debug_query_string and friends at all.
This would add an extra portal hashtable lookup, but I'm not sure
if that would be measurable or not.  (Possibly we could get that
back by simplifying CreatePortal's API so that it doesn't need to
perform an initial lookup.)

            regards, tom lane



> Sami Imseih <samimseih@gmail.com> writes:
> > I think the solution proposed by Frédéric seems reasonable: to switch
> > the debug_query_string
> > inside PortalDrop. However, I do not like the way the
> > debug_query_string changes values
> > after the CreatePortal call inside exec_bind_message; that seems incorrect.
> > So, I believe we should temporarily switch the debug_query_string
> > value only while
> > running PortalDrop. Attached is what I think could be safer to do.
> > What do you think?
>
> I don't think this is safe at all.  The portal's query string
> is potentially shorter-lived than the portal, see in particular
> exec_simple_query which just passes a pointer to the original
> string in MessageContext.

Yes, you are correct. The comments about MessageContect in
mmgr/README give me pause.

"....
This
is kept separate from per-transaction and per-portal contexts because a
query string might need to live either a longer or shorter time than any
single transaction or portal.
"""

> I'm frankly inclined to do nothing, but if we must do something,
> the way to fix it here would be to transiently set debug_query_string
> to NULL so that the actions of PortalDrop aren't blamed on any
> particular query.  (In my testing, the "temporary file:" message comes
> out without any attached STATEMENT most of the time already, so this
> isn't losing much as far as that's concerned.)

moreover, as I was looking into why the case mentioned earlier

```
SELECT FROM foo ORDER BY a \bind
;
SELECT 1 \bind
;
```

does not show STATEMENT after the temp file logging, I realized
it's because the temp files are cleaned up and reported at
the end of transaction, which means that debug_query_string is NULL at the
time the portal is dropped in the next query.  This causes
check_log_of_query to return false.

/* query string available? */
if (debug_query_string == NULL)
      return false;

> Perhaps a cleaner answer is to rearrange things in postgres.c
> so that if there's a pre-existing unnamed portal, we drop that
> before we ever set debug_query_string and friends at all.

That did cross my mind as well, but I was trying to avoid doing this
type of rearranging. I still rather not go down that path considering the
case mentioned above will still not display the query text in a STATEMENT log.

> I'm frankly inclined to do nothing, but if we must do something,
> the way to fix it here would be to transiently set debug_query_string
> to NULL so that the actions of PortalDrop aren't blamed on any
> particular query.

I think this is better, because I rather we avoid lines like the below in which
there are temp files being reported all with STATEMENT logging
of a different query. It's better to just not show STATEMENT at all.

```
2025-04-19 16:44:38.082 CDT [38963] STATEMENT:  SELECT 1
2025-04-19 16:44:38.115 CDT [38963] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.1", size 1073741824
2025-04-19 16:44:38.115 CDT [38963] STATEMENT:  SELECT 1
2025-04-19 16:44:38.149 CDT [38963] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.2", size 1073741824
2025-04-19 16:44:38.149 CDT [38963] STATEMENT:  SELECT 1
2025-04-19 16:44:38.305 CDT [38963] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.3", size 1073741824
2025-04-19 16:44:38.305 CDT [38963] STATEMENT:  SELECT 1
2025-04-19 16:44:38.558 CDT [38963] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.4", size 1073741824
2025-04-19 16:44:38.558 CDT [38963] STATEMENT:  SELECT 1
2025-04-19 16:44:38.744 CDT [38963] LOG:  temporary file: path
"base/pgsql_tmp/pgsql_tmp38963.5", size 1073741824
```


--
Sami Imseih
Amazon Web Services (AWS)