Re: [BUG] temporary file usage report with extended protocol and unnamed portals - Mailing list pgsql-hackers

From Sami Imseih
Subject Re: [BUG] temporary file usage report with extended protocol and unnamed portals
Date
Msg-id CAA5RZ0tcK=F+GBVpdFKPPDZBpq=vbQ=qUgv2WdxyutGOYs8NFg@mail.gmail.com
Whole thread Raw
In response to Re: [BUG] temporary file usage report with extended protocol and unnamed portals  (Michael Paquier <michael@paquier.xyz>)
Responses Re: [BUG] temporary file usage report with extended protocol and unnamed portals
List pgsql-hackers
> > After thinking about this a bit more, I found the test that breaks
> > with v12. It is a bind statement in an implicit transaction. The
> > portal will get dropped by the end of the transaction and will not
> > reach drop_unnamed_portal. So, v13 takes Frederic's original idea,
> > saves the pointer of debug_query_string, and resets it at the end of
> > DropPortal. I also enhanced the test coverage.
> >
> > Debugging also shows that the STATEMENT: log is formed while we are
> > in ExecutorEnd. We know that other extensions rely on
> > QueryDesc->sourceText in that hook (i.e., pg_stat_statements), so
> > this is another reason this approach appears safe, in addition to
> > what was mentioned here [0] about the MessageContext outliving the
> > portal.
> >
> > [0] https://www.postgresql.org/message-id/CAA5RZ0vB-h2pFimPSi72ObWfpRwKR5kaN9XWW17TOkLntC9svA%40mail.gmail.com
>
> Hmm.  We are running in circles here, based on the argument that the
> drop of an unnamed portal happens when the next BIND command shows
> up, and the fact that it is an old protocol artifact that we have been
> leaving with for ages in the backend engine for efficiency.

Yes, and with the correct tests, we ultimately returned to the original idea:
temporarily switching the debug_query_string inside PortalDrop proved
to be the only viable solution. Although, this was also the approach that
Tom early-on was against due to the possibility of "The portal's query string
is potentially shorter-lived than the portal" [0], although the code
comments [1]
say otherwise. I have been able to find a test case that breaks due to
this concern.

> I do agree about the point that it may be better to encourage drivers
> to not use unnamed portals, but at the same time I don't recall
> anybody complaining about the fact that a statement string should
> absolutely match with the statement where a temporary file has been
> created and should be linked to it.

Tools like pgbadger show the wrong information with the current
output. I am surprised with the lack of complaints.

> Thinking about this problem in a twisted way, could there be an
> argument in favor of the existing logic as it is?  It is true that the
> cleanup happens when the next bind query happens.  So, in fact, one
> could also say that it makes sense to reflect when a temp file is
> cleaned up and what's the query being processed that does the cleanup.

One could possibly make this claim, based on the comment of what
the STATEMENT: log actually does.
```
/*
* If the user wants the query that generated this error logged, do it.
*/
if (check_log_of_query(edata))
{
log_line_prefix(&buf, edata);
appendStringInfoString(&buf, _("STATEMENT: "));
```
But, from a temporary file logging perspective, this is very confusing
for the user.

> In short, I would be inclined to do nothing here,

Maybe we document this behavior? [2]

> but I do see an
> argument in favor of the tests, particularly to track the fact that
> the unnamed portal drop happens in the next BIND query, and looking

+1


[0] https://www.postgresql.org/message-id/23969.1745083695%40sss.pgh.pa.us
[1] https://www.postgresql.org/message-id/CAA5RZ0vB-h2pFimPSi72ObWfpRwKR5kaN9XWW17TOkLntC9svA%40mail.gmail.com
[2] https://www.postgresql.org/docs/current/runtime-config-logging.html#GUC-LOG-TEMP-FILES

--
Sami



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: plan shape work
Next
From: Richard Guo
Date:
Subject: Re: plan shape work