Home > mailing lists

Re: query_id, pg_stat_activity, extended query protocol - Mailing list pgsql-hackers

From	Sami Imseih
Subject	Re: query_id, pg_stat_activity, extended query protocol
Date	July 23, 2024 21:00:25
Msg-id	24F4B22C-D188-4E9E-A165-2FD7AA17ADC7@gmail.com Whole thread Raw
In response to	Re: query_id, pg_stat_activity, extended query protocol (Michael Paquier <michael@paquier.xyz>)
Responses	Re: query_id, pg_stat_activity, extended query protocol
List	pgsql-hackers

Tree view

> On Wed, Jul 17, 2024 at 11:32:49AM +0200, Anthonin Bonnefoy wrote:
>> Wouldn't it be enough to call pgstat_report_query_id in ExecutorRun
>> and ProcessUtility? With those changes [1], both normal statements and
>> utility statements called through extended protocol will correctly
>> report the query_id.
>
> Interesting, and this position is really tempting.  By doing so you
> would force the query ID to be set to the one from the CTAS and
> EXPLAIN, because these would be executed before the inner queries, and
> pgstat_report_query_id() with its non-force option does not overwrite
> what would be already set (aka what should be the top-level query ID).
>
> Using ExecutorRun() feels consistent with the closest thing I've
> touched in this area lately in 1d477a907e63, because that's the only
> code path that we are sure to take depending on the portal execution
> (two execution scenarios depending on how rows are retrieved, as far
> as I recall).  The comment should be perhaps more consistent with the
> executor start counterpart.  So I would be OK with that..  The
> location before the hook of ProcessUtility is tempting, as it would
> take care of the case of PortalRunMulti().  However..  Joining with a
> point from Sami upthread..
>
> This is still not enough in the case of where we have a holdStore, no?
> This is the case where we would do *one* ExecutorRun(), followed up by
> a scan of the tuplestore in more than one execute message.  The v2
> proposed upthread, by positioning a query ID to be set in
> PortalRunSelect(), is still positioning that in two places.

Correct, I also don’t think ExecutorRun is enough. Another reason is we should also
be setting the queryId during bind, right before planning starts.
Planning could have significant impact on the server and I think we better
track the responsible queryId.

I have not tested the holdStore case.  IIUC the holdStore deals with fetching a
WITH HOLD CURSOR. Why would this matter for this conversation?

> Hmm...  How about being much more aggressive and just do the whole
> business in exec_execute_message(), just before we do the PortalRun()?
> I mean, that's the source of all our problems, and we know the
> statements that the portal will work on so we could go through the
> list, grab the first planned query and set the query ID based on that,
> without caring about the portal patterns we would need to think about.

Doing the work in exec_execute_message makes sense, although maybe
setting the queryId after pgstat_report_activity is better because it occurs earlier.
Also, we should do the same for exec_bind_message and set the queryId
right after pgstat_report_activity in this function as well.

We do have to account for the queryId changing after cache revalidation, so
we should still set the queryId inside GetCachedPlan in the case the query
underwent re-analysis. This means there is a chance that a queryId set at
the start of the exec_bind_message may be different by the time we complete
the function, in the case the query revalidation results in a different queryId.

See the attached v3.

Regards,

Sami

Attachment

v3-0001-Fix-queryId-reporting-in-Extended-Query-Protocol.patch

pgsql-hackers by date:

From: Jeff Davis
Date: 23 July 2024, 20:43:18
Subject: Re: [18] Policy on IMMUTABLE functions and Unicode updates

From: Tom Lane
Date: 23 July 2024, 21:11:25
Subject: Re: [18] Policy on IMMUTABLE functions and Unicode updates

Re: query_id, pg_stat_activity, extended query protocol - Mailing list pgsql-hackers

Attachment

Previous

Next