Thread: query ID goes missing with extended query protocol

query ID goes missing with extended query protocol

From
Robert Haas
Date:
Hi,

Suppose you set compute_query_id=on and then execute a query using
either the simple or the extended query protocol. With the simple
query protocol, the query ID is advertised during query execution;
with the extended query protocol, it's not. Here's a test case to
demonstrate:

robert.haas=# select query_id from pg_stat_activity where pid =
pg_backend_pid() \g
      query_id
---------------------
 4332952175652079452
(1 row)

robert.haas=# select query_id from pg_stat_activity where pid =
pg_backend_pid() \bind \g
 query_id
----------

(1 row)

I found this surprising and decided to investigate why it was
happening. It turns out that exec_{parse,bind,execute}_message() all
start by calling pgstat_report_activity(STATE_RUNNING, ...) which
resets the advertised query ID, which seems fair enough. After that,
exec_parse_message() does parse analysis, which results in a call to
pgstat_report_query_id() that advertises the correct query ID. After
resetting the advertised query ID initially, exec_bind_message() calls
PortalStart() which calls ExecutorStart() which re-advertises the same
query ID that was computed at parse analysis time. At execute time, we
call PortalRun() which calls ExecutorRun() which does NOT re-advertise
the saved query ID. But as far as I can see, that's just an oversight
in commit 4f0b0966c86, which added this hunk in ExecutorStart:

+    /*
+     * In some cases (e.g. an EXECUTE statement) a query execution will skip
+     * parse analysis, which means that the queryid won't be reported.  Note
+     * that it's harmless to report the queryid multiple time, as the call will
+     * be ignored if the top level queryid has already been reported.
+     */
+    pgstat_report_queryid(queryDesc->plannedstmt->queryId, false);

But I think that the entire first sentence of this comment is just
wrong. Query execution can't skip parse analysis; parse analysis has
to be performed before planning and planning has to be performed
before execution. So any time parse analysis computes the query ID, we
should have it at execution time, as long as it got saved into the
PlannedStmt (and why would we ever intentionally skip that?). The
comment is also wrong about the consequences: this not only affects
the EXECUTE statement but also an Execute protocol message, hence the
behavior demonstrated above.

I propose that we should:

- Add a call to
pgstat_report_query_id(queryDesc->plannedstmt->queryId, false) to the
top of ExecutorRun() with an appropriate comment.
- Fix the incorrect comment mentioned above.
- Back-patch to all releases containing 4f0b0966c86 i.e. v14+.

Thoughts?

-- 
Robert Haas
EDB: http://www.enterprisedb.com



Re: query ID goes missing with extended query protocol

From
Robert Haas
Date:
On Wed, Aug 28, 2024 at 6:48 PM Michael Paquier <michael@paquier.xyz> wrote:
> This is being discussed already on a different thread:
> - Thread: https://www.postgresql.org/message-id/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
> - CF entry: https://commitfest.postgresql.org/49/4963/
>
> There is a patch proposed there, that attempts to deal with the two
> cases of a portal where ExecutorRun() is called once while holding a
> tuplestore and where ExecutorRun() is called multiple times.  A few
> approaches have been discussed for the tests, where the new psql
> metacommands added in d55322b0da60 should become handy.  That's on my
> TODO spreadsheet of things to look at, but I did not come back to it
> yet.

That's interesting, but it sort of seems like it's reinventing the
wheel, vs. the one-line change that I proposed.

--
Robert Haas
EDB: http://www.enterprisedb.com