Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activityview? - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activityview?
Date
Msg-id CAOBaU_aQKkB3HehOp820ssJXBQbGiAQjxvB_8ynNG2g4MbUUjA@mail.gmail.com
Whole thread Raw
In response to Re: Feature improvement: can we add queryId forpg_catalog.pg_stat_activity view?  (Michael Paquier <michael@paquier.xyz>)
Responses Re: Feature improvement: can we add queryId forpg_catalog.pg_stat_activity view?
List pgsql-hackers
Thanks for looking at it!

On Wed, Sep 11, 2019 at 6:45 AM Michael Paquier <michael@paquier.xyz> wrote:
>
> An invalid query ID is assumed to be 0 in the patch, per the way it is
> defined in pg_stat_statements.  However this also maps with the case
> where we have a utility statement.

Oh indeed.  Which means that if a utility statements later calls
parse_analyze or friends, this patch would report an unexpected
queryid.  That's at least possible for something like

COPY (SELECT * FROM tbl) TO ...

The thing is that pg_stat_statements assigns a 0 queryid in the
post_parse_analyze_hook to recognize utility statements and avoid
tracking instrumentation twice in case of utility statements, and then
compute a queryid base on a hash of the query text.  Maybe we could
instead fully reserve queryid "2" for utility statements (so forcing
queryid "1" for standard queries if jumbling returns 0 *or* 2 instead
of only 0), and use "2" as the identifier for utility statement
instead of "0"?

> +   /*
> +    * We only report the top-level query identifiers.  The stored queryid is
> +    * reset when a backend call pgstat_report_activity(STATE_RUNNING), or with
> +    * an explicit call to this function.  If the saved query identifier is not
> +    * zero it means that it's not a top-level command, so ignore the one
> +    * provided unless it's an explicit call to reset the identifier.
> +    */
> +   if (queryId != 0 && beentry->st_queryid != 0)
> +       return;
> Hmm.  I am wondering if we shouldn't have an API dedicated to the
> reset of the query ID.  That logic looks rather brittle..

How about adding a "bool force" parameter to allow resetting the queryid to 0?

> Wouldn't it be better (and more consistent) to update the query ID in
> parse_analyze_varparams() and parse_analyze() as well after going
> through the post_parse_analyze hook instead of pg_analyze_and_rewrite?

I thought about it without knowing what would be best.  I'll change to
report the queryid right after calling post_parse_analyze_hook then.

> +   /*
> +    * If a new query is started, we reset the query identifier as it'll only
> +    * be known after parse analysis, to avoid reporting last query's
> +    * identifier.
> +    */
> +   if (state == STATE_RUNNING)
> +       beentry->st_queryid = 0
> I don't quite get why you don't reset the counter in other cases as
> well.  If the backend entry is idle in transaction or in an idle
> state, it seems to me that we should not report the query ID of the
> last query run in the transaction.  And that would make the reset in
> exec_simple_query() unnecessary, no?

I'm reproducing the same behavior as for the query text, ie. showing
the information about the last executed query text if state is idle:

+     <entry><structfield>queryid</structfield></entry>
+     <entry><type>bigint</type></entry>
+     <entry>Identifier of this backend's most recent query. If
+      <structfield>state</structfield> is <literal>active</literal> this field
+      shows the identifier of the currently executing query. In all other
+      states, it shows the identifier of last query that was executed.

I think that showing the last executed query's queryid is as useful as
the query text.  Also, while avoiding a reset in exec_simple_query()
it'd be required to do such reset in case of error during query
execution, so that wouldn't make things quite simpler..



pgsql-hackers by date:

Previous
From: Michael Meskes
Date:
Subject: Re: A problem presentaion about ECPG, DECLARE STATEMENT
Next
From: Alvaro Herrera from 2ndQuadrant
Date:
Subject: Re: propagating replica identity to partitions