Re: refreshing query id for pg_stat_statements based on comment in sql - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: refreshing query id for pg_stat_statements based on comment in sql
Date
Msg-id 20220308014233.lqxevlklbzdhyx2r@jrouhaud
Whole thread Raw
In response to refreshing query id for pg_stat_statements based on comment in sql  (Zhihong Yu <zyu@yugabyte.com>)
List pgsql-hackers
Hi,

On Mon, Mar 07, 2022 at 09:42:26AM -0800, Zhihong Yu wrote:
> Hi,
> Currently the query id for pg_stat_statements gets calculated based on the
> parse nodes specifics.
> This means that the user cannot add a comment to a SQL query to test
> something. (though some other RDBMS allows this practice).
>
> Consider this use case: for query q, admin looks at stats and performs some
> optimization (without changing the query). Admin adds / modifies the
> comment for q - now the query becomes q'. If query id doesn't change, there
> still would be one row in pg_stat_statements which makes it difficult to
> gauge the effectiveness of the tuning.
>
> I want to get opinion from the community whether adding / changing comment
> in SQL query should result in new query id for pg_stat_statements.

Are you talking about optimizer hint with something like pg_hint_plan, or just
random comment like "/* we now added index blabla */ SELECT ..."?

If the former, then such an extension can already provide its own queryid
generator which can chose to ignore part or all of the comments or not.

If the latter, then it seems shortsighted to me.  At the very least not all
application can be modified to have a specific comment attached to a query.

Also, if you want check how a query if performing after doing some
modifications, you should start with some EXPLAIN ANALYZE first (or even a
simple EXPLAIN if you want to validate some new index using hypothetical
indexes).  If this is some more general change (e.g. shared_buffers,
work_mem...) then the whole system is going to perform differently, and you
certainly won't add a new comment to every single query executed.

So again it seems to me that doing pg_stat_statement snapshots and comparing
the diff between each to see how the whole workload, or specific queries, is
behaving is still the best answer here.



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_tablespace_location() failure with allow_in_place_tablespaces
Next
From: Andres Freund
Date:
Subject: Re: Time to drop plpython2?