Thread: refreshing query id for pg_stat_statements based on comment in sql
Hi,
Currently the query id for pg_stat_statements gets calculated based on the parse nodes specifics.
This means that the user cannot add a comment to a SQL query to test something. (though some other RDBMS allows this practice).
Consider this use case: for query q, admin looks at stats and performs some optimization (without changing the query). Admin adds / modifies the comment for q - now the query becomes q'. If query id doesn't change, there still would be one row in pg_stat_statements which makes it difficult to gauge the effectiveness of the tuning.
I want to get opinion from the community whether adding / changing comment in SQL query should result in new query id for pg_stat_statements.
Cheers
On Mon, Mar 7, 2022 at 09:42:26AM -0800, Zhihong Yu wrote: > Hi, > Currently the query id for pg_stat_statements gets calculated based on the > parse nodes specifics. > This means that the user cannot add a comment to a SQL query to test something. > (though some other RDBMS allows this practice). > > Consider this use case: for query q, admin looks at stats and performs some > optimization (without changing the query). Admin adds / modifies the comment > for q - now the query becomes q'. If query id doesn't change, there still would > be one row in pg_stat_statements which makes it difficult to gauge the > effectiveness of the tuning. > > I want to get opinion from the community whether adding / changing comment in > SQL query should result in new query id for pg_stat_statements. Uh, we don't have a parse node for comments, and I didn't think comments were part of the query id, and my testing confirms that: psql -c "SET log_statement = 'all'" -c "select pg_sleep(10000) -- test1;" test psql -c "SET log_statement = 'all'" -c "select pg_sleep(10000) -- test2;" test shows the comment in the logs: 2022-03-07 19:02:19.509 EST [1075860] LOG: statement: select pg_sleep(10000) -- test1; 2022-03-07 19:02:24.389 EST [1075860] ERROR: canceling statement due to user request 2022-03-07 19:02:24.389 EST [1075860] STATEMENT: select pg_sleep(10000) -- test1; 2022-03-07 19:02:27.029 EST [1075893] LOG: statement: select pg_sleep(10000) -- test2; 2022-03-07 19:02:47.915 EST [1075893] ERROR: canceling statement due to user request 2022-03-07 19:02:47.915 EST [1075893] STATEMENT: select pg_sleep(10000) -- test2; and I see the same query_id for both: test=> select query, query_id from pg_stat_activity; query | query_id -----------------------------------------------+---------------------- | | --> select pg_sleep(10000) -- test1; | 2920433178127795318 select query, query_id from pg_stat_activity; | -8032661921273433383 | | | (7 rows) test=> select query, query_id from pg_stat_activity; query | query_id -----------------------------------------------+---------------------- | | --> select pg_sleep(10000) -- test2; | 2920433178127795318 select query, query_id from pg_stat_activity; | -8032661921273433383 I think you need to show us the problem you are having. -- Bruce Momjian <bruce@momjian.us> https://momjian.us EDB https://enterprisedb.com If only the physical world exists, free will is an illusion.
Hi, On Mon, Mar 07, 2022 at 09:42:26AM -0800, Zhihong Yu wrote: > Hi, > Currently the query id for pg_stat_statements gets calculated based on the > parse nodes specifics. > This means that the user cannot add a comment to a SQL query to test > something. (though some other RDBMS allows this practice). > > Consider this use case: for query q, admin looks at stats and performs some > optimization (without changing the query). Admin adds / modifies the > comment for q - now the query becomes q'. If query id doesn't change, there > still would be one row in pg_stat_statements which makes it difficult to > gauge the effectiveness of the tuning. > > I want to get opinion from the community whether adding / changing comment > in SQL query should result in new query id for pg_stat_statements. Are you talking about optimizer hint with something like pg_hint_plan, or just random comment like "/* we now added index blabla */ SELECT ..."? If the former, then such an extension can already provide its own queryid generator which can chose to ignore part or all of the comments or not. If the latter, then it seems shortsighted to me. At the very least not all application can be modified to have a specific comment attached to a query. Also, if you want check how a query if performing after doing some modifications, you should start with some EXPLAIN ANALYZE first (or even a simple EXPLAIN if you want to validate some new index using hypothetical indexes). If this is some more general change (e.g. shared_buffers, work_mem...) then the whole system is going to perform differently, and you certainly won't add a new comment to every single query executed. So again it seems to me that doing pg_stat_statement snapshots and comparing the diff between each to see how the whole workload, or specific queries, is behaving is still the best answer here.