Thread: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17552 Logged by: Maxim Boguk Email address: maxim.boguk@gmail.com PostgreSQL version: 14.4 Operating system: Linux Description: pg_stat_statements track internal FK trigger check queries during data load via COPY ( pg_stat_statements.track=top (default) and pg_stat_statements.track_utility=off (not default) ) Tested on latest 13 and 14 versions. Test script (test.sql): drop table if exists t1 CASCADE; drop table if exists t2; create table t1 (id integer primary key); insert into t1 values (1); vacuum ANALYZE t1; create table t2 (id integer primary key, fk integer references t1(id)); create extension IF NOT EXISTS pg_stat_statements; select e.extname AS "Name", e.extversion AS "Version" FROM pg_catalog.pg_extension e where e.extname='pg_stat_statements'; show pg_stat_statements.track; show pg_stat_statements.track_utility; select pg_stat_statements_reset(); COPY public.t2 (id, fk) FROM stdin; 1 1 2 1 3 1 4 1 5 1 6 1 7 1 8 1 9 1 10 1 11 1 12 1 13 1 14 1 15 1 16 1 17 1 18 1 19 1 20 1 21 1 22 1 23 1 24 1 25 1 26 1 27 1 28 1 29 1 30 1 31 1 32 1 33 1 34 1 35 1 36 1 37 1 38 1 39 1 40 1 41 1 42 1 43 1 44 1 45 1 46 1 47 1 48 1 49 1 50 1 51 1 52 1 53 1 54 1 55 1 56 1 57 1 58 1 59 1 60 1 61 1 62 1 63 1 64 1 65 1 66 1 67 1 68 1 69 1 70 1 71 1 72 1 73 1 74 1 75 1 76 1 77 1 78 1 79 1 80 1 81 1 82 1 83 1 84 1 85 1 86 1 87 1 88 1 89 1 90 1 91 1 92 1 93 1 94 1 95 1 96 1 97 1 98 1 99 1 100 1 \. select query, calls from public.pg_stat_statements where dbid=(select oid from pg_database where datname=current_database()) order by calls desc limit 2; run: psql -e -f test.sql -d sometestdb Final output rows: select query, calls from public.pg_stat_statements where dbid=(select oid from pg_database where datname=current_database()) order by calls desc limit 2; query | calls ---------------------------------------------------------------------------------------------+------- SELECT $2 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1 FOR KEY SHARE OF x | 100 select pg_stat_statements_reset() | 1 (2 rows) Expected output: query | calls ---------------------------------------------------------------------------------------------+------- select pg_stat_statements_reset() | 1 (1 rows) PS: with pg_stat_statements.track_utility=on test work as expected and provides expected select query, calls from public.pg_stat_statements where dbid=(select oid from pg_database where datname=current_database()) order by calls desc limit 2; query | calls ------------------------------------+------- COPY public.t2 (id, fk) FROM stdin | 1 select pg_stat_statements_reset() | 1 Regards, Maxim
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Julien Rouhaud
Date:
Hi, On Sun, Jul 17, 2022 at 07:23:36PM +0000, PG Bug reporting form wrote: > > Bug reference: 17552 > Logged by: Maxim Boguk > Email address: maxim.boguk@gmail.com > PostgreSQL version: 14.4 > Operating system: Linux > Description: > > pg_stat_statements track internal FK trigger check queries during data load > via COPY > ( pg_stat_statements.track=top (default) and > pg_stat_statements.track_utility=off (not default) ) > Tested on latest 13 and 14 versions. Reproduced on current master too. > Test script (test.sql): > [...] > select query, calls from public.pg_stat_statements where dbid=(select oid > from pg_database where datname=current_database()) order by calls desc limit > 2; > > run: > psql -e -f test.sql -d sometestdb > > Final output rows: > select query, calls from public.pg_stat_statements where dbid=(select oid > from pg_database where datname=current_database()) order by calls desc limit > 2; > query > | calls > ---------------------------------------------------------------------------------------------+------- > SELECT $2 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1 > FOR KEY SHARE OF x | 100 > select pg_stat_statements_reset() > | 1 > (2 rows) > > > Expected output: > query > | calls > ---------------------------------------------------------------------------------------------+------- > select pg_stat_statements_reset() > | 1 > (1 rows) I agree with you that the current behavior seems wrong. If we ignore a utility statement, it doesn't mean that any standard query executed underneath should become a top-level statement as it's the case now. I see that there's an existing test that actually relies on that behavior, which is an anonymous block code like: DO LANGUAGE plpgsql $$ BEGIN -- this is a SELECT PERFORM 'hello world'::TEXT; END; $$; which is expected to reports a "SELECT $1::text" with pg_stat_statements.track = top and track_utility = off, and also doesn't seem right. That test was added in 83f2061dd03, apparently covered by "Also, significantly expand pg_stat_statements' regression test script", but without any discussion on that specific behavior as far as I can see after a quick look. Assuming that we want to fix it, simply increasing the exec_nested_level in the fallback calls to prev_ProcessUtility/standard_ProcessUtility for handled utility commands should get the correct behavior.
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Sergei Kornilov
Date:
Hello I have a similar problem with exactly the same reason. Queries are counted as top level, although they are not called atthe top level. Another small example: set pg_stat_statements.track_utility to off; set pg_stat_statements.track to 'top'; do $$ declare i int; begin select 1 as n into i; end $$ language plpgsql; select toplevel, query from pg_stat_statements order by query; toplevel | query ----------+----------------------------------- t | select $1 as n t | select pg_stat_statements_reset() (2 rows) But "select $1 as n" was not a top-level query. In the first patch, a testcase showing the behavior now. The second patch adds a nesting level increment if track_utilityis disabled. I need to duplicate the PGSS_HANDLED_UTILITY check here because it's documented above: > If it's an EXECUTE statement, we don't track it and don't increment the nesting level. regards, Sergei
Attachment
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Tomas Vondra
Date:
On 3/22/23 13:21, Sergei Kornilov wrote: > Hello > > I have a similar problem with exactly the same reason. Queries are counted as top level, although they are not called atthe top level. Another small example: > > set pg_stat_statements.track_utility to off; > set pg_stat_statements.track to 'top'; > do $$ declare i int; begin select 1 as n into i; end $$ language plpgsql; > select toplevel, query from pg_stat_statements order by query; > toplevel | query > ----------+----------------------------------- > t | select $1 as n > t | select pg_stat_statements_reset() > (2 rows) > > But "select $1 as n" was not a top-level query. > > In the first patch, a testcase showing the behavior now. The second patch adds a nesting level increment if track_utilityis disabled. I need to duplicate the PGSS_HANDLED_UTILITY check here because it's documented above: > >> If it's an EXECUTE statement, we don't track it and don't increment the nesting level. > I took a look, and the patch seems correct to me. It'll need better comments explaining the change, but I'll take care of that. Barring objections, I'll get this committed and backpatched. FWIW it's probably a good idea to post patches on pgsql-hackers - not everyone watches pgsql-bugs and/or CF app. Furthermore, when a patch is switched to "RFC" it's customary to mention that in the thread, along with a review. Otherwise these silent changes are quite puzzling. regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Sergei Kornilov
Date:
Hello > I took a look, and the patch seems correct to me. It'll need better > comments explaining the change, but I'll take care of that. Barring > objections, I'll get this committed and backpatched. Great, thank you! > FWIW it's probably a good idea to post patches on pgsql-hackers - not > everyone watches pgsql-bugs and/or CF app. Furthermore, when a patch is > switched to "RFC" it's customary to mention that in the thread, along > with a review. Otherwise these silent changes are quite puzzling. ok, I'll keep that in mind. yeah, RFC status was a surprise. regards, Sergei
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Tom Lane
Date:
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > I took a look, and the patch seems correct to me. It'll need better > comments explaining the change, but I'll take care of that. Barring > objections, I'll get this committed and backpatched. Since Tomas hadn't done anything with this, I picked it up with intent to commit, and got as far as revising the comments and eliminating some duplicative condition checks (see attached). However, I then started to wonder whether it's really a good idea that the code treats PREPARE and EXECUTE alike for this purpose. In the case of EXECUTE, the idea is that we'll still be at top level when we reach the executor hooks, and they will do the right thing and then increment exec_nested_level before any lower-level statement can be reached. But how does that concept apply to PREPARE, which won't reach the executor? ISTM the net result is that if any subsidiary statements are reached during PREPARE (perhaps via const-simplification of some SQL or PL function), we will erroneously treat them as top-level. It's worth noting that the comments here explicitly mention that suppressing the nesting level bump is desired for EXECUTE, while *not* saying that for PREPARE. I've not dug in the commit history, but I wonder if this isn't a bug somebody introduced during careless refactoring to invent the PGSS_HANDLED_UTILITY macro. It's not clear to me that that macro is anything but an invitation to sloppy thinking. In short, I think we probably want to bump the nest level for everything except EXECUTE, and I'm not sure that PGSS_HANDLED_UTILITY is worth keeping. Thoughts? BTW, I'm inclined not to back-patch this. While it seems like a bug fix, it's also a behavioral change that might break someone's expectations. regards, tom lane diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index d924c87b41..e7b29e9c99 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -67,6 +67,61 @@ SELECT toplevel, calls, query FROM pg_stat_statements t | 1 | SET pg_stat_statements.track = 'all' (7 rows) +-- DO block - top-level tracking without utility. +SET pg_stat_statements.track = 'top'; +SET pg_stat_statements.track_utility = FALSE; +SELECT pg_stat_statements_reset(); + pg_stat_statements_reset +-------------------------- + +(1 row) + +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + toplevel | calls | query +----------+-------+----------------------------------- + t | 1 | DELETE FROM stats_track_tab + t | 1 | SELECT pg_stat_statements_reset() +(2 rows) + +-- DO block - all-level tracking without utility. +SET pg_stat_statements.track = 'all'; +SELECT pg_stat_statements_reset(); + pg_stat_statements_reset +-------------------------- + +(1 row) + +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + toplevel | calls | query +----------+-------+----------------------------------- + f | 1 | DELETE FROM stats_track_tab + t | 1 | DELETE FROM stats_track_tab + f | 1 | SELECT $1::TEXT + t | 1 | SELECT pg_stat_statements_reset() +(4 rows) + -- PL/pgSQL function - top-level tracking. SET pg_stat_statements.track = 'top'; SET pg_stat_statements.track_utility = FALSE; diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 767b6ceb10..794c780f24 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -100,11 +100,22 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100; #define IS_STICKY(c) ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0) /* - * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze - * ignores. + * Test whether a utility statement is one that we should handle in + * pgss_ProcessUtility and pgss_post_parse_analyze. + * + * If it's an EXECUTE statement, we don't track it and don't increment the + * nesting level. This allows the cycles to be charged to the underlying + * PREPARE instead (by the Executor hooks), which is much more useful. + * + * We likewise don't track execution of PREPARE. If we did, we would get one + * hash table entry for the PREPARE (with hash calculated from the query + * string), and then a different one with the same query string (but hash + * calculated from the query tree) would be used to accumulate costs of + * ensuing EXECUTEs. This would be confusing, and inconsistent with other + * cases where planning time is not included at all. */ #define PGSS_HANDLED_UTILITY(n) (!IsA(n, ExecuteStmt) && \ - !IsA(n, PrepareStmt)) + !IsA(n, PrepareStmt)) /* * Extension version number, for supporting older extension versions' objects @@ -1097,6 +1108,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, uint64 saved_queryId = pstmt->queryId; int saved_stmt_location = pstmt->stmt_location; int saved_stmt_len = pstmt->stmt_len; + bool enabled = pgss_track_utility && pgss_enabled(exec_nested_level); + bool handled = PGSS_HANDLED_UTILITY(parsetree); /* * Force utility statements to get queryId zero. We do this even in cases @@ -1112,23 +1125,15 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, * that user configured another extension to handle utility statements * only. */ - if (pgss_enabled(exec_nested_level) && pgss_track_utility) + if (enabled) pstmt->queryId = UINT64CONST(0); /* - * If it's an EXECUTE statement, we don't track it and don't increment the - * nesting level. This allows the cycles to be charged to the underlying - * PREPARE instead (by the Executor hooks), which is much more useful. - * - * We also don't track execution of PREPARE. If we did, we would get one - * hash table entry for the PREPARE (with hash calculated from the query - * string), and then a different one with the same query string (but hash - * calculated from the query tree) would be used to accumulate costs of - * ensuing EXECUTEs. This would be confusing, and inconsistent with other - * cases where planning time is not included at all. + * If we're enabled and configured to track utility statements, and the + * statement isn't one that we choose not to handle, then measure its + * execution time. */ - if (pgss_track_utility && pgss_enabled(exec_nested_level) && - PGSS_HANDLED_UTILITY(parsetree)) + if (enabled && handled) { instr_time start; instr_time duration; @@ -1206,14 +1211,29 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, } else { - if (prev_ProcessUtility) - prev_ProcessUtility(pstmt, queryString, readOnlyTree, - context, params, queryEnv, - dest, qc); - else - standard_ProcessUtility(pstmt, queryString, readOnlyTree, + /* + * We're not tracking execution time; but unless the statement is one + * we ignore altogether, we still need to increment the nesting level. + */ + if (handled) + exec_nested_level++; + PG_TRY(); + { + if (prev_ProcessUtility) + prev_ProcessUtility(pstmt, queryString, readOnlyTree, context, params, queryEnv, dest, qc); + else + standard_ProcessUtility(pstmt, queryString, readOnlyTree, + context, params, queryEnv, + dest, qc); + } + PG_FINALLY(); + { + if (handled) + exec_nested_level--; + } + PG_END_TRY(); } } diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql index 0c20b8ce69..eab881a43b 100644 --- a/contrib/pg_stat_statements/sql/level_tracking.sql +++ b/contrib/pg_stat_statements/sql/level_tracking.sql @@ -33,6 +33,39 @@ END; $$; SELECT toplevel, calls, query FROM pg_stat_statements ORDER BY query COLLATE "C", toplevel; +-- DO block - top-level tracking without utility. +SET pg_stat_statements.track = 'top'; +SET pg_stat_statements.track_utility = FALSE; +SELECT pg_stat_statements_reset(); +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + +-- DO block - all-level tracking without utility. +SET pg_stat_statements.track = 'all'; +SELECT pg_stat_statements_reset(); +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + -- PL/pgSQL function - top-level tracking. SET pg_stat_statements.track = 'top'; SET pg_stat_statements.track_utility = FALSE;
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Tom Lane
Date:
I wrote: > However, I then started to wonder whether it's really a good idea > that the code treats PREPARE and EXECUTE alike for this purpose. > In the case of EXECUTE, the idea is that we'll still be at top > level when we reach the executor hooks, and they will do the right > thing and then increment exec_nested_level before any lower-level > statement can be reached. But how does that concept apply to > PREPARE, which won't reach the executor? ISTM the net result > is that if any subsidiary statements are reached during PREPARE > (perhaps via const-simplification of some SQL or PL function), > we will erroneously treat them as top-level. Poking at that, I found that (a) PREPARE doesn't run the planner, so probably nested statements are impossible, and (b) if we remove the check so that it bumps the nesting level, then some existing regression test outputs change. PREPARE does run parse analysis of the contained statement, and it seems that that's sensitive to the is-top-level state. So we do need to keep the exclusion for PREPARE. However, I'm still fairly down on PGSS_HANDLED_UTILITY, because I think the rationale for treating PREPARE and EXECUTE specially is different in each place where that's being used, so tying them together is more likely to cause future bugs than prevent bugs. So v3-0001 attached revises the patch per those ideas. It's only cosmetically different from before, but I think the explanatory comments are better. Meanwhile, I realized that we have a second set of bugs. As I said above, it's possible to reach nested statements during planning, as a consequence of const-simplification of an immutable or stable function. pg_stat_statements generally does the wrong thing here, incorrectly treating such statements as top-level if we weren't already nested. This needs to be fixed by including plan_nest_level as a reason to consider execution to be nested too. What's more, testing this showed that pgss_planner has the same bug as pgss_ProcessUtility: it needs to bump the nesting level even if it chooses not to track planning time. v3-0002 fixes this in a minimally invasive way by replacing each check of exec_nested_level by plan_nested_level + exec_nested_level. That's just for review/testing though. What I think we actually ought to do is fold plan_nested_level and exec_nested_level into a single variable nested_level (or nesting_level would be better English). There's no reason to keep them separate; it just adds cycles, complexity, and more risk of mistakes. Comments? regards, tom lane diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index d924c87b41..e7b29e9c99 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -67,6 +67,61 @@ SELECT toplevel, calls, query FROM pg_stat_statements t | 1 | SET pg_stat_statements.track = 'all' (7 rows) +-- DO block - top-level tracking without utility. +SET pg_stat_statements.track = 'top'; +SET pg_stat_statements.track_utility = FALSE; +SELECT pg_stat_statements_reset(); + pg_stat_statements_reset +-------------------------- + +(1 row) + +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + toplevel | calls | query +----------+-------+----------------------------------- + t | 1 | DELETE FROM stats_track_tab + t | 1 | SELECT pg_stat_statements_reset() +(2 rows) + +-- DO block - all-level tracking without utility. +SET pg_stat_statements.track = 'all'; +SELECT pg_stat_statements_reset(); + pg_stat_statements_reset +-------------------------- + +(1 row) + +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + toplevel | calls | query +----------+-------+----------------------------------- + f | 1 | DELETE FROM stats_track_tab + t | 1 | DELETE FROM stats_track_tab + f | 1 | SELECT $1::TEXT + t | 1 | SELECT pg_stat_statements_reset() +(4 rows) + -- PL/pgSQL function - top-level tracking. SET pg_stat_statements.track = 'top'; SET pg_stat_statements.track_utility = FALSE; diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index 767b6ceb10..da5ef7d0b7 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -99,13 +99,6 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100; #define USAGE_DEALLOC_PERCENT 5 /* free this % of entries at once */ #define IS_STICKY(c) ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0) -/* - * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze - * ignores. - */ -#define PGSS_HANDLED_UTILITY(n) (!IsA(n, ExecuteStmt) && \ - !IsA(n, PrepareStmt)) - /* * Extension version number, for supporting older extension versions' objects */ @@ -840,12 +833,14 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate) return; /* - * Clear queryId for prepared statements related utility, as those will - * inherit from the underlying statement's one. + * If it's EXECUTE, clear the queryId so that stats will accumulate for + * the underlying PREPARE. But don't do this if we're not tracking + * utility statements, to avoid messing up another extension that might be + * tracking them. */ if (query->utilityStmt) { - if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt)) + if (pgss_track_utility && IsA(query->utilityStmt, ExecuteStmt)) { query->queryId = UINT64CONST(0); return; @@ -1097,6 +1092,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, uint64 saved_queryId = pstmt->queryId; int saved_stmt_location = pstmt->stmt_location; int saved_stmt_len = pstmt->stmt_len; + bool enabled = pgss_track_utility && pgss_enabled(exec_nested_level); /* * Force utility statements to get queryId zero. We do this even in cases @@ -1112,7 +1108,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, * that user configured another extension to handle utility statements * only. */ - if (pgss_enabled(exec_nested_level) && pgss_track_utility) + if (enabled) pstmt->queryId = UINT64CONST(0); /* @@ -1124,11 +1120,13 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, * hash table entry for the PREPARE (with hash calculated from the query * string), and then a different one with the same query string (but hash * calculated from the query tree) would be used to accumulate costs of - * ensuing EXECUTEs. This would be confusing, and inconsistent with other - * cases where planning time is not included at all. + * ensuing EXECUTEs. This would be confusing. Since PREPARE doesn't + * actually run the planner (only parse+rewrite), its costs are generally + * pretty negligible and it seems okay to just ignore it. */ - if (pgss_track_utility && pgss_enabled(exec_nested_level) && - PGSS_HANDLED_UTILITY(parsetree)) + if (enabled && + !IsA(parsetree, ExecuteStmt) && + !IsA(parsetree, PrepareStmt)) { instr_time start; instr_time duration; @@ -1206,14 +1204,41 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, } else { - if (prev_ProcessUtility) - prev_ProcessUtility(pstmt, queryString, readOnlyTree, - context, params, queryEnv, - dest, qc); - else - standard_ProcessUtility(pstmt, queryString, readOnlyTree, + /* + * Even though we're not tracking execution time for this statement, + * we must still increment the nesting level, to ensure that functions + * evaluated within it are not seen as top-level calls. But don't do + * so for EXECUTE; that way, when control reaches pgss_planner or + * pgss_ExecutorStart, we will treat the costs as top-level if + * appropriate. Likewise, don't bump for PREPARE, so that parse + * analysis will treat the statement as top-level if appropriate. + * + * To be absolutely certain we don't mess up the nesting level, + * evaluate the bump_level condition just once. + */ + bool bump_level = + !IsA(parsetree, ExecuteStmt) && + !IsA(parsetree, PrepareStmt); + + if (bump_level) + exec_nested_level++; + PG_TRY(); + { + if (prev_ProcessUtility) + prev_ProcessUtility(pstmt, queryString, readOnlyTree, context, params, queryEnv, dest, qc); + else + standard_ProcessUtility(pstmt, queryString, readOnlyTree, + context, params, queryEnv, + dest, qc); + } + PG_FINALLY(); + { + if (bump_level) + exec_nested_level--; + } + PG_END_TRY(); } } diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql index 0c20b8ce69..eab881a43b 100644 --- a/contrib/pg_stat_statements/sql/level_tracking.sql +++ b/contrib/pg_stat_statements/sql/level_tracking.sql @@ -33,6 +33,39 @@ END; $$; SELECT toplevel, calls, query FROM pg_stat_statements ORDER BY query COLLATE "C", toplevel; +-- DO block - top-level tracking without utility. +SET pg_stat_statements.track = 'top'; +SET pg_stat_statements.track_utility = FALSE; +SELECT pg_stat_statements_reset(); +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + +-- DO block - all-level tracking without utility. +SET pg_stat_statements.track = 'all'; +SELECT pg_stat_statements_reset(); +DELETE FROM stats_track_tab; +DO $$ +BEGIN + DELETE FROM stats_track_tab; +END; $$; +DO LANGUAGE plpgsql $$ +BEGIN + -- this is a SELECT + PERFORM 'hello world'::TEXT; +END; $$; +SELECT toplevel, calls, query FROM pg_stat_statements + ORDER BY query COLLATE "C", toplevel; + -- PL/pgSQL function - top-level tracking. SET pg_stat_statements.track = 'top'; SET pg_stat_statements.track_utility = FALSE; diff --git a/contrib/pg_stat_statements/expected/level_tracking.out b/contrib/pg_stat_statements/expected/level_tracking.out index e7b29e9c99..0b94c71c9c 100644 --- a/contrib/pg_stat_statements/expected/level_tracking.out +++ b/contrib/pg_stat_statements/expected/level_tracking.out @@ -173,6 +173,31 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; 1 | 1 | SELECT pg_stat_statements_reset() (3 rows) +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; +SELECT PLUS_THREE(8); + plus_three +------------ + 11 +(1 row) + +SELECT PLUS_THREE(10); + plus_three +------------ + 13 +(1 row) + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; + toplevel | calls | rows | query +----------+-------+------+------------------------------------------------------------------------------ + t | 2 | 2 | SELECT PLUS_ONE($1) + t | 2 | 2 | SELECT PLUS_THREE($1) + t | 2 | 2 | SELECT PLUS_TWO($1) + t | 1 | 3 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" + t | 1 | 1 | SELECT pg_stat_statements_reset() +(5 rows) + -- PL/pgSQL function - all-level tracking. SET pg_stat_statements.track = 'all'; SELECT pg_stat_statements_reset(); @@ -184,6 +209,7 @@ SELECT pg_stat_statements_reset(); -- we drop and recreate the functions to avoid any caching funnies DROP FUNCTION PLUS_ONE(INTEGER); DROP FUNCTION PLUS_TWO(INTEGER); +DROP FUNCTION PLUS_THREE(INTEGER); -- PL/pgSQL function CREATE FUNCTION PLUS_TWO(i INTEGER) RETURNS INTEGER AS $$ DECLARE @@ -229,7 +255,34 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; 1 | 1 | SELECT pg_stat_statements_reset() (5 rows) -DROP FUNCTION PLUS_ONE(INTEGER); +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; +SELECT PLUS_THREE(8); + plus_three +------------ + 11 +(1 row) + +SELECT PLUS_THREE(10); + plus_three +------------ + 13 +(1 row) + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; + toplevel | calls | rows | query +----------+-------+------+------------------------------------------------------------------------------ + f | 2 | 2 | SELECT (i + $2 + $3)::INTEGER + f | 2 | 2 | SELECT (i + $2)::INTEGER LIMIT $3 + t | 2 | 2 | SELECT PLUS_ONE($1) + t | 2 | 2 | SELECT PLUS_THREE($1) + t | 2 | 2 | SELECT PLUS_TWO($1) + t | 1 | 5 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C" + f | 2 | 2 | SELECT i + $2 LIMIT $3 + t | 1 | 1 | SELECT pg_stat_statements_reset() +(8 rows) + -- -- pg_stat_statements.track = none -- diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c index da5ef7d0b7..f1cb02e463 100644 --- a/contrib/pg_stat_statements/pg_stat_statements.c +++ b/contrib/pg_stat_statements/pg_stat_statements.c @@ -829,7 +829,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate) prev_post_parse_analyze_hook(pstate, query, jstate); /* Safety check... */ - if (!pgss || !pgss_hash || !pgss_enabled(exec_nested_level)) + if (!pgss || !pgss_hash || !pgss_enabled(plan_nested_level + exec_nested_level)) return; /* @@ -954,12 +954,26 @@ pgss_planner(Query *parse, } else { - if (prev_planner_hook) - result = prev_planner_hook(parse, query_string, cursorOptions, - boundParams); - else - result = standard_planner(parse, query_string, cursorOptions, - boundParams); + /* + * Even though we're not tracking plan time for this statement, we + * must still increment the nesting level, to ensure that functions + * evaluated during planning are not seen as top-level calls. + */ + plan_nested_level++; + PG_TRY(); + { + if (prev_planner_hook) + result = prev_planner_hook(parse, query_string, cursorOptions, + boundParams); + else + result = standard_planner(parse, query_string, cursorOptions, + boundParams); + } + PG_FINALLY(); + { + plan_nested_level--; + } + PG_END_TRY(); } return result; @@ -981,7 +995,7 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags) * counting of optimizable statements that are directly contained in * utility statements. */ - if (pgss_enabled(exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0)) + if (pgss_enabled(plan_nested_level + exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0)) { /* * Set up to track total elapsed time in ExecutorRun. Make sure the @@ -1051,7 +1065,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc) uint64 queryId = queryDesc->plannedstmt->queryId; if (queryId != UINT64CONST(0) && queryDesc->totaltime && - pgss_enabled(exec_nested_level)) + pgss_enabled(plan_nested_level + exec_nested_level)) { /* * Make sure stats accumulation is done. (Note: it's okay if several @@ -1092,7 +1106,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString, uint64 saved_queryId = pstmt->queryId; int saved_stmt_location = pstmt->stmt_location; int saved_stmt_len = pstmt->stmt_len; - bool enabled = pgss_track_utility && pgss_enabled(exec_nested_level); + bool enabled = pgss_track_utility && pgss_enabled(plan_nested_level + exec_nested_level); /* * Force utility statements to get queryId zero. We do this even in cases @@ -1296,7 +1310,7 @@ pgss_store(const char *query, uint64 queryId, key.userid = GetUserId(); key.dbid = MyDatabaseId; key.queryid = queryId; - key.toplevel = (exec_nested_level == 0); + key.toplevel = (plan_nested_level + exec_nested_level == 0); /* Lookup the hash table entry with shared lock. */ LWLockAcquire(pgss->lock, LW_SHARED); diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql index eab881a43b..dcd0b04358 100644 --- a/contrib/pg_stat_statements/sql/level_tracking.sql +++ b/contrib/pg_stat_statements/sql/level_tracking.sql @@ -90,6 +90,15 @@ SELECT PLUS_ONE(10); SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; + +SELECT PLUS_THREE(8); +SELECT PLUS_THREE(10); + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; + -- PL/pgSQL function - all-level tracking. SET pg_stat_statements.track = 'all'; SELECT pg_stat_statements_reset(); @@ -97,6 +106,7 @@ SELECT pg_stat_statements_reset(); -- we drop and recreate the functions to avoid any caching funnies DROP FUNCTION PLUS_ONE(INTEGER); DROP FUNCTION PLUS_TWO(INTEGER); +DROP FUNCTION PLUS_THREE(INTEGER); -- PL/pgSQL function CREATE FUNCTION PLUS_TWO(i INTEGER) RETURNS INTEGER AS $$ @@ -118,7 +128,15 @@ SELECT PLUS_ONE(3); SELECT PLUS_ONE(1); SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -DROP FUNCTION PLUS_ONE(INTEGER); + +-- immutable SQL function --- can be executed at plan time +CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS +$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL; + +SELECT PLUS_THREE(8); +SELECT PLUS_THREE(10); + +SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"; -- -- pg_stat_statements.track = none
Re:BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Sergei Kornilov
Date:
Hello Thannk you! I reviewed v3 patches. Looks good for me. I experimented with different options (track_planning, track top/all,track_utility, simple query and extended query protocols) and did not find any unexpected results in toplevel field. > BTW, I'm inclined not to back-patch this. While it seems like a bug fix, it's also a behavioral change that might breaksomeone's expectations. I'm fine with patches only on HEAD. regards, Sergei
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
From
Tom Lane
Date:
Sergei Kornilov <sk@zsrv.org> writes: > Thannk you! I reviewed v3 patches. Looks good for me. I experimented with different options (track_planning, track top/all,track_utility, simple query and extended query protocols) and did not find any unexpected results in toplevel field. Cool. I merged the counters into one as I mentioned earlier, and pushed it. regards, tom lane