Thread: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

PG Bug reporting form

Date:

17 July 2022, 19:23:36

The following bug has been logged on the website:

Bug reference:      17552
Logged by:          Maxim Boguk
Email address:      maxim.boguk@gmail.com
PostgreSQL version: 14.4
Operating system:   Linux
Description:

pg_stat_statements track internal FK trigger check queries during data load
via COPY
( pg_stat_statements.track=top (default) and
pg_stat_statements.track_utility=off (not default) )
Tested on latest 13 and 14 versions.

Test script (test.sql):

drop table if exists t1 CASCADE;
drop table if exists t2;

create table t1 (id integer primary key);
insert into t1 values (1);
vacuum ANALYZE t1;

create table t2 (id integer primary key, fk integer references t1(id));

create extension  IF NOT EXISTS  pg_stat_statements;
select  e.extname AS "Name", e.extversion AS "Version" FROM
pg_catalog.pg_extension e  where e.extname='pg_stat_statements';
show pg_stat_statements.track;
show pg_stat_statements.track_utility;

select pg_stat_statements_reset();

COPY public.t2 (id, fk) FROM stdin;
1    1
2    1
3    1
4    1
5    1
6    1
7    1
8    1
9    1
10    1
11    1
12    1
13    1
14    1
15    1
16    1
17    1
18    1
19    1
20    1
21    1
22    1
23    1
24    1
25    1
26    1
27    1
28    1
29    1
30    1
31    1
32    1
33    1
34    1
35    1
36    1
37    1
38    1
39    1
40    1
41    1
42    1
43    1
44    1
45    1
46    1
47    1
48    1
49    1
50    1
51    1
52    1
53    1
54    1
55    1
56    1
57    1
58    1
59    1
60    1
61    1
62    1
63    1
64    1
65    1
66    1
67    1
68    1
69    1
70    1
71    1
72    1
73    1
74    1
75    1
76    1
77    1
78    1
79    1
80    1
81    1
82    1
83    1
84    1
85    1
86    1
87    1
88    1
89    1
90    1
91    1
92    1
93    1
94    1
95    1
96    1
97    1
98    1
99    1
100    1
\.

select query, calls from public.pg_stat_statements where dbid=(select oid
from pg_database where datname=current_database()) order by calls desc limit
2;

run:
psql -e -f test.sql -d sometestdb

Final output rows:
select query, calls from public.pg_stat_statements where dbid=(select oid
from pg_database where datname=current_database()) order by calls desc limit
2;
                                            query
                | calls 
---------------------------------------------------------------------------------------------+-------
 SELECT $2 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1
FOR KEY SHARE OF x |   100
 select pg_stat_statements_reset()
                |     1
(2 rows)


Expected output:
                                            query
                | calls 
---------------------------------------------------------------------------------------------+-------
 select pg_stat_statements_reset()
                |     1
(1 rows)


PS: with pg_stat_statements.track_utility=on test work as expected and
provides expected
select query, calls from public.pg_stat_statements where dbid=(select oid
from pg_database where datname=current_database()) order by calls desc limit
2;
               query                | calls 
------------------------------------+-------
 COPY public.t2 (id, fk) FROM stdin |     1
 select pg_stat_statements_reset()  |     1



Regards,
Maxim

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Julien Rouhaud

Date:

18 July 2022, 10:19:55

Hi,

On Sun, Jul 17, 2022 at 07:23:36PM +0000, PG Bug reporting form wrote:
>
> Bug reference:      17552
> Logged by:          Maxim Boguk
> Email address:      maxim.boguk@gmail.com
> PostgreSQL version: 14.4
> Operating system:   Linux
> Description:
>
> pg_stat_statements track internal FK trigger check queries during data load
> via COPY
> ( pg_stat_statements.track=top (default) and
> pg_stat_statements.track_utility=off (not default) )
> Tested on latest 13 and 14 versions.

Reproduced on current master too.

> Test script (test.sql):
> [...]
> select query, calls from public.pg_stat_statements where dbid=(select oid
> from pg_database where datname=current_database()) order by calls desc limit
> 2;
>
> run:
> psql -e -f test.sql -d sometestdb
>
> Final output rows:
> select query, calls from public.pg_stat_statements where dbid=(select oid
> from pg_database where datname=current_database()) order by calls desc limit
> 2;
>                                             query
>                 | calls
> ---------------------------------------------------------------------------------------------+-------
>  SELECT $2 FROM ONLY "public"."t1" x WHERE "id" OPERATOR(pg_catalog.=) $1
> FOR KEY SHARE OF x |   100
>  select pg_stat_statements_reset()
>                 |     1
> (2 rows)
>
>
> Expected output:
>                                             query
>                 | calls
> ---------------------------------------------------------------------------------------------+-------
>  select pg_stat_statements_reset()
>                 |     1
> (1 rows)

I agree with you that the current behavior seems wrong.  If we ignore a utility
statement, it doesn't mean that any standard query executed underneath should
become a top-level statement as it's the case now.

I see that there's an existing test that actually relies on that behavior,
which is an anonymous block code like:

DO LANGUAGE plpgsql $$
BEGIN
  -- this is a SELECT
  PERFORM 'hello world'::TEXT;
END;
$$;

which is expected to reports a "SELECT $1::text" with pg_stat_statements.track
= top and track_utility = off, and also doesn't seem right.

That test was added in 83f2061dd03, apparently covered by "Also, significantly
expand pg_stat_statements' regression test script", but without any discussion
on that specific behavior as far as I can see after a quick look.

Assuming that we want to fix it, simply increasing the exec_nested_level in the
fallback calls to prev_ProcessUtility/standard_ProcessUtility for handled
utility commands should get the correct behavior.

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Sergei Kornilov

Date:

22 March 2023, 12:21:07

Hello

I have a similar problem with exactly the same reason. Queries are counted as top level, although they are not called
atthe top level. Another small example:
 

set pg_stat_statements.track_utility to off;
set pg_stat_statements.track to 'top';
do $$ declare i int; begin select 1 as n into i; end $$ language plpgsql;
select toplevel, query from pg_stat_statements order by query;
 toplevel |               query               
----------+-----------------------------------
 t        | select $1 as n
 t        | select pg_stat_statements_reset()
(2 rows)

But "select $1 as n" was not a top-level query.

In the first patch, a testcase showing the behavior now. The second patch adds a nesting level increment if
track_utilityis disabled. I need to duplicate the PGSS_HANDLED_UTILITY check here because it's documented above:
 

> If it's an EXECUTE statement, we don't track it and don't increment the nesting level.

regards, Sergei

Attachment

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Tomas Vondra

Date:

16 July 2023, 15:19:49

On 3/22/23 13:21, Sergei Kornilov wrote:
> Hello
> 
> I have a similar problem with exactly the same reason. Queries are counted as top level, although they are not called
atthe top level. Another small example:
 
> 
> set pg_stat_statements.track_utility to off;
> set pg_stat_statements.track to 'top';
> do $$ declare i int; begin select 1 as n into i; end $$ language plpgsql;
> select toplevel, query from pg_stat_statements order by query;
>  toplevel |               query               
> ----------+-----------------------------------
>  t        | select $1 as n
>  t        | select pg_stat_statements_reset()
> (2 rows)
> 
> But "select $1 as n" was not a top-level query.
> 
> In the first patch, a testcase showing the behavior now. The second patch adds a nesting level increment if
track_utilityis disabled. I need to duplicate the PGSS_HANDLED_UTILITY check here because it's documented above:
 
> 
>> If it's an EXECUTE statement, we don't track it and don't increment the nesting level.
> 

I took a look, and the patch seems correct to me. It'll need better
comments explaining the change, but I'll take care of that. Barring
objections, I'll get this committed and backpatched.

FWIW it's probably a good idea to post patches on pgsql-hackers - not
everyone watches pgsql-bugs and/or CF app. Furthermore, when a patch is
switched to "RFC" it's customary to mention that in the thread, along
with a review. Otherwise these silent changes are quite puzzling.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Sergei Kornilov

Date:

18 July 2023, 16:19:25

Hello

> I took a look, and the patch seems correct to me. It'll need better
> comments explaining the change, but I'll take care of that. Barring
> objections, I'll get this committed and backpatched.

Great, thank you!

> FWIW it's probably a good idea to post patches on pgsql-hackers - not
> everyone watches pgsql-bugs and/or CF app. Furthermore, when a patch is
> switched to "RFC" it's customary to mention that in the thread, along
> with a review. Otherwise these silent changes are quite puzzling.

ok, I'll keep that in mind.
yeah, RFC status was a surprise.

regards, Sergei

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Tom Lane

Date:

01 November 2023, 17:15:27

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> I took a look, and the patch seems correct to me. It'll need better
> comments explaining the change, but I'll take care of that. Barring
> objections, I'll get this committed and backpatched.

Since Tomas hadn't done anything with this, I picked it up with
intent to commit, and got as far as revising the comments and
eliminating some duplicative condition checks (see attached).

However, I then started to wonder whether it's really a good idea
that the code treats PREPARE and EXECUTE alike for this purpose.
In the case of EXECUTE, the idea is that we'll still be at top
level when we reach the executor hooks, and they will do the right
thing and then increment exec_nested_level before any lower-level
statement can be reached.  But how does that concept apply to
PREPARE, which won't reach the executor?  ISTM the net result
is that if any subsidiary statements are reached during PREPARE
(perhaps via const-simplification of some SQL or PL function),
we will erroneously treat them as top-level.

It's worth noting that the comments here explicitly mention that
suppressing the nesting level bump is desired for EXECUTE, while
*not* saying that for PREPARE.  I've not dug in the commit history,
but I wonder if this isn't a bug somebody introduced during
careless refactoring to invent the PGSS_HANDLED_UTILITY macro.
It's not clear to me that that macro is anything but an
invitation to sloppy thinking.

In short, I think we probably want to bump the nest level
for everything except EXECUTE, and I'm not sure that
PGSS_HANDLED_UTILITY is worth keeping.

Thoughts?

BTW, I'm inclined not to back-patch this.  While it seems like
a bug fix, it's also a behavioral change that might break
someone's expectations.

            regards, tom lane

diff --git a/contrib/pg_stat_statements/expected/level_tracking.out
b/contrib/pg_stat_statements/expected/level_tracking.out
index d924c87b41..e7b29e9c99 100644
--- a/contrib/pg_stat_statements/expected/level_tracking.out
+++ b/contrib/pg_stat_statements/expected/level_tracking.out
@@ -67,6 +67,61 @@ SELECT toplevel, calls, query FROM pg_stat_statements
  t        |     1 | SET pg_stat_statements.track = 'all'
 (7 rows)

+-- DO block - top-level tracking without utility.
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+ toplevel | calls |               query
+----------+-------+-----------------------------------
+ t        |     1 | DELETE FROM stats_track_tab
+ t        |     1 | SELECT pg_stat_statements_reset()
+(2 rows)
+
+-- DO block - all-level tracking without utility.
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+ toplevel | calls |               query
+----------+-------+-----------------------------------
+ f        |     1 | DELETE FROM stats_track_tab
+ t        |     1 | DELETE FROM stats_track_tab
+ f        |     1 | SELECT $1::TEXT
+ t        |     1 | SELECT pg_stat_statements_reset()
+(4 rows)
+
 -- PL/pgSQL function - top-level tracking.
 SET pg_stat_statements.track = 'top';
 SET pg_stat_statements.track_utility = FALSE;
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 767b6ceb10..794c780f24 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -100,11 +100,22 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define IS_STICKY(c)    ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)

 /*
- * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze
- * ignores.
+ * Test whether a utility statement is one that we should handle in
+ * pgss_ProcessUtility and pgss_post_parse_analyze.
+ *
+ * If it's an EXECUTE statement, we don't track it and don't increment the
+ * nesting level.  This allows the cycles to be charged to the underlying
+ * PREPARE instead (by the Executor hooks), which is much more useful.
+ *
+ * We likewise don't track execution of PREPARE.  If we did, we would get one
+ * hash table entry for the PREPARE (with hash calculated from the query
+ * string), and then a different one with the same query string (but hash
+ * calculated from the query tree) would be used to accumulate costs of
+ * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
+ * cases where planning time is not included at all.
  */
 #define PGSS_HANDLED_UTILITY(n)        (!IsA(n, ExecuteStmt) && \
-                                    !IsA(n, PrepareStmt))
+                                     !IsA(n, PrepareStmt))

 /*
  * Extension version number, for supporting older extension versions' objects
@@ -1097,6 +1108,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     uint64        saved_queryId = pstmt->queryId;
     int            saved_stmt_location = pstmt->stmt_location;
     int            saved_stmt_len = pstmt->stmt_len;
+    bool        enabled = pgss_track_utility && pgss_enabled(exec_nested_level);
+    bool        handled = PGSS_HANDLED_UTILITY(parsetree);

     /*
      * Force utility statements to get queryId zero.  We do this even in cases
@@ -1112,23 +1125,15 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
      * that user configured another extension to handle utility statements
      * only.
      */
-    if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+    if (enabled)
         pstmt->queryId = UINT64CONST(0);

     /*
-     * If it's an EXECUTE statement, we don't track it and don't increment the
-     * nesting level.  This allows the cycles to be charged to the underlying
-     * PREPARE instead (by the Executor hooks), which is much more useful.
-     *
-     * We also don't track execution of PREPARE.  If we did, we would get one
-     * hash table entry for the PREPARE (with hash calculated from the query
-     * string), and then a different one with the same query string (but hash
-     * calculated from the query tree) would be used to accumulate costs of
-     * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
-     * cases where planning time is not included at all.
+     * If we're enabled and configured to track utility statements, and the
+     * statement isn't one that we choose not to handle, then measure its
+     * execution time.
      */
-    if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
-        PGSS_HANDLED_UTILITY(parsetree))
+    if (enabled && handled)
     {
         instr_time    start;
         instr_time    duration;
@@ -1206,14 +1211,29 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     }
     else
     {
-        if (prev_ProcessUtility)
-            prev_ProcessUtility(pstmt, queryString, readOnlyTree,
-                                context, params, queryEnv,
-                                dest, qc);
-        else
-            standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+        /*
+         * We're not tracking execution time; but unless the statement is one
+         * we ignore altogether, we still need to increment the nesting level.
+         */
+        if (handled)
+            exec_nested_level++;
+        PG_TRY();
+        {
+            if (prev_ProcessUtility)
+                prev_ProcessUtility(pstmt, queryString, readOnlyTree,
                                     context, params, queryEnv,
                                     dest, qc);
+            else
+                standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+                                        context, params, queryEnv,
+                                        dest, qc);
+        }
+        PG_FINALLY();
+        {
+            if (handled)
+                exec_nested_level--;
+        }
+        PG_END_TRY();
     }
 }

diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql
index 0c20b8ce69..eab881a43b 100644
--- a/contrib/pg_stat_statements/sql/level_tracking.sql
+++ b/contrib/pg_stat_statements/sql/level_tracking.sql
@@ -33,6 +33,39 @@ END; $$;
 SELECT toplevel, calls, query FROM pg_stat_statements
   ORDER BY query COLLATE "C", toplevel;

+-- DO block - top-level tracking without utility.
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+
+-- DO block - all-level tracking without utility.
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+
 -- PL/pgSQL function - top-level tracking.
 SET pg_stat_statements.track = 'top';
 SET pg_stat_statements.track_utility = FALSE;

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Tom Lane

Date:

01 November 2023, 21:20:21

I wrote:
> However, I then started to wonder whether it's really a good idea
> that the code treats PREPARE and EXECUTE alike for this purpose.
> In the case of EXECUTE, the idea is that we'll still be at top
> level when we reach the executor hooks, and they will do the right
> thing and then increment exec_nested_level before any lower-level
> statement can be reached.  But how does that concept apply to
> PREPARE, which won't reach the executor?  ISTM the net result
> is that if any subsidiary statements are reached during PREPARE
> (perhaps via const-simplification of some SQL or PL function),
> we will erroneously treat them as top-level.

Poking at that, I found that (a) PREPARE doesn't run the planner,
so probably nested statements are impossible, and (b) if we
remove the check so that it bumps the nesting level, then some
existing regression test outputs change.  PREPARE does run
parse analysis of the contained statement, and it seems that
that's sensitive to the is-top-level state.  So we do need to
keep the exclusion for PREPARE.  However, I'm still fairly down
on PGSS_HANDLED_UTILITY, because I think the rationale for
treating PREPARE and EXECUTE specially is different in each
place where that's being used, so tying them together is more
likely to cause future bugs than prevent bugs.

So v3-0001 attached revises the patch per those ideas.  It's only
cosmetically different from before, but I think the explanatory
comments are better.

Meanwhile, I realized that we have a second set of bugs.  As I
said above, it's possible to reach nested statements during
planning, as a consequence of const-simplification of an immutable
or stable function.  pg_stat_statements generally does the wrong
thing here, incorrectly treating such statements as top-level if
we weren't already nested.  This needs to be fixed by including
plan_nest_level as a reason to consider execution to be nested too.
What's more, testing this showed that pgss_planner has the same bug
as pgss_ProcessUtility: it needs to bump the nesting level even
if it chooses not to track planning time.

v3-0002 fixes this in a minimally invasive way by replacing each
check of exec_nested_level by plan_nested_level + exec_nested_level.
That's just for review/testing though.  What I think we actually
ought to do is fold plan_nested_level and exec_nested_level into
a single variable nested_level (or nesting_level would be better
English).  There's no reason to keep them separate; it just adds
cycles, complexity, and more risk of mistakes.

Comments?

            regards, tom lane

diff --git a/contrib/pg_stat_statements/expected/level_tracking.out
b/contrib/pg_stat_statements/expected/level_tracking.out
index d924c87b41..e7b29e9c99 100644
--- a/contrib/pg_stat_statements/expected/level_tracking.out
+++ b/contrib/pg_stat_statements/expected/level_tracking.out
@@ -67,6 +67,61 @@ SELECT toplevel, calls, query FROM pg_stat_statements
  t        |     1 | SET pg_stat_statements.track = 'all'
 (7 rows)

+-- DO block - top-level tracking without utility.
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+ toplevel | calls |               query
+----------+-------+-----------------------------------
+ t        |     1 | DELETE FROM stats_track_tab
+ t        |     1 | SELECT pg_stat_statements_reset()
+(2 rows)
+
+-- DO block - all-level tracking without utility.
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+ toplevel | calls |               query
+----------+-------+-----------------------------------
+ f        |     1 | DELETE FROM stats_track_tab
+ t        |     1 | DELETE FROM stats_track_tab
+ f        |     1 | SELECT $1::TEXT
+ t        |     1 | SELECT pg_stat_statements_reset()
+(4 rows)
+
 -- PL/pgSQL function - top-level tracking.
 SET pg_stat_statements.track = 'top';
 SET pg_stat_statements.track_utility = FALSE;
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 767b6ceb10..da5ef7d0b7 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -99,13 +99,6 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define USAGE_DEALLOC_PERCENT    5    /* free this % of entries at once */
 #define IS_STICKY(c)    ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)

-/*
- * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze
- * ignores.
- */
-#define PGSS_HANDLED_UTILITY(n)        (!IsA(n, ExecuteStmt) && \
-                                    !IsA(n, PrepareStmt))
-
 /*
  * Extension version number, for supporting older extension versions' objects
  */
@@ -840,12 +833,14 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
         return;

     /*
-     * Clear queryId for prepared statements related utility, as those will
-     * inherit from the underlying statement's one.
+     * If it's EXECUTE, clear the queryId so that stats will accumulate for
+     * the underlying PREPARE.  But don't do this if we're not tracking
+     * utility statements, to avoid messing up another extension that might be
+     * tracking them.
      */
     if (query->utilityStmt)
     {
-        if (pgss_track_utility && !PGSS_HANDLED_UTILITY(query->utilityStmt))
+        if (pgss_track_utility && IsA(query->utilityStmt, ExecuteStmt))
         {
             query->queryId = UINT64CONST(0);
             return;
@@ -1097,6 +1092,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     uint64        saved_queryId = pstmt->queryId;
     int            saved_stmt_location = pstmt->stmt_location;
     int            saved_stmt_len = pstmt->stmt_len;
+    bool        enabled = pgss_track_utility && pgss_enabled(exec_nested_level);

     /*
      * Force utility statements to get queryId zero.  We do this even in cases
@@ -1112,7 +1108,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
      * that user configured another extension to handle utility statements
      * only.
      */
-    if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+    if (enabled)
         pstmt->queryId = UINT64CONST(0);

     /*
@@ -1124,11 +1120,13 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
      * hash table entry for the PREPARE (with hash calculated from the query
      * string), and then a different one with the same query string (but hash
      * calculated from the query tree) would be used to accumulate costs of
-     * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
-     * cases where planning time is not included at all.
+     * ensuing EXECUTEs.  This would be confusing.  Since PREPARE doesn't
+     * actually run the planner (only parse+rewrite), its costs are generally
+     * pretty negligible and it seems okay to just ignore it.
      */
-    if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
-        PGSS_HANDLED_UTILITY(parsetree))
+    if (enabled &&
+        !IsA(parsetree, ExecuteStmt) &&
+        !IsA(parsetree, PrepareStmt))
     {
         instr_time    start;
         instr_time    duration;
@@ -1206,14 +1204,41 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     }
     else
     {
-        if (prev_ProcessUtility)
-            prev_ProcessUtility(pstmt, queryString, readOnlyTree,
-                                context, params, queryEnv,
-                                dest, qc);
-        else
-            standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+        /*
+         * Even though we're not tracking execution time for this statement,
+         * we must still increment the nesting level, to ensure that functions
+         * evaluated within it are not seen as top-level calls.  But don't do
+         * so for EXECUTE; that way, when control reaches pgss_planner or
+         * pgss_ExecutorStart, we will treat the costs as top-level if
+         * appropriate.  Likewise, don't bump for PREPARE, so that parse
+         * analysis will treat the statement as top-level if appropriate.
+         *
+         * To be absolutely certain we don't mess up the nesting level,
+         * evaluate the bump_level condition just once.
+         */
+        bool        bump_level =
+            !IsA(parsetree, ExecuteStmt) &&
+            !IsA(parsetree, PrepareStmt);
+
+        if (bump_level)
+            exec_nested_level++;
+        PG_TRY();
+        {
+            if (prev_ProcessUtility)
+                prev_ProcessUtility(pstmt, queryString, readOnlyTree,
                                     context, params, queryEnv,
                                     dest, qc);
+            else
+                standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+                                        context, params, queryEnv,
+                                        dest, qc);
+        }
+        PG_FINALLY();
+        {
+            if (bump_level)
+                exec_nested_level--;
+        }
+        PG_END_TRY();
     }
 }

diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql
index 0c20b8ce69..eab881a43b 100644
--- a/contrib/pg_stat_statements/sql/level_tracking.sql
+++ b/contrib/pg_stat_statements/sql/level_tracking.sql
@@ -33,6 +33,39 @@ END; $$;
 SELECT toplevel, calls, query FROM pg_stat_statements
   ORDER BY query COLLATE "C", toplevel;

+-- DO block - top-level tracking without utility.
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+
+-- DO block - all-level tracking without utility.
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+
 -- PL/pgSQL function - top-level tracking.
 SET pg_stat_statements.track = 'top';
 SET pg_stat_statements.track_utility = FALSE;
diff --git a/contrib/pg_stat_statements/expected/level_tracking.out
b/contrib/pg_stat_statements/expected/level_tracking.out
index e7b29e9c99..0b94c71c9c 100644
--- a/contrib/pg_stat_statements/expected/level_tracking.out
+++ b/contrib/pg_stat_statements/expected/level_tracking.out
@@ -173,6 +173,31 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
      1 |    1 | SELECT pg_stat_statements_reset()
 (3 rows)

+-- immutable SQL function --- can be executed at plan time
+CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS
+$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL;
+SELECT PLUS_THREE(8);
+ plus_three
+------------
+         11
+(1 row)
+
+SELECT PLUS_THREE(10);
+ plus_three
+------------
+         13
+(1 row)
+
+SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ toplevel | calls | rows |                                    query
+----------+-------+------+------------------------------------------------------------------------------
+ t        |     2 |    2 | SELECT PLUS_ONE($1)
+ t        |     2 |    2 | SELECT PLUS_THREE($1)
+ t        |     2 |    2 | SELECT PLUS_TWO($1)
+ t        |     1 |    3 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ t        |     1 |    1 | SELECT pg_stat_statements_reset()
+(5 rows)
+
 -- PL/pgSQL function - all-level tracking.
 SET pg_stat_statements.track = 'all';
 SELECT pg_stat_statements_reset();
@@ -184,6 +209,7 @@ SELECT pg_stat_statements_reset();
 -- we drop and recreate the functions to avoid any caching funnies
 DROP FUNCTION PLUS_ONE(INTEGER);
 DROP FUNCTION PLUS_TWO(INTEGER);
+DROP FUNCTION PLUS_THREE(INTEGER);
 -- PL/pgSQL function
 CREATE FUNCTION PLUS_TWO(i INTEGER) RETURNS INTEGER AS $$
 DECLARE
@@ -229,7 +255,34 @@ SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
      1 |    1 | SELECT pg_stat_statements_reset()
 (5 rows)

-DROP FUNCTION PLUS_ONE(INTEGER);
+-- immutable SQL function --- can be executed at plan time
+CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS
+$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL;
+SELECT PLUS_THREE(8);
+ plus_three
+------------
+         11
+(1 row)
+
+SELECT PLUS_THREE(10);
+ plus_three
+------------
+         13
+(1 row)
+
+SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+ toplevel | calls | rows |                                    query
+----------+-------+------+------------------------------------------------------------------------------
+ f        |     2 |    2 | SELECT (i + $2 + $3)::INTEGER
+ f        |     2 |    2 | SELECT (i + $2)::INTEGER LIMIT $3
+ t        |     2 |    2 | SELECT PLUS_ONE($1)
+ t        |     2 |    2 | SELECT PLUS_THREE($1)
+ t        |     2 |    2 | SELECT PLUS_TWO($1)
+ t        |     1 |    5 | SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C"
+ f        |     2 |    2 | SELECT i + $2 LIMIT $3
+ t        |     1 |    1 | SELECT pg_stat_statements_reset()
+(8 rows)
+
 --
 -- pg_stat_statements.track = none
 --
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index da5ef7d0b7..f1cb02e463 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -829,7 +829,7 @@ pgss_post_parse_analyze(ParseState *pstate, Query *query, JumbleState *jstate)
         prev_post_parse_analyze_hook(pstate, query, jstate);

     /* Safety check... */
-    if (!pgss || !pgss_hash || !pgss_enabled(exec_nested_level))
+    if (!pgss || !pgss_hash || !pgss_enabled(plan_nested_level + exec_nested_level))
         return;

     /*
@@ -954,12 +954,26 @@ pgss_planner(Query *parse,
     }
     else
     {
-        if (prev_planner_hook)
-            result = prev_planner_hook(parse, query_string, cursorOptions,
-                                       boundParams);
-        else
-            result = standard_planner(parse, query_string, cursorOptions,
-                                      boundParams);
+        /*
+         * Even though we're not tracking plan time for this statement, we
+         * must still increment the nesting level, to ensure that functions
+         * evaluated during planning are not seen as top-level calls.
+         */
+        plan_nested_level++;
+        PG_TRY();
+        {
+            if (prev_planner_hook)
+                result = prev_planner_hook(parse, query_string, cursorOptions,
+                                           boundParams);
+            else
+                result = standard_planner(parse, query_string, cursorOptions,
+                                          boundParams);
+        }
+        PG_FINALLY();
+        {
+            plan_nested_level--;
+        }
+        PG_END_TRY();
     }

     return result;
@@ -981,7 +995,7 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
      * counting of optimizable statements that are directly contained in
      * utility statements.
      */
-    if (pgss_enabled(exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0))
+    if (pgss_enabled(plan_nested_level + exec_nested_level) && queryDesc->plannedstmt->queryId != UINT64CONST(0))
     {
         /*
          * Set up to track total elapsed time in ExecutorRun.  Make sure the
@@ -1051,7 +1065,7 @@ pgss_ExecutorEnd(QueryDesc *queryDesc)
     uint64        queryId = queryDesc->plannedstmt->queryId;

     if (queryId != UINT64CONST(0) && queryDesc->totaltime &&
-        pgss_enabled(exec_nested_level))
+        pgss_enabled(plan_nested_level + exec_nested_level))
     {
         /*
          * Make sure stats accumulation is done.  (Note: it's okay if several
@@ -1092,7 +1106,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     uint64        saved_queryId = pstmt->queryId;
     int            saved_stmt_location = pstmt->stmt_location;
     int            saved_stmt_len = pstmt->stmt_len;
-    bool        enabled = pgss_track_utility && pgss_enabled(exec_nested_level);
+    bool        enabled = pgss_track_utility && pgss_enabled(plan_nested_level + exec_nested_level);

     /*
      * Force utility statements to get queryId zero.  We do this even in cases
@@ -1296,7 +1310,7 @@ pgss_store(const char *query, uint64 queryId,
     key.userid = GetUserId();
     key.dbid = MyDatabaseId;
     key.queryid = queryId;
-    key.toplevel = (exec_nested_level == 0);
+    key.toplevel = (plan_nested_level + exec_nested_level == 0);

     /* Lookup the hash table entry with shared lock. */
     LWLockAcquire(pgss->lock, LW_SHARED);
diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql
index eab881a43b..dcd0b04358 100644
--- a/contrib/pg_stat_statements/sql/level_tracking.sql
+++ b/contrib/pg_stat_statements/sql/level_tracking.sql
@@ -90,6 +90,15 @@ SELECT PLUS_ONE(10);

 SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";

+-- immutable SQL function --- can be executed at plan time
+CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS
+$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL;
+
+SELECT PLUS_THREE(8);
+SELECT PLUS_THREE(10);
+
+SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
+
 -- PL/pgSQL function - all-level tracking.
 SET pg_stat_statements.track = 'all';
 SELECT pg_stat_statements_reset();
@@ -97,6 +106,7 @@ SELECT pg_stat_statements_reset();
 -- we drop and recreate the functions to avoid any caching funnies
 DROP FUNCTION PLUS_ONE(INTEGER);
 DROP FUNCTION PLUS_TWO(INTEGER);
+DROP FUNCTION PLUS_THREE(INTEGER);

 -- PL/pgSQL function
 CREATE FUNCTION PLUS_TWO(i INTEGER) RETURNS INTEGER AS $$
@@ -118,7 +128,15 @@ SELECT PLUS_ONE(3);
 SELECT PLUS_ONE(1);

 SELECT calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";
-DROP FUNCTION PLUS_ONE(INTEGER);
+
+-- immutable SQL function --- can be executed at plan time
+CREATE FUNCTION PLUS_THREE(i INTEGER) RETURNS INTEGER AS
+$$ SELECT i + 3 LIMIT 1 $$ IMMUTABLE LANGUAGE SQL;
+
+SELECT PLUS_THREE(8);
+SELECT PLUS_THREE(10);
+
+SELECT toplevel, calls, rows, query FROM pg_stat_statements ORDER BY query COLLATE "C";

 --
 -- pg_stat_statements.track = none

Re:BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Sergei Kornilov

Date:

07 November 2023, 20:23:17

Hello

Thannk you! I reviewed v3 patches. Looks good for me. I experimented with different options (track_planning, track
top/all,track_utility, simple query and extended query protocols) and did not find any unexpected results in toplevel
field.

> BTW, I'm inclined not to back-patch this. While it seems like a bug fix, it's also a behavioral change that might
breaksomeone's expectations.
 

I'm fine with patches only on HEAD.

regards, Sergei

Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data

From

Tom Lane

Date:

08 November 2023, 17:02:59

Sergei Kornilov <sk@zsrv.org> writes:
> Thannk you! I reviewed v3 patches. Looks good for me. I experimented with different options (track_planning, track
top/all,track_utility, simple query and extended query protocols) and did not find any unexpected results in toplevel
field.

Cool.  I merged the counters into one as I mentioned earlier,
and pushed it.

            regards, tom lane