Home > mailing lists
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data - Mailing list pgsql-bugs

From	Tom Lane
Subject	Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Date	November 1, 2023 17:15:27
Msg-id	2831679.1698858927@sss.pgh.pa.us Whole thread Raw
In response to	Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses	Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
List	pgsql-bugs
Tree view
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
> I took a look, and the patch seems correct to me. It'll need better
> comments explaining the change, but I'll take care of that. Barring
> objections, I'll get this committed and backpatched.

Since Tomas hadn't done anything with this, I picked it up with
intent to commit, and got as far as revising the comments and
eliminating some duplicative condition checks (see attached).

However, I then started to wonder whether it's really a good idea
that the code treats PREPARE and EXECUTE alike for this purpose.
In the case of EXECUTE, the idea is that we'll still be at top
level when we reach the executor hooks, and they will do the right
thing and then increment exec_nested_level before any lower-level
statement can be reached.  But how does that concept apply to
PREPARE, which won't reach the executor?  ISTM the net result
is that if any subsidiary statements are reached during PREPARE
(perhaps via const-simplification of some SQL or PL function),
we will erroneously treat them as top-level.

It's worth noting that the comments here explicitly mention that
suppressing the nesting level bump is desired for EXECUTE, while
*not* saying that for PREPARE.  I've not dug in the commit history,
but I wonder if this isn't a bug somebody introduced during
careless refactoring to invent the PGSS_HANDLED_UTILITY macro.
It's not clear to me that that macro is anything but an
invitation to sloppy thinking.

In short, I think we probably want to bump the nest level
for everything except EXECUTE, and I'm not sure that
PGSS_HANDLED_UTILITY is worth keeping.

Thoughts?

BTW, I'm inclined not to back-patch this.  While it seems like
a bug fix, it's also a behavioral change that might break
someone's expectations.

            regards, tom lane

diff --git a/contrib/pg_stat_statements/expected/level_tracking.out
b/contrib/pg_stat_statements/expected/level_tracking.out
index d924c87b41..e7b29e9c99 100644
--- a/contrib/pg_stat_statements/expected/level_tracking.out
+++ b/contrib/pg_stat_statements/expected/level_tracking.out
@@ -67,6 +67,61 @@ SELECT toplevel, calls, query FROM pg_stat_statements
  t        |     1 | SET pg_stat_statements.track = 'all'
 (7 rows)

+-- DO block - top-level tracking without utility.
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+ toplevel | calls |               query
+----------+-------+-----------------------------------
+ t        |     1 | DELETE FROM stats_track_tab
+ t        |     1 | SELECT pg_stat_statements_reset()
+(2 rows)
+
+-- DO block - all-level tracking without utility.
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+ pg_stat_statements_reset
+--------------------------
+
+(1 row)
+
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+ toplevel | calls |               query
+----------+-------+-----------------------------------
+ f        |     1 | DELETE FROM stats_track_tab
+ t        |     1 | DELETE FROM stats_track_tab
+ f        |     1 | SELECT $1::TEXT
+ t        |     1 | SELECT pg_stat_statements_reset()
+(4 rows)
+
 -- PL/pgSQL function - top-level tracking.
 SET pg_stat_statements.track = 'top';
 SET pg_stat_statements.track_utility = FALSE;
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 767b6ceb10..794c780f24 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -100,11 +100,22 @@ static const uint32 PGSS_PG_MAJOR_VERSION = PG_VERSION_NUM / 100;
 #define IS_STICKY(c)    ((c.calls[PGSS_PLAN] + c.calls[PGSS_EXEC]) == 0)

 /*
- * Utility statements that pgss_ProcessUtility and pgss_post_parse_analyze
- * ignores.
+ * Test whether a utility statement is one that we should handle in
+ * pgss_ProcessUtility and pgss_post_parse_analyze.
+ *
+ * If it's an EXECUTE statement, we don't track it and don't increment the
+ * nesting level.  This allows the cycles to be charged to the underlying
+ * PREPARE instead (by the Executor hooks), which is much more useful.
+ *
+ * We likewise don't track execution of PREPARE.  If we did, we would get one
+ * hash table entry for the PREPARE (with hash calculated from the query
+ * string), and then a different one with the same query string (but hash
+ * calculated from the query tree) would be used to accumulate costs of
+ * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
+ * cases where planning time is not included at all.
  */
 #define PGSS_HANDLED_UTILITY(n)        (!IsA(n, ExecuteStmt) && \
-                                    !IsA(n, PrepareStmt))
+                                     !IsA(n, PrepareStmt))

 /*
  * Extension version number, for supporting older extension versions' objects
@@ -1097,6 +1108,8 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     uint64        saved_queryId = pstmt->queryId;
     int            saved_stmt_location = pstmt->stmt_location;
     int            saved_stmt_len = pstmt->stmt_len;
+    bool        enabled = pgss_track_utility && pgss_enabled(exec_nested_level);
+    bool        handled = PGSS_HANDLED_UTILITY(parsetree);

     /*
      * Force utility statements to get queryId zero.  We do this even in cases
@@ -1112,23 +1125,15 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
      * that user configured another extension to handle utility statements
      * only.
      */
-    if (pgss_enabled(exec_nested_level) && pgss_track_utility)
+    if (enabled)
         pstmt->queryId = UINT64CONST(0);

     /*
-     * If it's an EXECUTE statement, we don't track it and don't increment the
-     * nesting level.  This allows the cycles to be charged to the underlying
-     * PREPARE instead (by the Executor hooks), which is much more useful.
-     *
-     * We also don't track execution of PREPARE.  If we did, we would get one
-     * hash table entry for the PREPARE (with hash calculated from the query
-     * string), and then a different one with the same query string (but hash
-     * calculated from the query tree) would be used to accumulate costs of
-     * ensuing EXECUTEs.  This would be confusing, and inconsistent with other
-     * cases where planning time is not included at all.
+     * If we're enabled and configured to track utility statements, and the
+     * statement isn't one that we choose not to handle, then measure its
+     * execution time.
      */
-    if (pgss_track_utility && pgss_enabled(exec_nested_level) &&
-        PGSS_HANDLED_UTILITY(parsetree))
+    if (enabled && handled)
     {
         instr_time    start;
         instr_time    duration;
@@ -1206,14 +1211,29 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
     }
     else
     {
-        if (prev_ProcessUtility)
-            prev_ProcessUtility(pstmt, queryString, readOnlyTree,
-                                context, params, queryEnv,
-                                dest, qc);
-        else
-            standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+        /*
+         * We're not tracking execution time; but unless the statement is one
+         * we ignore altogether, we still need to increment the nesting level.
+         */
+        if (handled)
+            exec_nested_level++;
+        PG_TRY();
+        {
+            if (prev_ProcessUtility)
+                prev_ProcessUtility(pstmt, queryString, readOnlyTree,
                                     context, params, queryEnv,
                                     dest, qc);
+            else
+                standard_ProcessUtility(pstmt, queryString, readOnlyTree,
+                                        context, params, queryEnv,
+                                        dest, qc);
+        }
+        PG_FINALLY();
+        {
+            if (handled)
+                exec_nested_level--;
+        }
+        PG_END_TRY();
     }
 }

diff --git a/contrib/pg_stat_statements/sql/level_tracking.sql b/contrib/pg_stat_statements/sql/level_tracking.sql
index 0c20b8ce69..eab881a43b 100644
--- a/contrib/pg_stat_statements/sql/level_tracking.sql
+++ b/contrib/pg_stat_statements/sql/level_tracking.sql
@@ -33,6 +33,39 @@ END; $$;
 SELECT toplevel, calls, query FROM pg_stat_statements
   ORDER BY query COLLATE "C", toplevel;

+-- DO block - top-level tracking without utility.
+SET pg_stat_statements.track = 'top';
+SET pg_stat_statements.track_utility = FALSE;
+SELECT pg_stat_statements_reset();
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+
+-- DO block - all-level tracking without utility.
+SET pg_stat_statements.track = 'all';
+SELECT pg_stat_statements_reset();
+DELETE FROM stats_track_tab;
+DO $$
+BEGIN
+  DELETE FROM stats_track_tab;
+END; $$;
+DO LANGUAGE plpgsql $$
+BEGIN
+  -- this is a SELECT
+  PERFORM 'hello world'::TEXT;
+END; $$;
+SELECT toplevel, calls, query FROM pg_stat_statements
+  ORDER BY query COLLATE "C", toplevel;
+
 -- PL/pgSQL function - top-level tracking.
 SET pg_stat_statements.track = 'top';
 SET pg_stat_statements.track_utility = FALSE;
pgsql-bugs by date:
From: Alexander Korotkov
Date: 01 November 2023, 10:29:12
Subject: Re: BUG #18170: Unexpected error: no relation entry for relid 3
From: Tom Lane
Date: 01 November 2023, 21:20:21
Subject: Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data
Re: BUG #17552: pg_stat_statements tracks internal FK check queries when COPY used to load data - Mailing list pgsql-bugs

Previous

Next