Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Date
Msg-id 9684.1333043839@sss.pgh.pa.us
Whole thread Raw
In response to Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
List pgsql-hackers
I wrote:
> Hm ... I just had a different idea.  I need to go look at the code
> again, but I believe that in the problematic cases, the post-analyze
> hook does not compute a queryId for the optimizable statement.  This
> means that it will arrive at the executor with queryId zero.  What if
> we simply made the executor hooks do nothing when queryId is zero?
> (Note that this also means that in the problematic cases, the behavior
> is already pretty wrong because executor costs for *all* statements of
> this sort are getting merged into one hashtable entry for hash zero.)

The attached proposed patch does it that way.  It makes the EXPLAIN,
SELECT INTO, and DECLARE CURSOR cases behave as expected for utility
statements.  PREPARE/EXECUTE work a bit funny though: if you have
track = all then you get EXECUTE cycles reported against both the
EXECUTE statement and the underlying PREPARE.  This is because when
PREPARE calls parse_analyze_varparams the post-analyze hook doesn't know
that this isn't a top-level statement, so it marks the query with a
queryId.  I don't see any way around that part without something like
what I suggested before.  However, this behavior seems to me to be
considerably less of a POLA violation than the cases involving two
identical-looking entries for self-contained statements, and it might
even be thought to be a feature not a bug (since the PREPARE entry will
accumulate totals for all uses of the prepared statement).  So I'm
satisfied with it for now.

            regards, tom lane

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 137b242..bebfff1 100644
*** a/contrib/pg_stat_statements/pg_stat_statements.c
--- b/contrib/pg_stat_statements/pg_stat_statements.c
*************** pgss_post_parse_analyze(ParseState *psta
*** 602,610 ****
      if (!pgss || !pgss_hash)
          return;

!     /* We do nothing with utility statements at this stage */
      if (query->utilityStmt)
          return;

      /* Set up workspace for query jumbling */
      jstate.jumble = (unsigned char *) palloc(JUMBLE_SIZE);
--- 602,620 ----
      if (!pgss || !pgss_hash)
          return;

!     /*
!      * Utility statements get queryId zero.  We do this even in cases where
!      * the statement contains an optimizable statement for which a queryId
!      * could be derived (such as EXPLAIN or DECLARE CURSOR).  For such cases,
!      * runtime control will first go through ProcessUtility and then the
!      * executor, and we don't want the executor hooks to do anything, since
!      * we are already measuring the statement's costs at the utility level.
!      */
      if (query->utilityStmt)
+     {
+         query->queryId = 0;
          return;
+     }

      /* Set up workspace for query jumbling */
      jstate.jumble = (unsigned char *) palloc(JUMBLE_SIZE);
*************** pgss_post_parse_analyze(ParseState *psta
*** 619,624 ****
--- 629,641 ----
      query->queryId = hash_any(jstate.jumble, jstate.jumble_len);

      /*
+      * If we are unlucky enough to get a hash of zero, use 1 instead, to
+      * prevent confusion with the utility-statement case.
+      */
+     if (query->queryId == 0)
+         query->queryId = 1;
+
+     /*
       * If we were able to identify any ignorable constants, we immediately
       * create a hash table entry for the query, so that we can record the
       * normalized form of the query string.  If there were no such constants,
*************** pgss_ExecutorStart(QueryDesc *queryDesc,
*** 649,655 ****
      else
          standard_ExecutorStart(queryDesc, eflags);

!     if (pgss_enabled())
      {
          /*
           * Set up to track total elapsed time in ExecutorRun.  Make sure the
--- 666,677 ----
      else
          standard_ExecutorStart(queryDesc, eflags);

!     /*
!      * If query has queryId zero, don't track it.  This prevents double
!      * counting of optimizable statements that are directly contained in
!      * utility statements.
!      */
!     if (pgss_enabled() && queryDesc->plannedstmt->queryId != 0)
      {
          /*
           * Set up to track total elapsed time in ExecutorRun.  Make sure the
*************** pgss_ExecutorFinish(QueryDesc *queryDesc
*** 719,731 ****
  static void
  pgss_ExecutorEnd(QueryDesc *queryDesc)
  {
!     if (queryDesc->totaltime && pgss_enabled())
!     {
!         uint32        queryId;
!
!         /* Query's ID should have been filled in by post-analyze hook */
!         queryId = queryDesc->plannedstmt->queryId;

          /*
           * Make sure stats accumulation is done.  (Note: it's okay if several
           * levels of hook all do this.)
--- 741,750 ----
  static void
  pgss_ExecutorEnd(QueryDesc *queryDesc)
  {
!     uint32        queryId = queryDesc->plannedstmt->queryId;

+     if (queryId != 0 && queryDesc->totaltime && pgss_enabled())
+     {
          /*
           * Make sure stats accumulation is done.  (Note: it's okay if several
           * levels of hook all do this.)

pgsql-hackers by date:

Previous
From: "David E. Wheeler"
Date:
Subject: Re: Finer Extension dependencies
Next
From: Robert Haas
Date:
Subject: Re: Finer Extension dependencies