Thread: In what range of the code can we read debug_query_string?

In what range of the code can we read debug_query_string?

From
Kyotaro HORIGUCHI
Date:
Hello.

While looking into *some* tool that heavily reliant on
debug_query_string, I found that the tool can miss it.  However I
believed that it is always set when post_parse_analyze_hook is
called, it is not while processing DESCRIBE message of extended
protocol. I believe that the assumption is not so ridiculous.

The call stack is as the follows.

parse_analze()
pg_analyze_and_rewrite()
RevalidateCachedQuery()
CachedPlanGetTargetList()
exec_describe_statement_message()
PostgreMain() (message 'DS')

It is set for other kinds of message, (parse, bind, execute). I
think fastpath, close, flush and sync don't need that. If it is
reasonable to assume that we can see debug_query_string in the
DESCRIBE path, the attached patch would work.

The exec_describe_statement_message() case seems rather simple
but the exec_describe_portal_message() case is troublesome.

But, in the first place, the biggest problem is the fact that I
myself haven't been able to run the path...

Any suggestions, thoughts, opinions are welcome.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***************
*** 2399,2407 **** exec_describe_statement_message(const char *stmt_name)
--- 2399,2414 ----
      {
          List       *tlist;
  
+         /*
+          * Report query to various monitoring facilities.
+          */
+         debug_query_string = psrc->query_string;
+ 
          /* Get the plan's primary targetlist */
          tlist = CachedPlanGetTargetList(psrc, NULL);
  
+         debug_query_string = NULL;
+ 
          SendRowDescriptionMessage(&row_description_buf,
                                    psrc->resultDesc,
                                    tlist,
*** a/src/backend/utils/cache/plancache.c
--- b/src/backend/utils/cache/plancache.c
***************
*** 1429,1434 **** CachedPlanGetTargetList(CachedPlanSource *plansource,
--- 1429,1436 ----
                          QueryEnvironment *queryEnv)
  {
      Query       *pstmt;
+     bool        reset_debug_query_string = false;
+     List       *ret;
  
      /* Assert caller is doing things in a sane order */
      Assert(plansource->magic == CACHEDPLANSOURCE_MAGIC);
***************
*** 1441,1453 **** CachedPlanGetTargetList(CachedPlanSource *plansource,
      if (plansource->resultDesc == NULL)
          return NIL;
  
      /* Make sure the querytree list is valid and we have parse-time locks */
      RevalidateCachedQuery(plansource, queryEnv);
  
      /* Get the primary statement and find out what it returns */
      pstmt = QueryListGetPrimaryStmt(plansource->query_list);
  
!     return FetchStatementTargetList((Node *) pstmt);
  }
  
  /*
--- 1443,1470 ----
      if (plansource->resultDesc == NULL)
          return NIL;
  
+     
+     /*
+      * Report query to various monitoring facilities if we haven't done.
+      */
+     if (!debug_query_string && plansource->query_string)
+     {
+         debug_query_string = plansource->query_string;
+         reset_debug_query_string = true;
+     }
+ 
      /* Make sure the querytree list is valid and we have parse-time locks */
      RevalidateCachedQuery(plansource, queryEnv);
  
      /* Get the primary statement and find out what it returns */
      pstmt = QueryListGetPrimaryStmt(plansource->query_list);
  
!     ret = FetchStatementTargetList((Node *) pstmt);
! 
!     if (reset_debug_query_string)
!         debug_query_string = NULL;
! 
!     return ret;
  }
  
  /*

Re: In what range of the code can we read debug_query_string?

From
Tom Lane
Date:
Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> It is set for other kinds of message, (parse, bind, execute). I
> think fastpath, close, flush and sync don't need that. If it is
> reasonable to assume that we can see debug_query_string in the
> DESCRIBE path, the attached patch would work.

I think this patch is a bad idea.  In the first place, debug_query_string
is generally understood as the current *client* command string, not
something internally generated.  In the second place, it looks way too
easy for this to leave debug_query_string as a dangling pointer,
ie pointing at a dropped plancache entry.

There might be a case for providing some way to get at the current
actively-executing query's string, using a protocol that can deal
with nesting of execution.  (This might already be more or less
possible via ActivePortal->sourceText, depending on what you think
the semantics ought to be.)  But debug_query_string was never
intended to do that.

            regards, tom lane


Re: In what range of the code can we read debug_query_string?

From
Kyotaro HORIGUCHI
Date:
Thank you for the comment.

At Fri, 25 May 2018 10:08:08 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in <3389.1527257288@sss.pgh.pa.us>
> Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> writes:
> > It is set for other kinds of message, (parse, bind, execute). I
> > think fastpath, close, flush and sync don't need that. If it is
> > reasonable to assume that we can see debug_query_string in the
> > DESCRIBE path, the attached patch would work.
> 
> I think this patch is a bad idea.

I agree on the CachedPlanGetTargetList (Describe-Portal) case.

>                                    In the first place, debug_query_string
> is generally understood as the current *client* command string, not
> something internally generated.

Ture. It is commented on the variable, and I see bind and execute
restore the prepared statement (string) to the variable while
processing on the other hand. So my core question here is whether
the same reasoning is applicable on describe_statement. Execute
returns the result of a statement, and describe returns the
input/output description for the same statement these are
different only in what they returns for the same query.

>                                 In the second place, it looks way too
> easy for this to leave debug_query_string as a dangling pointer,
> ie pointing at a dropped plancache entry.

exec_bind_message uses CachedPlanSource.plansource for
debug_query_string. If the describe_statement case is problematic
in the way, is exec also problematic in the same way?

> There might be a case for providing some way to get at the current
> actively-executing query's string, using a protocol that can deal
> with nesting of execution.  (This might already be more or less
> possible via ActivePortal->sourceText, depending on what you think
> the semantics ought to be.)  But debug_query_string was never
> intended to do that.

I'm not sure of for what purpose the describe_statement is used,
but exec_execute_messages seems doing exactly that.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center