Re: RFC: Logging plan of the running query - Mailing list pgsql-hackers

From torikoshia
Subject Re: RFC: Logging plan of the running query
Date
Msg-id 3bb7bacbc5ef9e58e2ece0557adac98b@oss.nttdata.com
Whole thread Raw
In response to Re: RFC: Logging plan of the running query  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: RFC: Logging plan of the running query
Re: RFC: Logging plan of the running query
List pgsql-hackers
2022-02-01 01:51, Fujii Masao wrote:
Thanks for reviewing and suggestions!

>>    +        Note that there is the case where the request to log a 
>> query
>>    +        plan is skipped even while the target backend is running a
>>    +        query due to lock conflict avoidance.
>>    +        If this happens, users can just retry pg_log_query_plan().
> 
> This may cause users to misunderstand that pg_log_query_plan() fails
> while the target backend is holding *any* locks? Isn't it better to
> mention "page-level locks", instead? So how about the following?
> 
> --------------------------
> Note that the request to log the query plan will be ignored if it's
> received during a short period while the target backend is holding a
> page-level lock.
> --------------------------

Agreed.

>>> +            ereport(LOG_SERVER_ONLY,
>>> +                errmsg("backend with PID %d is holding a page lock. 
>>> Try again",
>>> +                    MyProcPid));
>>> 
>>> It seems more proper to output this message in DETAIL or HINT,
>>> instead. So how about something like the following messages?
>>> 
>>> LOG: could not log the query plan
>>> DETAIL: query plan cannot be logged while page level lock is being 
>>> held
>>> HINT: Try pg_log_query_plan() after a few ....
>> 
>> Agreed.
>> I felt the HINT message 'after a few ...' is difficult to describe, 
>> and wrote as below.
>> 
>> | HINT: Retrying pg_log_query_plan() might succeed since the lock 
>> duration of page level locks are usually short
>> 
>> How do you think?
> 
> Or we don't need HINT message?

Removed the HINT message.

> +                             errmsg("could not log the query plan"),
> +                             errdetail("query plan cannot be logged 
> while page level lock is
> being held"),
> 
> In detail message, the first word of sentences should be capitalized.
> How about "Cannot log the query plan while holding page-level lock.",
> instead?

Agreed.

> Thanks for updating the patch! Here are some review comments.
> 
> +      <row>
> +       <entry role="func_table_entry"><para role="func_signature">
> +        <indexterm>
> +         <primary>pg_log_query_plan</primary>
> +        </indexterm>
> 
> This entry is placed before one for pg_log_backend_memory_contexts().
> But it should be *after* that since those entries seem to be placed in
> alphabetical order in the table?

Modified it.

> +        Requests to log the plan of the query currently running on the
> +        backend with specified process ID along with the untruncated
> +        query string.
> 
> Other descriptions about logging of query string seem not to mention
> something like "untruncated query string". For example, auto_explain,
> log_statement, etc. Why do we need to mention "along with the
> untruncated query string" specially for pg_log_query_plan()?

Modified it as below:

              Requests to log the plan of the query currently running on 
the
     -        backend with specified process ID along with the 
untruncated
     -        query string.
     -        They will be logged at <literal>LOG</literal> message level 
and
     +        backend with specified process ID.
     +        It will be logged at <literal>LOG</literal> message level 
and

> +    Note that nested statements (statements executed inside a 
> function) are not
> +    considered for logging. Only the plan of the most deeply nested
> query is logged.
> 
> Now the plan of even nested statement can be logged. So this
> description needs to be updated?

Modified it as below:

     -    Note that nested statements (statements executed inside a 
function) are not
     -    considered for logging. Only the plan of the most deeply nested 
query is logged.
     +    Note that when the statements are executed inside a function, 
only the
     +    plan of the most deeply nested query is logged.

> @@ -440,6 +450,7 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
>       MemoryContextSwitchTo(oldcontext);
>  +    ActiveQueryDesc = NULL;
> 
> ActiveQueryDesc seems unnecessary. Why does ActiveQueryDesc need to be
> reset to NULL in standard_ExecutorFinish()?

ActiveQueryDesc should not be reset in standard_ExecutorFinish().
Removed it.

> Currently even during ProcessLogQueryPlanInterrupt(),
> pg_log_query_plan() can be call and another
> ProcessLogQueryPlanInterrupt() can be executed. So repeatable
> re-entrances to ProcessLogQueryPlanInterrupt() could cause "stack
> depth limit exceeded" error. To avoid this, shouldn't we make
> ProcessLogQueryPlanInterrupt() do nothing and return immediately, if
> it's called during another ProcessLogQueryPlanInterrupt()?
> 
> pg_log_backend_memory_contexts() also might have the same issue.

As you pointed out offlist, the issue could occur even when both 
pg_log_backend_memory_contexts and pg_log_query_plan are called and it 
may be better to make another patch.


You also pointed out offlist that it would be necessary to call 
LockErrorCleanup() if ProcessLogQueryPlanInterrupt() can incur ERROR.
AFAICS it can call ereport(ERROR), i.e., palloc0() in NewExplainState(), 
so I added PG_TRY/CATCH and make it call LockErrorCleanup() when ERROR 
occurs.



On 2022-02-01 17:27, Kyotaro Horiguchi wrote:

Thanks for reviewing Horiguchi-san!

> By the way, I'm anxious about the following part and I'd like to
> remove it.

I also think it would be nice if it's possible.
> 
> +        * Ensure no page lock is held on this process.
> 
> It seems to me what is wrong is ginInsertCleanup(), not this feature.

> Actually, the discussion is a bit dubious. What we need really to
check is wheter such locks are not held on an index *elsewhere*.

Since I'm not sure how long it will take to discuss this point, the 
attached patch is based on the current HEAD at this time.
I also think it may be better to discuss it on another thread.

-- 
Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION
Attachment

pgsql-hackers by date:

Previous
From: "kuroda.hayato@fujitsu.com"
Date:
Subject: RE: [Proposal] Add foreign-server health checks infrastructure
Next
From: Peter Eisentraut
Date:
Subject: Re: ICU for global collation