Re: RFC: Logging plan of the running query - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: RFC: Logging plan of the running query |
Date | |
Msg-id | CAExHW5ujL=mzbjxf6xtyU-Ft6LMBMEmt8H4ZSDiiZre0Fef0jg@mail.gmail.com Whole thread Raw |
In response to | Re: RFC: Logging plan of the running query (torikoshia <torikoshia@oss.nttdata.com>) |
Responses |
Re: RFC: Logging plan of the running query
|
List | pgsql-hackers |
Hi, On Tue, Oct 10, 2023 at 7:00 PM torikoshia <torikoshia@oss.nttdata.com> wrote: > Thanks for your comments! > > I tested v30 patch with > v28-0002-Testing-attempt-logging-plan-on-ever-CFI-call.patch which makes > CFI() call ProcessLogQueryPlanInterrupt() internally, and confirmed that > very few logging queries failed with v30 patch. > > This is a result in line with your prediction. > > > ``` > $ rg -c'ignored request for logging query plan due to lock confilcts' > postmaster.log > 8 Like many others I think this feature is useful to debug a long running query. Sorry for jumping late into this. I have a few of high level comments There is a lot of similarity between what this feature does and what auto explain does. I see the code is also duplicated. There is some merit in avoiding this duplication 1. we will get all the features of auto_explain automatically like choosing a format (this was expressed somebody earlier in this thread), setings etc. 2. avoid bugs. E.g your code switches context after ExplainState has been allocated. These states may leak depending upon when this function gets called. 3. Building features on top as James envisions will be easier. Considering the similarity with auto_explain I wondered whether this function should be part of auto_explain contrib module itself? If we do that users will need to load auto_explain extension and thus install executor hooks when this function doesn't need those. So may not be such a good idea. I didn't see any discussion on this. I tried following query to pass PID of a non-client backend to this function. #select pg_log_query_plan(pid), application_name, backend_type from pg_stat_activity where backend_type = 'autovacuum launcher'; pg_log_query_plan | application_name | backend_type -------------------+------------------+--------------------- t | | autovacuum launcher (1 row) I see "LOG: backend with PID 2733631 is not running a query or a subtransaction is aborted" in server logs. That's ok. But may be we should not send signal to these kinds of backends at all, thus avoiding some system calls. I am also wondering whether it's better to report the WARNING as status column in the output. E.g. instead of #select pg_log_query_plan(100); WARNING: PID 100 is not a PostgreSQL backend process pg_log_query_plan ------------------- f (1 row) we output #select pg_log_query_plan(100); pg_log_query_plan | status -------------------+--------------------------------------------- f | PID 100 is not a PostgreSQL backend process (1 row) That looks neater and can easily be handled by scripts, applications and such. But it will be inconsistent with other functions like pg_terminate_backend() and pg_log_backend_memory_contexts(). I do share a concern that was discussed earlier. If a query is running longer, there's something problematic with it. A diagnostic intervention breaking it further would be unwelcome. James has run experiments to shake this code for any loose breakages. He has not found any. So may be we are good. And we wouldn't know about very rare corner cases so easily without using it in the field. So fine with it. If we could add some safety net that will be great but may not be necessary for the first cut. -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: