Thread: Fix pg_log_backend_memory_contexts() 's delay
Hi, Log output takes time between several seconds to a few tens when using ‘SELECT pg_log_backend_memory_contexts(1234)’ with PID of ‘autovacuum launcher’. I made a patch for this problem. regards, Koyu Tanigawa
Attachment
On Tue, Oct 5, 2021 at 2:50 PM bt21tanigaway <bt21tanigaway@oss.nttdata.com> wrote: > > Hi, > > Log output takes time between several seconds to a few tens when using > ‘SELECT pg_log_backend_memory_contexts(1234)’ with PID of ‘autovacuum > launcher’. > I made a patch for this problem. Thanks for the patch. Do we also need to do the change in HandleMainLoopInterrupts, HandleCheckpointerInterrupts, HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call CHECK_FOR_INTERRUPTS() there? And there are also other processes: pgstat process/statistics collector, syslogger, walreceiver, walsender, background workers, parallel workers and so on. I think we need to change in all the processes where we don't call CHECK_FOR_INTERRUPTS() in their main loops. Before doing that, we need to be sure of whether we allow only the user sessions/backend process's memory contexts with pg_log_backend_memory_contexts or any process that is forked by postmaster i.e. auxiliary process? The function pg_log_backend_memory_contexts supports the processes that return a pgproc structure from this function BackendPidGetProc, it doesn't attempt to get pgproc structure from AuxiliaryPidGetProc. Regards, Bharath Rupireddy.
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes: > On Tue, Oct 5, 2021 at 2:50 PM bt21tanigaway > <bt21tanigaway@oss.nttdata.com> wrote: >> Log output takes time between several seconds to a few tens when using >> ‘SELECT pg_log_backend_memory_contexts(1234)’ with PID of ‘autovacuum >> launcher’. >> I made a patch for this problem. > Thanks for the patch. Do we also need to do the change in > HandleMainLoopInterrupts, HandleCheckpointerInterrupts, > HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call > CHECK_FOR_INTERRUPTS() there? It's not real clear to me why we need to care about this in those processes' idle loops. Their memory consumption is unlikely to be very interesting in that state, nor could it change before they wake up. regards, tom lane
On Tue, Oct 5, 2021 at 8:28 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > It's not real clear to me why we need to care about this in those > processes' idle loops. Their memory consumption is unlikely to be > very interesting in that state, nor could it change before they > wake up. Perhaps that's so, but it doesn't seem like a good reason not to make them more responsive. -- Robert Haas EDB: http://www.enterprisedb.com
On Tue, Oct 05, 2021 at 12:16:06PM -0400, Robert Haas wrote: > Perhaps that's so, but it doesn't seem like a good reason not to make > them more responsive. Yeah, that's still some information that the user asked for. Looking at the things that have a PGPROC entry, should we worry about the main loop of the logical replication launcher? -- Michael
Attachment
On Wed, Oct 6, 2021 at 5:10 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Oct 05, 2021 at 12:16:06PM -0400, Robert Haas wrote: > > Perhaps that's so, but it doesn't seem like a good reason not to make > > them more responsive. > > Yeah, that's still some information that the user asked for. Looking > at the things that have a PGPROC entry, should we worry about the main > loop of the logical replication launcher? IMHO, we can support all the processes which return a PGPROC entry by both BackendPidGetProc and AuxiliaryPidGetProc where the AuxiliaryPidGetProc would cover the following processes. I'm not sure one is interested in the memory context info of auxiliary processes. /* * We set aside some extra PGPROC structures for auxiliary processes, * ie things that aren't full-fledged backends but need shmem access. * * Background writer, checkpointer, WAL writer and archiver run during normal * operation. Startup process and WAL receiver also consume 2 slots, but WAL * writer is launched only after startup has exited, so we only need 5 slots. */ #define NUM_AUXILIARY_PROCS 5 Regards, Bharath Rupireddy.
Thanks for your review. >> Thanks for the patch. Do we also need to do the change in >> HandleMainLoopInterrupts, HandleCheckpointerInterrupts, >> HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call >> CHECK_FOR_INTERRUPTS() there? > Yeah, that's still some information that the user asked for. Looking > at the things that have a PGPROC entry, should we worry about the main > loop of the logical replication launcher? ・Now, the target of “pg_log_backend_memory_contexts()” is “autovacuum launcher” and “logical replication launcher”. I observed that the delay occurred only in “autovacuum launcher” not in “logical replication launcher”. ・”autovacuum launcher” used “HandleAutoVacLauncherInterrupts()” ( not including “ProcessLogMemoryContextInterrupt()” ) instead of “ProcessInterrupts() ( including “ProcessLogMemoryContextInterrupt()” ). “ProcessLogMemoryContextInterrupt()” will not be executed until the next “ProcessInterrupts()” is executed. So, I added “ProcessLogMemoryContextInterrupt()”. ・”logical replication launcher” uses only “ProcessInterrupts()”. So, We don’t have to fix it. > IMHO, we can support all the processes which return a PGPROC entry by > both BackendPidGetProc and AuxiliaryPidGetProc where the > AuxiliaryPidGetProc would cover the following processes. I'm not sure > one is interested in the memory context info of auxiliary processes. ・The purpose of this patch is to solve the delay problem, so I would like another patch to deal with “ BackendPidGetProc” and “AuxiliaryPidGetProc”. Regards, Koyu Tanigawa
On 2021/10/06 17:14, bt21tanigaway wrote: > Thanks for your review. > >>> Thanks for the patch. Do we also need to do the change in >>> HandleMainLoopInterrupts, HandleCheckpointerInterrupts, >>> HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call >>> CHECK_FOR_INTERRUPTS() there? > >> Yeah, that's still some information that the user asked for. Looking >> at the things that have a PGPROC entry, should we worry about the main >> loop of the logical replication launcher? > > ・Now, the target of “pg_log_backend_memory_contexts()” is “autovacuum launcher” and “logical replication launcher”. Iobserved that the delay occurred only in “autovacuum launcher” not in “logical replication launcher”. > ・”autovacuum launcher” used “HandleAutoVacLauncherInterrupts()” ( not including “ProcessLogMemoryContextInterrupt()” )instead of “ProcessInterrupts() ( including “ProcessLogMemoryContextInterrupt()” ). “ProcessLogMemoryContextInterrupt()”will not be executed until the next “ProcessInterrupts()” is executed. So, I added “ProcessLogMemoryContextInterrupt()”. > ・”logical replication launcher” uses only “ProcessInterrupts()”. So, We don’t have to fix it. Yes. >> IMHO, we can support all the processes which return a PGPROC entry by >> both BackendPidGetProc and AuxiliaryPidGetProc where the >> AuxiliaryPidGetProc would cover the following processes. I'm not sure >> one is interested in the memory context info of auxiliary processes. I like this idea because it seems helpful at least for debug purpose. > ・The purpose of this patch is to solve the delay problem, so I would like another patch to deal with “ BackendPidGetProc”and “AuxiliaryPidGetProc”. +1 to improve those things separately. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
On Fri, Oct 8, 2021 at 8:58 PM Fujii Masao <masao.fujii@oss.nttdata.com> wrote: > >>> Thanks for the patch. Do we also need to do the change in > >>> HandleMainLoopInterrupts, HandleCheckpointerInterrupts, > >>> HandlePgArchInterrupts, HandleWalWriterInterrupts as we don't call > >>> CHECK_FOR_INTERRUPTS() there? > > > >> Yeah, that's still some information that the user asked for. Looking > >> at the things that have a PGPROC entry, should we worry about the main > >> loop of the logical replication launcher? > > > > ・Now, the target of “pg_log_backend_memory_contexts()” is “autovacuum launcher” and “logical replication launcher”. I observed that the delay occurred only in “autovacuum launcher” not in “logical replication launcher”. > > ・”autovacuum launcher” used “HandleAutoVacLauncherInterrupts()” ( not including “ProcessLogMemoryContextInterrupt()”) instead of “ProcessInterrupts() ( including “ProcessLogMemoryContextInterrupt()” ).“ProcessLogMemoryContextInterrupt()” will not be executed until the next “ProcessInterrupts()” is executed. So, I added“ProcessLogMemoryContextInterrupt()”. > > ・”logical replication launcher” uses only “ProcessInterrupts()”. So, We don’t have to fix it. > > Yes. +1 to keep this thread for fixing the pg_log_backend_memory_contexts() issue for the autovacuum launcher. And the patch "fix_log_output_delay" looks good to me. I think we can add a CF entry. > >> IMHO, we can support all the processes which return a PGPROC entry by > >> both BackendPidGetProc and AuxiliaryPidGetProc where the > >> AuxiliaryPidGetProc would cover the following processes. I'm not sure > >> one is interested in the memory context info of auxiliary processes. > > I like this idea because it seems helpful at least for debug purpose. > > > > ・The purpose of this patch is to solve the delay problem, so I would like another patch to deal with “ BackendPidGetProc”and “AuxiliaryPidGetProc”. > > +1 to improve those things separately. I started a separate thread [1], and I have a couple of open points there. Please feel free to provide your thoughts in [1]. [1] https://www.postgresql.org/message-id/flat/CALj2ACU1nBzpacOK2q%3Da65S_4%2BOaz_rLTsU1Ri0gf7YUmnmhfQ%40mail.gmail.com Regards, Bharath Rupireddy.
Thanks for the patch! It might be self-evident, but since there are comments on other process handlings in HandleAutoVacLauncherInterrupts like below, how about adding a comment for the consistency? /* Process barrier events */ if (ProcSignalBarrierPending) ProcessProcSignalBarrier(); /* Process sinval catchup interrupts that happened while sleeping */ ProcessCatchupInterrupt(); Regards, -- Atsushi Torikoshi NTT DATA CORPORATION
On 2021/10/11 14:28, torikoshia wrote: > Thanks for the patch! > > It might be self-evident, but since there are comments on other process handlings in HandleAutoVacLauncherInterrupts likebelow, how about adding a comment for the consistency? +1 I applied such cosmetic changes to the patch. Patch attached. Barring any objection, I will commit it and back-port to v14. Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION
Attachment
On 2021/10/11 14:40, Fujii Masao wrote: > > > On 2021/10/11 14:28, torikoshia wrote: >> Thanks for the patch! >> >> It might be self-evident, but since there are comments on other process handlings in HandleAutoVacLauncherInterrupts likebelow, how about adding a comment for the consistency? > > +1 > > I applied such cosmetic changes to the patch. Patch attached. > Barring any objection, I will commit it and back-port to v14. Pushed. Thanks! Regards, -- Fujii Masao Advanced Computing Technology Center Research and Development Headquarters NTT DATA CORPORATION