Re: Get memory contexts of an arbitrary backend process - Mailing list pgsql-hackers
From | torikoshia |
---|---|
Subject | Re: Get memory contexts of an arbitrary backend process |
Date | |
Msg-id | bbecd722d4f8e261b350186ac4bf68a7@oss.nttdata.com Whole thread Raw |
In response to | Re: Get memory contexts of an arbitrary backend process (Fujii Masao <masao.fujii@oss.nttdata.com>) |
Responses |
Re: Get memory contexts of an arbitrary backend process
|
List | pgsql-hackers |
On 2021-03-30 02:28, Fujii Masao wrote: Thanks for reviewing and kind suggestions! >> It adds pg_log_backend_memory_contexts(pid) which logs memory contexts >> of the specified backend process. >> >> The number of child contexts to be logged per parent is limited to 100 >> as with MemoryContextStats(). >> >> As written in commit 7b5ef8f2d07, which limits the verbosity of >> memory context statistics dumps, it supposes that practical cases >> where the dump gets long will typically be huge numbers of >> siblings under the same parent context; while the additional >> debugging value from seeing details about individual siblings >> beyond 100 will not be large. >> >> Thoughts? > > I'm OK with 100. We should comment why we chose 100 for that. Added following comments. + /* + * When a backend process is consuming huge memory, logging all its + * memory contexts might overrun available disk space. To prevent + * this, we limit the number of child contexts per parent to 100. + * + * As with MemoryContextStats(), we suppose that practical cases + * where the dump gets long will typically be huge numbers of + * siblings under the same parent context; while the additional + * debugging value from seeing details about individual siblings + * beyond 100 will not be large. + */ + MemoryContextStatsDetail(TopMemoryContext, 100, false); > > Here are some review comments. > > Isn't it better to move HandleProcSignalLogMemoryContext() and > ProcessLogMemoryContextInterrupt() to mcxt.c from procsignal.c > (like the functions for notify interrupt are defined in async.c) > because they are the functions for memory contexts? Agreed. Also renamed HandleProcSignalLogMemoryContext to HandleLogMemoryContextInterrupt. > + * HandleProcSignalLogMemoryContext > + * > + * Handle receipt of an interrupt indicating log memory context. > + * Signal handler portion of interrupt handling. > > IMO it's better to comment why we need to separate the function into > two, > i.e., HandleProcSignalLogMemoryContext() and > ProcessLogMemoryContextInterrupt(), > like the comment for other similar function explains. What about the > followings? Thanks! Changed them to the suggested one. > ------------------------------- > HandleLogMemoryContextInterrupt > > Handle receipt of an interrupt indicating logging of memory contexts. > > All the actual work is deferred to ProcessLogMemoryContextInterrupt(), > because we cannot safely emit a log message inside the signal handler. > ------------------------------- > ProcessLogMemoryContextInterrupt > > Perform logging of memory contexts of this backend process. > > Any backend that participates in ProcSignal signaling must arrange to > call this function if we see LogMemoryContextPending set. It is called > from CHECK_FOR_INTERRUPTS(), which is enough because the target process > for logging of memory contexts is a backend. > ------------------------------- > > > + if (CheckProcSignal(PROCSIG_LOG_MEMORY_CONTEXT)) > + HandleProcSignalLogMemoryContext(); > + > if (CheckProcSignal(PROCSIG_BARRIER)) > HandleProcSignalBarrierInterrupt(); > > The code for memory context logging interrupt came after barrier > interrupt > in other places, e.g., procsignal.h. Why is this order of code > different? Fixed. > +/* > + * pg_log_backend_memory_contexts > + * Print memory context of the specified backend process. > > Isn't it better to move pg_log_backend_memory_contexts() to mcxtfuncs.c > from mcxt.c because this is the SQL function memory contexts? Agreed. > IMO we should comment why we allow only superuser to call this > function. > What about the following? Thanks! Modified the patch according to the suggestions. > ----------------- > Signal a backend process to log its memory contexts. > > Only superusers are allowed to signal to log the memory contexts > because allowing any users to issue this request at an unbounded rate > would cause lots of log messages and which can lead to denial of > service. > ----------------- > > + PGPROC *proc = BackendPidGetProc(pid); > + > + /* Check whether the target process is PostgreSQL backend process. */ > + if (proc == NULL) > > What about adding more comments as follows? > > --------------------------------- > + /* > + * BackendPidGetProc returns NULL if the pid isn't valid; but by the > time > + * we reach kill(), a process for which we get a valid proc here > might > + * have terminated on its own. There's no way to acquire a lock on > an > + * arbitrary process to prevent that. But since this mechanism is > usually > + * used to debug a backend running and consuming lots of memory, > + * that it might end on its own first and its memory contexts are not > + * logged is not a problem. > + */ > + if (proc == NULL) > + { > + /* > + * This is just a warning so a loop-through-resultset will not abort > + * if one backend logged its memory contexts during the run. > + */ > + ereport(WARNING, > --------------------------------- > > + ereport(ERROR, > + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE), > + errmsg("must be a superuser to log memory contexts"))); > + PG_RETURN_BOOL(false); > > This PG_RETURN_BOOL(false) is unnecessary because ereport() will emit > an ERROR? > > +$node->psql('postgres', 'select > pg_log_backend_memory_contexts(pg_backend_pid())'); > + > +my $logfile = slurp_file($node->logfile); > +like($logfile, qr/Grand total: \d+ bytes in \d+ blocks/, > + 'found expected memory context print in the log file'); > > Isn't there the case where the memory contexts have not been logged yet > when slurp_file() is called? It's rare, though. If there is the risk > for that, > we should wait for the memory contexts to be actually logged? > > BTW, I agree that we should have the regression test that calls > pg_log_backend_memory_contexts() and checks, e.g., that it doesn't > cause > segmentation fault. But I'm just wondering if it's a bit overkill to > make perl > scriptand start new node to test only this function... OK, I removed the perl TAP test and added a regression test running pg_log_backend_memory_contexts(pg_backend_pid()) and checking it returns 't'. Regards.
Attachment
pgsql-hackers by date: