Re: Printing backtrace of postgres processes - Mailing list pgsql-hackers
From | vignesh C |
---|---|
Subject | Re: Printing backtrace of postgres processes |
Date | |
Msg-id | CALDaNm2eiNpx6H3ONum_XWBnF7ma1x=wiszEZak3gzmTgMT_bQ@mail.gmail.com Whole thread Raw |
In response to | Re: Printing backtrace of postgres processes (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Printing backtrace of postgres processes
|
List | pgsql-hackers |
On Tue, Apr 5, 2022 at 9:18 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Wed, Mar 30, 2022 at 12:03 PM Justin Pryzby <pryzby@telsasoft.com> wrote: > > On Wed, Mar 30, 2022 at 11:53:52AM -0400, Greg Stark wrote: > > > Sadly the cfbot is showing a patch conflict again. It's just a trivial > > > conflict in the regression test schedule so I'm not going to update > > > the status but it would be good to rebase it so we continue to get > > > cfbot testing. > > > > Done. No changes. > > + if (chk_auxiliary_proc) > + ereport(WARNING, > + errmsg("PID %d is not a PostgreSQL server process", pid)); > + else > + ereport(WARNING, > + errmsg("PID %d is not a PostgreSQL backend process", pid)); > > This doesn't look right to me. I don't think that PostgreSQL server > processes are one kind of thing and PostgreSQL backend processes are > another kind of thing. I think they're two terms that are pretty > nearly interchangeable, or maybe at best you want to argue that > backend processes are some subset of server processes. I don't see > this sort of thing adding any clarity. I have changed it to "PID %d is not a PostgreSQL server process" which is similar to the existing warning message in pg_log_backend_memory_contexts. > -static void > +void > set_backtrace(ErrorData *edata, int num_skip) > { > StringInfoData errtrace; > @@ -978,7 +978,18 @@ set_backtrace(ErrorData *edata, int num_skip) > "backtrace generation is not supported by this installation"); > #endif > > - edata->backtrace = errtrace.data; > + if (edata) > + edata->backtrace = errtrace.data; > + else > + { > + /* > + * LOG_SERVER_ONLY is used to make sure the backtrace is never > + * sent to client. We want to avoid messing up the other client > + * session. > + */ > + elog(LOG_SERVER_ONLY, "current backtrace:%s", errtrace.data); > + pfree(errtrace.data); > + } > } > > This looks like a grotty hack. I have changed it so that the backtrace is set and returned to the caller. The caller will take care of logging or setting it to the error data context. Thoughts? > - PGPROC *proc = BackendPidGetProc(pid); > - > - /* > - * BackendPidGetProc returns NULL if the pid isn't valid; but by the time > - * we reach kill(), a process for which we get a valid proc here might > - * have terminated on its own. There's no way to acquire a lock on an > - * arbitrary process to prevent that. But since so far all the callers of > - * this mechanism involve some request for ending the process anyway, that > - * it might end on its own first is not a problem. > - * > - * Note that proc will also be NULL if the pid refers to an auxiliary > - * process or the postmaster (neither of which can be signaled via > - * pg_signal_backend()). > - */ > - if (proc == NULL) > - { > - /* > - * This is just a warning so a loop-through-resultset will not abort > - * if one backend terminated on its own during the run. > - */ > - ereport(WARNING, > - (errmsg("PID %d is not a PostgreSQL backend process", pid))); > + PGPROC *proc; > > + /* Users can only signal valid backend or an auxiliary process. */ > + proc = CheckPostgresProcessId(pid, false, NULL); > + if (!proc) > return SIGNAL_BACKEND_ERROR; > - } > > Incidentally changing the behavior of pg_signal_backend() doesn't seem > like a great idea. We can do that as a separate commit, after > considering whether documentation changes are needed. But it's not > something that should get folded into a commit on another topic. Agreed. I have kept the logic of pg_signal_backend as it is. pg_log_backtrace and pg_log_backend_memory_contexts use the same functionality to check and send signal. I have added a new function "CheckProcSendDebugSignal" which has the common code implementation and will be called by both pg_log_backtrace and pg_log_backend_memory_contexts. > + /* > + * BackendPidGetProc() and AuxiliaryPidGetProc() return NULL if the pid > + * isn't valid; but by the time we reach kill(), a process for which we > + * get a valid proc here might have terminated on its own. There's no way > + * to acquire a lock on an arbitrary process to prevent that. But since > + * this mechanism is usually used to debug a backend or an auxiliary > + * process running and consuming lots of memory, that it might end on its > + * own first without logging the requested info is not a problem. > + */ > > This comment made a lot more sense where it used to be than it does > where it is now. I think more work is required here than just cutting > and pasting. This function was called from pg_signal_backend earlier, this logic is now moved to CheckProcSendDebugSignal which will be called only by pg_log_backtrace and pg_log_backend_memory_contexts. This looks appropriate in CheckProcSendDebugSignal with slight change to specify it can consume lots of memory or a long running process. Thoughts? Attached v20 patch has the changes for the same. Regards, Vignesh
Attachment
pgsql-hackers by date: