Thread: Re: Getting FATAL: terminating connection due to administrator command
--------Peter Hopfgartner <peter.hopfgartner@r3-gis.com> wrote--------
Subject: Re: [GENERAL] Getting FATAL: terminating connection due to administrator command
Date: 16.09.2010 16:56

>--------Tom Lane <tgl@sss.pgh.pa.us> wrote--------
>Subject: Re: [GENERAL] Getting FATAL: terminating connection due to
>administrator command
>Date: 15.09.2010 17:40
>
>>Peter Hopfgartner <peter.hopfgartner@r3-gis.com> writes:
>>> --------Tom Lane <tgl@sss.pgh.pa.us> wrote--------
>>>> This is a result of something sending SIGTERM to the backend process.
>>
>>> Can I trace where the SIGTERM comes from?
>>
>>If this is a recent Red-Hat-based release, I think that systemtap could
>>probably be used to determine that. There's a script here that solves
>>a related problem:
>>http://sourceware.org/systemtap/examples/process/sigmon.stp
>

Now we had the error, but systemtap did not report any SIGTERM. Is it
possible to have this error without a SIGTERM being involved? As
mentioned in a previous mail, I've modified the script to report
SIGTERM sent to any process.

Peter
Peter Hopfgartner <peter.hopfgartner@r3-gis.com> writes:
> Now we had the error, but systemtap did not report any SIGTERM. Is it
> possible to have this error without a SIGTERM being involved?

Hmph. I would have said not, but ... What PG version is this exactly?

regards, tom lane
Re: Getting FATAL: terminating connection due to administrator command
From: fche@redhat.com (Frank Ch. Eigler)
Date:
Peter Hopfgartner <peter.hopfgartner@r3-gis.com> writes:

> [...]
> > >http://sourceware.org/systemtap/examples/process/sigmon.stp
> Now we had the error, but systemtap did not report any SIGTERM. Is
> it possible to have this error without a SIGTERM being involved? As
> mentioned in a previous mail, I've modified the script to report
> SIGTERM sent to any process.

There are some other possibilities. It's possible that the version of
stap you're using is not expanding signal.send to all possible paths
of the kernel dispatching signals to your process. So one might try a
few different things:

------------------------------------------------------------------------
# see what die() is getting to work with
probe process("/usr/bin/postgres").function("die") {
  printf("%s[%d] received %d\n", execname(), pid(), $postgres_signal_arg)
}

# check for another process sending SIGTERM
probe syscall.kill {
  if (sig == 15) {
    printf("%s[%d] sending %s\n", execname(), pid(), argstr)
    print_ubacktrace()
  }
}

# heck, trace the whole statement sequence during the signal handling
probe process("/usr/bin/postgres").statement("die@*:*"),
      process("/usr/bin/postgres").statement("ProcessInterrupts@*:*") {
  printf("%s %s\n", pp(), $$vars)
}
------------------------------------------------------------------------

You can run that in the background. The second probe will give
systemwide SIGTERM activity, so you may need to filter it a bit. If
you know the appropriate postmaster process-id, you could change the
syscall.kill probe:

<  if (sig == 15) {
>  if (sig == 15 && pid == target_pid()) {

and invoke the script with

   stap ... -x PID_OF_YOUR_POSTGRES_SERVER

(In this case, "sig" and "pid" come from the syscall arguments, that
is, they represent the intended signal recipient, rather than the
sender; see also 'stap -L signal.send'.) Note that postgres does
sometimes send signals to itself, so don't be surprised to see post*
processes show up there.

(A more modern system compiler & systemtap would give you much better
variable-value dumping options.)

- FChE
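
If the systemwide syscall.kill output has already been captured to a file, it can also be filtered after the fact. The following is only a rough Python sketch of that idea, not part of the original script. It assumes the lines look like the printf format above ("%s[%d] sending %s") and, as a further assumption, that argstr renders as "<target pid>, <signal name>" (for example "4242, SIGTERM"); check what your stap version actually prints. The function name sigterm_senders is made up for illustration.

```python
import re

# Matches lines written by the syscall.kill probe's
#   printf("%s[%d] sending %s\n", execname(), pid(), argstr)
# ASSUMPTION: argstr looks like "<target pid>, <signal name>",
# e.g. "4242, SIGTERM". Adjust the pattern if your stap prints it differently.
LINE_RE = re.compile(
    r"^(?P<sender>.+)\[(?P<sender_pid>\d+)\] sending "
    r"(?P<target_pid>-?\d+), (?P<signal>\w+)$"
)


def sigterm_senders(lines, target_pid):
    """Return (sender_name, sender_pid) for each SIGTERM aimed at target_pid.

    Lines that do not match the expected format (backtrace output,
    lines from the other probes) are skipped.
    """
    senders = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        if match.group("signal") != "SIGTERM":
            continue
        if int(match.group("target_pid")) != target_pid:
            continue
        senders.append((match.group("sender"), int(match.group("sender_pid"))))
    return senders
```

For example, sigterm_senders(open("stap.log"), 4242) would list every process that called kill() with SIGTERM on pid 4242, where "stap.log" and 4242 are placeholders for your saved output and the backend's process id.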