Thread: Server instrumentation: pg_terminate_backend, pg_reload_conf
This patch re-enables pg_terminate_backend, allowing a superuser (only, of
course) to terminate a backend. Judging from the discussion some weeks
ago, SIGTERM seems to be used quite widely without any reports of
misbehaviour, so while the code path is officially not too well tested,
in practice it works OK and is helpful.

pg_reload_conf is a SIGHUP issued from the client side and shouldn't cause
many problems.

Regards,
Andreas

Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.250
diff -u -r1.250 func.sgml
--- doc/src/sgml/func.sgml 23 May 2005 01:50:01 -0000 1.250
+++ doc/src/sgml/func.sgml 1 Jun 2005 20:49:09 -0000
@@ -8860,6 +8860,12 @@
    <indexterm zone="functions-admin">
     <primary>pg_cancel_backend</primary>
    </indexterm>
+   <indexterm zone="functions-admin">
+    <primary>pg_terminate_backend</primary>
+   </indexterm>
+   <indexterm zone="functions-admin">
+    <primary>pg_reload_conf</primary>
+   </indexterm>
 
    <indexterm zone="functions-admin">
     <primary>signal</primary>
@@ -8889,17 +8895,46 @@
        <entry><type>int</type></entry>
        <entry>Cancel a backend's current query</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_terminate_backend</function>(<parameter>pid</parameter>)</literal>
+       </entry>
+       <entry><type>int</type></entry>
+       <entry>Terminate a backend process</entry>
+      </row>
+      <row>
+       <entry>
+        <literal><function>pg_reload_conf</function>(<parameter></parameter>)</literal>
+       </entry>
+       <entry><type>int</type></entry>
+       <entry>Triggers the server processes to reload configuration files</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>
 
    <para>
-    This function returns 1 if successful, 0 if not successful.
+    These functions return 1 if successful, 0 if not successful.
     The process ID (<literal>pid</literal>) of an active backend can be found
     from the <structfield>procpid</structfield> column in the
     <structname>pg_stat_activity</structname> view, or by listing the <command>postgres</command>
     processes on the server with <application>ps</>.
    </para>
+   <para>
+    Terminating a backend with <function>pg_terminate_backend</>
+    should be used only as a last resort, i.e. if the backend process
+    doesn't react to <function>pg_cancel_backend</> any more and can't
+    be controlled otherwise. Since the exact state of the
+    backend at the moment of termination isn't precisely known, some
+    locked resources might remain in the server's shared memory
+    structure, effectively blocking other backends. In this case,
+    you'd have to stop and restart the postmaster.
+   </para>
+   <para>
+    <function>pg_reload_conf</> sends a SIGHUP event to the
+    postmaster, and thus triggers a reload of the configuration files
+    in all backend processes.
+   </para>
 
    <indexterm zone="functions-admin">
     <primary>pg_start_backup</primary>
@@ -8970,6 +9005,83 @@
     For details about proper usage of these functions, see
     <xref linkend="backup-online">.
    </para>
Index: src/backend/utils/adt/misc.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/misc.c,v
retrieving revision 1.43
diff -u -r1.43 misc.c
--- src/backend/utils/adt/misc.c 19 May 2005 21:35:47 -0000 1.43
+++ src/backend/utils/adt/misc.c 1 Jun 2005 20:49:13 -0000
@@ -101,22 +101,40 @@
     return 1;
 }
 
+
 Datum
 pg_cancel_backend(PG_FUNCTION_ARGS)
 {
     PG_RETURN_INT32(pg_signal_backend(PG_GETARG_INT32(0), SIGINT));
 }
 
-#ifdef NOT_USED
-
-/* Disabled in 8.0 due to reliability concerns; FIXME someday */
 Datum
 pg_terminate_backend(PG_FUNCTION_ARGS)
 {
     PG_RETURN_INT32(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM));
 }
 
-#endif
+
+
+Datum
+pg_reload_conf(PG_FUNCTION_ARGS)
+{
+    if (!superuser())
+        ereport(ERROR,
+                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                 (errmsg("only superuser can signal the postmaster"))));
+
+    if (kill(PostmasterPid, SIGHUP))
+    {
+        ereport(WARNING,
+                (errmsg("failed to send signal to postmaster: %m")));
+
+        PG_RETURN_INT32(0);
+    }
+
+    PG_RETURN_INT32(1);
+}
+
 
 /* Function to find out which databases make use of a tablespace */
 
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.363
diff -u -r1.363 pg_proc.h
--- src/include/catalog/pg_proc.h 20 May 2005 01:29:55 -0000 1.363
+++ src/include/catalog/pg_proc.h 1 Jun 2005 20:49:31 -0000
@@ -3016,12 +3016,16 @@
 DESCR("is conversion visible in search path?");
 
 
+DATA(insert OID = 2168 ( pg_terminate_backend PGNSP PGUID 12 f f t f v 1 23 "23" _null_ _null_ _null_ pg_terminate_backend - _null_ ));
+DESCR("Terminate a server process");
 DATA(insert OID = 2171 ( pg_cancel_backend PGNSP PGUID 12 f f t f v 1 23 "23" _null_ _null_ _null_ pg_cancel_backend - _null_ ));
 DESCR("Cancel a server process' current query");
 DATA(insert OID = 2172 ( pg_start_backup PGNSP PGUID 12 f f t f v 1 25 "25" _null_ _null_ _null_ pg_start_backup - _null_ ));
 DESCR("Prepare for taking an online backup");
 DATA(insert OID = 2173 ( pg_stop_backup PGNSP PGUID 12 f f t f v 0 25 "" _null_ _null_ _null_ pg_stop_backup - _null_ ));
 DESCR("Finish taking an online backup");
+DATA(insert OID = 2284 ( pg_reload_conf PGNSP PGUID 12 f f t f v 0 23 "" _null_ _null_ _null_ pg_reload_conf - _null_ ));
+DESCR("Reloads configuration files");
 
 
 /* Aggregates (moved here from pg_aggregate for 7.3) */
Index: src/include/utils/builtins.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/builtins.h,v
retrieving revision 1.257
diff -u -r1.257 builtins.h
--- src/include/utils/builtins.h 27 May 2005 00:57:49 -0000 1.257
+++ src/include/utils/builtins.h 1 Jun 2005 20:49:34 -0000
@@ -362,8 +362,10 @@
 extern Datum nonnullvalue(PG_FUNCTION_ARGS);
 extern Datum current_database(PG_FUNCTION_ARGS);
 extern Datum pg_cancel_backend(PG_FUNCTION_ARGS);
+extern Datum pg_terminate_backend(PG_FUNCTION_ARGS);
+extern Datum pg_reload_conf(PG_FUNCTION_ARGS);
 extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS);
 
 /* not_in.c */
 extern Datum int4notin(PG_FUNCTION_ARGS);
 extern Datum oidnotin(PG_FUNCTION_ARGS);
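
For readers who want to try the reload mechanism outside the server, here is a minimal standalone C sketch. It is not PostgreSQL code. It only mimics the shape of pg_reload_conf from the patch: send SIGHUP to a process and return 1 on success, 0 on failure. The names send_reload_signal, install_sighup_handler and got_sighup are made up for this example, and the superuser check and ereport calls from the patch are left out.

```c
#define _POSIX_C_SOURCE 200809L

#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Set by the handler; a real server would re-read its config files on seeing it. */
static volatile sig_atomic_t got_sighup = 0;

/* Async-signal-safe handler: it only records that SIGHUP arrived. */
static void
sighup_handler(int signo)
{
    (void) signo;
    got_sighup = 1;
}

/* Hypothetical helper: install the handler; returns 1 on success, 0 on failure. */
int
install_sighup_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = sighup_handler;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGHUP, &sa, NULL) == 0 ? 1 : 0;
}

/*
 * Hypothetical helper with the same return convention as the patch:
 * 1 if the signal was sent, 0 if not.
 */
int
send_reload_signal(pid_t target)
{
    /* pid 0 and negative pids address process groups; refuse them here. */
    if (target <= 0)
        return 0;

    if (kill(target, SIGHUP) != 0)
    {
        fprintf(stderr, "failed to send signal: %s\n", strerror(errno));
        return 0;
    }
    return 1;
}
```

Calling install_sighup_handler() and then send_reload_signal(getpid()) in a single-threaded program sets got_sighup. The handler does nothing but set a flag, so the actual reload work would happen later at a safe point in the main loop.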
Andreas Pflug wrote:
> This patch reenables pg_terminate_backend, allowing (superuser only, of
> course) to terminate a backend. As taken from the discussion some weeks
> earlier, SIGTERM seems to be used quite widely, without a report of
> misbehavior so while the code path is officially not too well tested,
> in practice it's working ok and helpful.

I thought we had a discussion that the places we accept SIGTERM might be
places that can exit if the postmaster is shutting down, but might not
be places we can exit if the postmaster continues running, e.g. holding
locks. Have you checked all the places we honor SIGTERM to check that
we are safe to exit? I know Tom had concerns about that.

Looking at ProcessInterrupts() and friends, when it is called with
QueryCancelPending set, it does elog(ERROR) and longjumps out of elog,
and that cleans up some stuff. The problem with SIGTERM/ProcDiePending
is that it just does a FATAL and I assume doesn't do the same cleanups
that elog(ERROR) does to cancel a query.

Ideally we would use another signal number that would do a query
cancel; then, up in the recovery code after the longjump, after we had
reset everything, we could exit. The problem, I think, is that we
don't have another signal available for use. I see this in postgres.c:

    pqsignal(SIGHUP, SigHupHandler);          /* set flag to read config file */
    pqsignal(SIGINT, StatementCancelHandler); /* cancel current query */
    pqsignal(SIGTERM, die);                   /* cancel current query and exit */
    pqsignal(SIGQUIT, quickdie);              /* hard crash time */
    pqsignal(SIGALRM, handle_sig_alarm);      /* timeout conditions */

    /*
     * Ignore failure to write to frontend. Note: if frontend closes
     * connection, we will notice it and exit cleanly when control next
     * returns to outer loop. This seems safer than forcing exit in the
     * midst of output during who-knows-what operation...
     */
    pqsignal(SIGPIPE, SIG_IGN);
    pqsignal(SIGUSR1, CatchupInterruptHandler);
    pqsignal(SIGUSR2, NotifyInterruptHandler);
    pqsignal(SIGFPE, FloatExceptionHandler);

It would be neat if we could do a combined Cancel/Terminate signal, but
signals don't work that way.

Any ideas on how we can do a combined cancel/terminate? Do we have a
shared area that both the postmaster and the backends can see? Could we
set a flag when the postmaster is shutting down and then when a backend
gets a SIGTERM, it could either shut down right away or do the cancel
and then shut down? I don't think we can do query cancel for
server-wide backend shutdowns --- it should be as quick as possible.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
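
The "cancel first, then exit" idea from the message above can be sketched in isolation. This is standalone C with a toy query loop, not backend code. cancel_pending and die_pending stand in for QueryCancelPending and ProcDiePending; locks_held, exit_requested, install_handlers, process_interrupts and run_query are hypothetical names invented for the example. In the sketch SIGTERM is routed through the same recovery path as a cancel, so cleanup runs before an exit is requested. That is the proposal being discussed, whereas the message describes the existing SIGTERM path as going straight to FATAL.

```c
#define _POSIX_C_SOURCE 200809L

#include <setjmp.h>
#include <signal.h>
#include <string.h>

/* Flags set by the handlers and examined only at safe points. */
static volatile sig_atomic_t cancel_pending = 0;
static volatile sig_atomic_t die_pending = 0;

static sigjmp_buf recovery_point;   /* where a cancelled query unwinds to */
static int locks_held = 0;          /* toy stand-in for held resources */
static int exit_requested = 0;      /* set once it is safe to exit */

static void
cancel_handler(int signo)
{
    (void) signo;
    cancel_pending = 1;
}

static void
die_handler(int signo)
{
    (void) signo;
    die_pending = 1;
}

/* Install SIGINT/SIGTERM handlers; returns 1 on success, 0 on failure. */
int
install_handlers(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = cancel_handler;
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return 0;
    sa.sa_handler = die_handler;
    return sigaction(SIGTERM, &sa, NULL) == 0 ? 1 : 0;
}

/* Called at safe points: both cancel and die unwind via the cancel path. */
void
process_interrupts(void)
{
    if (cancel_pending || die_pending)
    {
        cancel_pending = 0;
        siglongjmp(recovery_point, 1);
    }
}

/* Toy query: returns 1 if it ran to completion, 0 if it was interrupted. */
int
run_query(int steps)
{
    int i;

    if (sigsetjmp(recovery_point, 1) != 0)
    {
        /* Recovery path: release resources first ... */
        locks_held = 0;
        /* ... and only then honour a pending termination request. */
        if (die_pending)
            exit_requested = 1;
        return 0;
    }

    locks_held = 1;
    for (i = 0; i < steps; i++)
        process_interrupts();
    locks_held = 0;
    return 1;
}
```

A caller would check exit_requested after run_query() returns and leave its main loop there. The sketch does not address the other concern raised above, namely that a server-wide shutdown should not wait for a query cancel.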
Bruce Momjian wrote:
> Andreas Pflug wrote:
>
>>This patch reenables pg_terminate_backend, allowing (superuser only, of
>>course) to terminate a backend. As taken from the discussion some weeks
>>earlier, SIGTERM seems to be used quite widely, without a report of
>>misbehavior so while the code path is officially not too well tested,
>>in practice it's working ok and helpful.
>
>
> I thought we had a discussion that the places we accept SIGTERM might be
> places that can exit if the postmaster is shutting down, but might not
> be places we can exit if the postmaster continues running, e.g. holding
> locks. Have you checked all the places we honor SIGTERM to check that
> we are safe to exit? I know Tom had concerns about that.

My patch is purely to enable a supervisor to issue a SIGTERM using a
pgsql client, instead of doing it from a server command line. It's not
meant to fix the underlying problems.

Regards,
Andreas
Andreas Pflug wrote:
> Bruce Momjian wrote:
> > Andreas Pflug wrote:
> >
> >>This patch reenables pg_terminate_backend, allowing (superuser only, of
> >>course) to terminate a backend. As taken from the discussion some weeks
> >>earlier, SIGTERM seems to be used quite widely, without a report of
> >>misbehavior so while the code path is officially not too well tested,
> >>in practice it's working ok and helpful.
> >
> >
> > I thought we had a discussion that the places we accept SIGTERM might be
> > places that can exit if the postmaster is shutting down, but might not
> > be places we can exit if the postmaster continues running, e.g. holding
> > locks. Have you checked all the places we honor SIGTERM to check that
> > we are safe to exit? I know Tom had concerns about that.
>
> My patch is purely to enable a supervisor to issue a SIGTERM using a
> pgsql client, instead of doing it from a server command line. It's not
> meant to fix the underlying problems.

We don't support sending SIGTERM from the server command line to
individual backends, so why add support for it in SQL?

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
Bruce Momjian wrote:
> Andreas Pflug wrote:
>
>>Bruce Momjian wrote:
>>
>>>Andreas Pflug wrote:
>>>
>>>
>>>>This patch reenables pg_terminate_backend, allowing (superuser only, of
>>>>course) to terminate a backend. As taken from the discussion some weeks
>>>>earlier, SIGTERM seems to be used quite widely, without a report of
>>>>misbehavior so while the code path is officially not too well tested,
>>>>in practice it's working ok and helpful.
>>>
>>>
>>>I thought we had a discussion that the places we accept SIGTERM might be
>>>places that can exit if the postmaster is shutting down, but might not
>>>be places we can exit if the postmaster continues running, e.g. holding
>>>locks. Have you checked all the places we honor SIGTERM to check that
>>>we are safe to exit? I know Tom had concerns about that.
>>
>>My patch is purely to enable a supervisor to issue a SIGTERM using a
>>pgsql client, instead of doing it from a server command line. It's not
>>meant to fix the underlying problems.
>
>
> We don't support sending SIGTERM from the server command line to
> individual backends, so why add support for it in SQL?

I don't want to slip into a discussion of whether it's good to SIGTERM a
backend or not; it is in use. So drop it if you don't like clients
having the same facilities as console users.

BTW, I've got a lot of other instrumentation stuff pending, which I
originally wanted to post one by one to allow individual discussion, but
I'm running out of time before feature freeze. Apparently I'll have to
post it all at once.

Regards,
Andreas