Thread: Server instrumentation: pg_terminate_backend, pg_reload_conf

Server instrumentation: pg_terminate_backend, pg_reload_conf

From
Andreas Pflug
Date:
This patch re-enables pg_terminate_backend, allowing a superuser (and only
a superuser, of course) to terminate a backend. As taken from the discussion
some weeks earlier, SIGTERM seems to be used quite widely without a report of
misbehaviour, so while the code path is officially not too well tested,
in practice it's working ok and helpful.

pg_reload_conf is a SIGHUP issued from the client side; it shouldn't
cause many problems.

Regards,
Andreas
Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /projects/cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.250
diff -u -r1.250 func.sgml
--- doc/src/sgml/func.sgml    23 May 2005 01:50:01 -0000    1.250
+++ doc/src/sgml/func.sgml    1 Jun 2005 20:49:09 -0000
@@ -8860,6 +8860,12 @@
    <indexterm zone="functions-admin">
     <primary>pg_cancel_backend</primary>
    </indexterm>
+   <indexterm zone="functions-admin">
+    <primary>pg_terminate_backend</primary>
+   </indexterm>
+   <indexterm zone="functions-admin">
+    <primary>pg_reload_conf</primary>
+   </indexterm>

    <indexterm zone="functions-admin">
     <primary>signal</primary>
@@ -8889,17 +8895,46 @@
        <entry><type>int</type></entry>
        <entry>Cancel a backend's current query</entry>
       </row>
+      <row>
+       <entry>
+        <literal><function>pg_terminate_backend</function>(<parameter>pid</parameter>)</literal>
+        </entry>
+       <entry><type>int</type></entry>
+       <entry>Terminate a backend process</entry>
+      </row>
+      <row>
+       <entry>
+        <literal><function>pg_reload_conf</function>(<parameter></parameter>)</literal>
+        </entry>
+       <entry><type>int</type></entry>
+       <entry>Triggers the server processes to reload configuration files</entry>
+      </row>
      </tbody>
     </tgroup>
    </table>

    <para>
-    This function returns 1 if successful, 0 if not successful.
+    These functions return 1 if successful, 0 if not successful.
     The process ID (<literal>pid</literal>) of an active backend can be found
     from the <structfield>procpid</structfield> column in the
     <structname>pg_stat_activity</structname> view, or by listing the <command>postgres</command>
     processes on the server with <application>ps</>.
    </para>
+   <para>
+    Terminating a backend with <function>pg_terminate_backend</>
+    should be used only as a last resort, i.e. if the backend process
+    doesn't react to <function>pg_cancel_backend</> any more and can't
+    be controlled otherwise. Since the exact state of the
+    backend at the moment of termination isn't precisely known, some
+    locked resources might remain in the server's shared memory
+    structure, effectively blocking other backends. In this case,
+    you'd have to stop and restart the postmaster.
+   </para>
+   <para>
+    <function>pg_reload_conf</> sends a SIGHUP event to the
+    postmaster, and thus triggers a reload of the configuration files
+    in all backend processes.
+   </para>

    <indexterm zone="functions-admin">
     <primary>pg_start_backup</primary>
@@ -8970,6 +9005,83 @@
     For details about proper usage of these functions, see
     <xref linkend="backup-online">.
    </para>
Index: src/backend/utils/adt/misc.c
===================================================================
RCS file: /projects/cvsroot/pgsql/src/backend/utils/adt/misc.c,v
retrieving revision 1.43
diff -u -r1.43 misc.c
--- src/backend/utils/adt/misc.c    19 May 2005 21:35:47 -0000    1.43
+++ src/backend/utils/adt/misc.c    1 Jun 2005 20:49:13 -0000
@@ -101,22 +101,40 @@
     return 1;
 }

+
 Datum
 pg_cancel_backend(PG_FUNCTION_ARGS)
 {
     PG_RETURN_INT32(pg_signal_backend(PG_GETARG_INT32(0), SIGINT));
 }

-#ifdef NOT_USED
-
-/* Disabled in 8.0 due to reliability concerns; FIXME someday */

 Datum
 pg_terminate_backend(PG_FUNCTION_ARGS)
 {
     PG_RETURN_INT32(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM));
 }
-#endif
+
+
+Datum
+pg_reload_conf(PG_FUNCTION_ARGS)
+{
+    if (!superuser())
+        ereport(ERROR,
+                (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+                 (errmsg("only superuser can signal the postmaster"))));
+
+    if (kill(PostmasterPid, SIGHUP))
+    {
+        ereport(WARNING,
+                (errmsg("failed to send signal to postmaster: %m")));
+
+        PG_RETURN_INT32(0);
+    }
+
+    PG_RETURN_INT32(1);
+}
+


 /* Function to find out which databases make use of a tablespace */
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.363
diff -u -r1.363 pg_proc.h
--- src/include/catalog/pg_proc.h    20 May 2005 01:29:55 -0000    1.363
+++ src/include/catalog/pg_proc.h    1 Jun 2005 20:49:31 -0000
@@ -3016,12 +3016,16 @@
 DESCR("is conversion visible in search path?");


+DATA(insert OID = 2168 ( pg_terminate_backend    PGNSP PGUID 12 f f t f v 1 23 "23" _null_ _null_ _null_ pg_terminate_backend - _null_ ));
+DESCR("Terminate a server process");
 DATA(insert OID = 2171 ( pg_cancel_backend        PGNSP PGUID 12 f f t f v 1 23 "23" _null_ _null_ _null_ pg_cancel_backend - _null_ ));
 DESCR("Cancel a server process' current query");
 DATA(insert OID = 2172 ( pg_start_backup        PGNSP PGUID 12 f f t f v 1 25 "25" _null_ _null_ _null_ pg_start_backup - _null_ ));
 DESCR("Prepare for taking an online backup");
 DATA(insert OID = 2173 ( pg_stop_backup            PGNSP PGUID 12 f f t f v 0 25 "" _null_ _null_ _null_ pg_stop_backup - _null_ ));
 DESCR("Finish taking an online backup");
+DATA(insert OID = 2284 ( pg_reload_conf         PGNSP PGUID 12 f f t f v 0 23 "" _null_ _null_ _null_ pg_reload_conf - _null_ ));
+DESCR("Reloads configuration files");


 /* Aggregates (moved here from pg_aggregate for 7.3) */
Index: src/include/utils/builtins.h
===================================================================
RCS file: /projects/cvsroot/pgsql/src/include/utils/builtins.h,v
retrieving revision 1.257
diff -u -r1.257 builtins.h
--- src/include/utils/builtins.h    27 May 2005 00:57:49 -0000    1.257
+++ src/include/utils/builtins.h    1 Jun 2005 20:49:34 -0000
@@ -362,8 +362,10 @@
 extern Datum nonnullvalue(PG_FUNCTION_ARGS);
 extern Datum current_database(PG_FUNCTION_ARGS);
 extern Datum pg_cancel_backend(PG_FUNCTION_ARGS);
+extern Datum pg_terminate_backend(PG_FUNCTION_ARGS);
+extern Datum pg_reload_conf(PG_FUNCTION_ARGS);
 extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS);

 /* not_in.c */
 extern Datum int4notin(PG_FUNCTION_ARGS);
 extern Datum oidnotin(PG_FUNCTION_ARGS);
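
[Editor's sketch] The functions in the patch share one convention: send a signal with kill(), return 1 on success, and return 0 with a warning on failure. The standalone C below illustrates that convention outside the server. It is not code from the patch: signal_process and can_signal_self are hypothetical names, stderr stands in for ereport(WARNING), and the guard against non-positive pids is an added assumption (kill() treats such values as process-group targets).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send signo to pid. Returns 1 on success and 0 on failure, mirroring the
 * 1/0 return convention documented in the patch. Hypothetical helper.
 */
int
signal_process(pid_t pid, int signo)
{
    assert(signo >= 0);         /* negative signal numbers are never valid */

    /* Assumption: refuse pids <= 0, which kill() maps to process groups. */
    if (pid <= 0)
    {
        fprintf(stderr, "WARNING: refusing to signal pid %ld\n", (long) pid);
        return 0;
    }

    if (kill(pid, signo) != 0)
    {
        /* Stand-in for ereport(WARNING, ... %m) in the server. */
        fprintf(stderr, "WARNING: failed to send signal to %ld: %s\n",
                (long) pid, strerror(errno));
        return 0;
    }

    return 1;
}

/* Signal 0 delivers nothing; it only checks that the target can be signalled. */
int
can_signal_self(void)
{
    return signal_process(getpid(), 0);
}
```

Calling signal_process(postmaster_pid, SIGHUP) would correspond to what pg_reload_conf does, where postmaster_pid is whatever pid the caller has obtained for the postmaster.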

Re: Server instrumentation: pg_terminate_backend, pg_reload_conf

From
Bruce Momjian
Date:
Andreas Pflug wrote:
> This patch reenables pg_terminate_backend, allowing (superuser only, of
> course) to terminate a backend. As taken from the discussion some weeks
> earlier, SIGTERM seems to be used quite widely, without a report of
> misbehavior so while the code path is officially not too well tested,
> in practice it's working ok and helpful.

I thought we had a discussion that the places we accept SIGTERM might be
places that can exit if the postmaster is shutting down, but might not
be places we can exit if the postmaster continues running, e.g. holding
locks.  Have you checked all the places we honor SIGTERM to check that
we are safe to exit?  I know Tom had concerns about that.

Looking at ProcessInterrupts() and friends, when it is called with
QueryCancelPending set, it does elog(ERROR) and longjmps out of elog, and
that cleans up some stuff.  The problem with SIGTERM/ProcDiePending is
that it just does a FATAL and I assume doesn't do the same cleanups that
elog(ERROR) does to cancel a query.
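
[Editor's sketch] The ERROR path described above relies on a long jump back to a recovery point where cleanup runs. The toy C model below shows only that control flow. It is not PostgreSQL code: the lock counter, execute_with_recovery and the other names are hypothetical, and it makes no claim about what the real FATAL path does or does not clean up.

```c
#include <assert.h>     /* assert(): used by the usage snippet below */
#include <setjmp.h>

static jmp_buf recovery_point;  /* where an error jumps back to */
static int locks_held = 0;      /* toy stand-in for held resources */

static void acquire_lock(void)      { locks_held++; }
static void release_all_locks(void) { locks_held = 0; }

/* Toy stand-in for raising an error: jump to the recovery point. */
static void raise_error(void)
{
    longjmp(recovery_point, 1);
}

static void run_query(int cancel_requested)
{
    acquire_lock();
    if (cancel_requested)
        raise_error();          /* does not return; the lock is still held */
    locks_held--;               /* normal completion releases its own lock */
}

/* Returns 1 if the query completed, 0 if it was cancelled and cleaned up. */
int execute_with_recovery(int cancel_requested)
{
    if (setjmp(recovery_point) != 0)
    {
        /* Recovery code: reached only via longjmp; reset everything. */
        release_all_locks();
        return 0;
    }
    run_query(cancel_requested);
    return 1;
}

int locks_currently_held(void)
{
    return locks_held;
}
```

A caller could check the model with assert(execute_with_recovery(1) == 0) followed by assert(locks_currently_held() == 0): the cancelled query leaves nothing held because the recovery code ran.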

Ideally we would use another signal number, that would do a query
cancel, then up in the recovery code after the longjump, after we had
reset everything, we could then exit.  The problem, I think, is that we
don't have another signal available for use.  I see this in postgres.c:

    pqsignal(SIGHUP, SigHupHandler);    /* set flag to read config file */
    pqsignal(SIGINT, StatementCancelHandler);   /* cancel current query */
    pqsignal(SIGTERM, die);     /* cancel current query and exit */
    pqsignal(SIGQUIT, quickdie);    /* hard crash time */
    pqsignal(SIGALRM, handle_sig_alarm);        /* timeout conditions */

    /*
     * Ignore failure to write to frontend. Note: if frontend closes
     * connection, we will notice it and exit cleanly when control next
     * returns to outer loop.  This seems safer than forcing exit in the
     * midst of output during who-knows-what operation...
     */
    pqsignal(SIGPIPE, SIG_IGN);
    pqsignal(SIGUSR1, CatchupInterruptHandler);
    pqsignal(SIGUSR2, NotifyInterruptHandler);
    pqsignal(SIGFPE, FloatExceptionHandler);

It would be neat if we could do a combined Cancel/Terminate signal, but
signals don't work that way.  Any ideas on how we can do a combined
cancel/terminate?  Do we have a shared area that both the postmaster and
the backends can see?  Could we set a flag when the postmaster is
shutting down and then when a backend gets a SIGTERM, it could either
shut down right away or do the cancel and then shut down?  I don't think
we can do query cancel for server-wide backend shutdowns --- it should
be as quick as possible.
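
[Editor's sketch] The idea floated above (SIGTERM means "exit right away" during a server-wide shutdown, and "cancel, then exit" otherwise) can be modelled with signal handlers that only set flags, plus a check at a safe point. The C below is a minimal standalone sketch under stated assumptions, not a proposal for the backend: InterruptAction, next_action and set_server_shutting_down are hypothetical names, and a process-local variable stands in for the flag that would have to live in memory shared with the postmaster.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>     /* assert(): used by the usage snippet below */
#include <signal.h>
#include <stddef.h>

typedef enum
{
    ACTION_NONE,                /* nothing pending */
    ACTION_CANCEL,              /* cancel the current query only */
    ACTION_CANCEL_THEN_EXIT,    /* cancel, clean up, then exit */
    ACTION_EXIT_NOW             /* server-wide shutdown: exit quickly */
} InterruptAction;

static volatile sig_atomic_t cancel_pending = 0;
static volatile sig_atomic_t die_pending = 0;

/* Assumption: stand-in for a flag the postmaster would set in shared memory. */
static volatile sig_atomic_t server_shutting_down = 0;

/* Handlers do nothing except record that the signal arrived. */
static void on_sigint(int signo)  { (void) signo; cancel_pending = 1; }
static void on_sigterm(int signo) { (void) signo; die_pending = 1; }

/* Install both handlers; returns 0 on success, -1 on failure. */
int install_handlers(void)
{
    struct sigaction sa;

    sa.sa_flags = 0;
    sigemptyset(&sa.sa_mask);

    sa.sa_handler = on_sigint;
    if (sigaction(SIGINT, &sa, NULL) != 0)
        return -1;

    sa.sa_handler = on_sigterm;
    if (sigaction(SIGTERM, &sa, NULL) != 0)
        return -1;

    return 0;
}

void set_server_shutting_down(int value)
{
    server_shutting_down = value ? 1 : 0;
}

/* Called at a safe point: decide what to do and clear the pending flags. */
InterruptAction next_action(void)
{
    if (die_pending)
    {
        die_pending = 0;
        cancel_pending = 0;     /* termination supersedes a plain cancel */
        return server_shutting_down ? ACTION_EXIT_NOW
                                    : ACTION_CANCEL_THEN_EXIT;
    }
    if (cancel_pending)
    {
        cancel_pending = 0;
        return ACTION_CANCEL;
    }
    return ACTION_NONE;
}
```

For example, after install_handlers(), a raise(SIGTERM) followed by assert(next_action() == ACTION_CANCEL_THEN_EXIT) holds while the shutdown flag is clear; after set_server_shutting_down(1) the same signal yields ACTION_EXIT_NOW.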

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: Server instrumentation: pg_terminate_backend, pg_reload_conf

From
Andreas Pflug
Date:
Bruce Momjian wrote:
> Andreas Pflug wrote:
>
>>This patch reenables pg_terminate_backend, allowing (superuser only, of
>>course) to terminate a backend. As taken from the discussion some weeks
>>earlier, SIGTERM seems to be used quite widely, without a report of
>>misbehavior so while the code path is officially not too well tested,
>>in practice it's working ok and helpful.
>
>
> I thought we had a discussion that the places we accept SIGTERM might be
> places that can exit if the postmaster is shutting down, but might not
> be places we can exit if the postmaster continues running, e.g. holding
> locks.  Have you checked all the places we honor SIGTERM to check that
> we are safe to exit?  I know Tom had concerns about that.

My patch is purely to enable a supervisor to issue a SIGTERM using a
pgsql client, instead of doing it from a server command line. It's not
meant to fix the underlying problems.

Regards,
Andreas

Re: Server instrumentation: pg_terminate_backend, pg_reload_conf

From
Bruce Momjian
Date:
Andreas Pflug wrote:
> Bruce Momjian wrote:
> > Andreas Pflug wrote:
> >
> >>This patch reenables pg_terminate_backend, allowing (superuser only, of
> >>course) to terminate a backend. As taken from the discussion some weeks
> >>earlier, SIGTERM seems to be used quite widely, without a report of
> >>misbehavior so while the code path is officially not too well tested,
> >>in practice it's working ok and helpful.
> >
> >
> > I thought we had a discussion that the places we accept SIGTERM might be
> > places that can exit if the postmaster is shutting down, but might not
> > be places we can exit if the postmaster continues running, e.g. holding
> > locks.  Have you checked all the places we honor SIGTERM to check that
> > we are safe to exit?  I know Tom had concerns about that.
>
> My patch is purely to enable a supervisor to issue a SIGTERM using a
> pgsql client, instead of doing it from a server command line. It's not
> meant to fix the underlying problems.

We don't support sending SIGTERM from the server command line to
individual backends, so why add support for it in SQL?


Re: Server instrumentation: pg_terminate_backend, pg_reload_conf

From
Andreas Pflug
Date:
Bruce Momjian wrote:
> Andreas Pflug wrote:
>
>>Bruce Momjian wrote:
>>
>>>Andreas Pflug wrote:
>>>
>>>
>>>>This patch reenables pg_terminate_backend, allowing (superuser only, of
>>>>course) to terminate a backend. As taken from the discussion some weeks
>>>>earlier, SIGTERM seems to be used quite widely, without a report of
>>>>misbehavior so while the code path is officially not too well tested,
>>>>in practice it's working ok and helpful.
>>>
>>>
>>>I thought we had a discussion that the places we accept SIGTERM might be
>>>places that can exit if the postmaster is shutting down, but might not
>>>be places we can exit if the postmaster continues running, e.g. holding
>>>locks.  Have you checked all the places we honor SIGTERM to check that
>>>we are safe to exit?  I know Tom had concerns about that.
>>
>>My patch is purely to enable a supervisor to issue a SIGTERM using a
>>pgsql client, instead of doing it from a server command line. It's not
>>meant to fix the underlying problems.
>
>
> We don't support sending SIGTERM from the server command line to
> individual backends, so why add support for it in SQL?

I don't want to slip into a discussion of whether it's good to SIGTERM a
backend or not; it is in use. So drop it if you don't like clients to
have the same facilities as console users.

BTW, I got a lot of other instrumentation stuff pending, which I
originally wanted to post one by one to allow individual discussion but
I'm running out of time for feature freeze. Apparently I'll have to post
all at once.

Regards,
Andreas