Thread: Re: [COMMITTERS] pgsql: Add pg_terminate_backend() to allow terminating only a single session

Bruce Momjian wrote:
> Log Message:
> -----------
> Add pg_terminate_backend() to allow terminating only a single session.

I wonder if it's OK to grab an LWLock in a signal handler.


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Alvaro Herrera wrote:
> Bruce Momjian wrote:
> > Log Message:
> > -----------
> > Add pg_terminate_backend() to allow terminating only a single session.
> 
> I wonder if it's OK to grab an LWLock in a signal handler.

I am not in a signal handler there --- pg_terminate_backend() happens
for the person requesting the termination, not the terminated backend
that gets the signal.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


"Bruce Momjian" <bruce@momjian.us> writes:

>> > Log Message:
>> > -----------
>> > Add pg_terminate_backend() to allow terminating only a single session.

I'm interested in this because I was already looking for a solution to the
"out of signals" problem we have.

I think we could expand this by having a bunch of boolean flags, one each for
different conditions including the sinval processing conditions, interrupt,
info, and terminate. (Any more?)

The two things we would have to check to be sure of are:

1) Do we care about how many times events are processed? Ie, if you press
interrupt twice is it important that that be handled as an interrupt twice? It
doesn't work that way currently for interrupt but are any of the other
conditions sensitive to this? I don't think so.

2) Do we care what order things happen in? Ie, if you send an info request and
then a cancel request, is it ok if the cancel is handled first? I don't see why
not myself. And if it's a terminate request we *clearly* don't want to bother
handling any other events first.

It seems to me we could replace all of the above with either SIGINT or USR1
and have a bunch of boolean flags in MyProc. I'm not sure of the implication
for sinval processing of having to get a whole bunch of LWLocks though.
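
Just to sketch what I'm picturing (every name here -- ProcSignalReason,
pendingFlags, HandlePendingSignals -- is invented for illustration, not
proposed code):

    #include <signal.h>
    #include "storage/proc.h"

    /* one flag per condition we currently burn a whole signal on */
    typedef enum ProcSignalReason
    {
        PROC_SIGNAL_SINVAL,             /* sinval processing */
        PROC_SIGNAL_INTERRUPT,          /* query cancel */
        PROC_SIGNAL_INFO,               /* info request */
        PROC_SIGNAL_TERMINATE,          /* session terminate */
        NUM_PROC_SIGNAL_REASONS
    } ProcSignalReason;

    /* sender: set the per-reason flag, then send the one real signal */
    static void
    SendProcSignal(PGPROC *proc, ProcSignalReason reason)
    {
        proc->pendingFlags[reason] = true;  /* assumed bool array in PGPROC */
        kill(proc->pid, SIGUSR1);
    }

    /* receiver, from the lone SIGUSR1 handler or CHECK_FOR_INTERRUPTS */
    static void
    HandlePendingSignals(void)
    {
        int     reason;

        for (reason = 0; reason < NUM_PROC_SIGNAL_REASONS; reason++)
        {
            if (MyProc->pendingFlags[reason])
            {
                MyProc->pendingFlags[reason] = false;   /* clear before acting */
                /* ... dispatch to the routine for this reason ... */
            }
        }
    }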

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning


pg_terminate_backend() idea

From: Bruce Momjian

Gregory Stark wrote:
> >> > Add pg_terminate_backend() to allow terminating only a single session.
> 
> I'm interested in this because I was already looking for a solution to the
> "out of signals" problem we have.
> 
> I think we could expand this by having a bunch of boolean flags, one each for
> different conditions including the sinval processing conditions, interrupt,
> info, and terminate. (Any more?)
> 
> The two things we would have to check to be sure of are:
> 
> 1) Do we care about how many times events are processed? Ie, if you press
> interrupt twice is it important that that be handled as an interrupt twice? It
> doesn't work that way currently for interrupt but are any of the other
> conditions sensitive to this? I don't think so.

Not that I can think of.

> 2) Do we care what order things happen in? Ie, if you send an info request and
> then a cancel request, is it ok if the cancel is handled first? I don't see why
> not myself. And if it's a terminate request we *clearly* don't want to bother
> handling any other events first.

I don't think we care.

> It seems to me we could replace all of the above with either SIGINT or USR1
> and have a bunch of boolean flags in MyProc. I'm not sure of the implication
> for sinval processing of having to get a whole bunch of LWLocks though.

Keep in mind PGPROC->terminate was easier because once it was set it was
never cleared.  If you need to clear the flag and keep going, the code is
going to need to be a little more sophisticated to avoid lost interrupts.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Bruce Momjian <bruce@momjian.us> writes:
> Alvaro Herrera wrote:
>> I wonder if it's OK to grab an LWLock in a signal handler.

> I am not in a signal handler there --- pg_terminate_backend() happens
> for the person requesting the termination, not the terminated backend
> that gets the signal.

A more interesting question is what makes you think that taking
ProcArrayLock here has any value whatsoever.
        regards, tom lane


Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Alvaro Herrera wrote:
> >> I wonder if it's OK to grab an LWLock in a signal handler.
> 
> > I am not in a signal handler there --- pg_terminate_backend() happens
> > for the person requesting the termination, not the terminated backend
> > that gets the signal.
> 
> A more interesting question is what makes you think that taking
> ProcArrayLock here has any value whatsoever.

I took the lock so I would be sure the PGPROC entry was for the matching
pid and not some other pid that was created between my check and the
setting of the variable.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> A more interesting question is what makes you think that taking
>> ProcArrayLock here has any value whatsoever.

> I took the lock so I would be sure the PGPROC entry was for the matching
> pid and not some other pid that was created between my check and the
> setting of the variable.

Unfortunately, that doesn't work at all, even disregarding the fact that
you forgot to make proc.c clear the flag when setting up a new process's
PGPROC.

The race condition would go away if ProcArrayRemove zeroed the pid
field, but I'm afraid that that might break something else; in
particular I think we still need to be able to manipulate LWLocks after
ProcArrayRemove, so I suspect this is not safe either.

I think the only really safe way to do it is to make a new procarray.c
function that searches like BackendPidGetProc, but hands the proc back
with the ProcArrayLock still held.  Or maybe we should just redefine
BackendPidGetProc that way.
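
The caller side would then look something like this (a sketch;
BackendPidGetProcWithLock is a made-up name for whichever variant we
settle on):

    static bool
    set_terminate_flag(int pid)
    {
        /* assumed to return with ProcArrayLock still held on success */
        volatile PGPROC *proc = BackendPidGetProcWithLock(pid);

        if (proc == NULL)
            return false;           /* no such backend */

        /* the pid can't be recycled while we still hold the lock */
        proc->terminate = true;
        LWLockRelease(ProcArrayLock);
        return true;
    }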

There are other race conditions in this patch too; notably that lots of
places feel free to clear QueryCancelPending on-the-fly, thus possibly
letting them disregard a terminate signal.

Even assuming that we can fix the garden-variety bugs like these,
there's still a fundamental problem that an uncooperative user function
(particularly one in plperl/pltcl/plpython) can indefinitely delay
response to pg_terminate_backend.  Maybe that's okay, seeing that it can
similarly hold off or disregard QueryCancel, but I'm not sure the people
asking for this are really gonna be satisfied.  (One thing we should
consider is making ERRCODE_ADMIN_SHUTDOWN unconditionally untrappable
by plpgsql exception blocks, which'd at least fix the issue for plpgsql
functions.)
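
(For plpgsql that could be little more than an early re-throw in the
exception-matching code; a sketch, not tested:

    /* before consulting the block's condition list at all */
    if (edata->sqlerrcode == ERRCODE_ADMIN_SHUTDOWN)
        ReThrowError(edata);        /* never trappable */

assuming we do it in exec_stmt_block where the error data is at hand.)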

I'm also thinking that this doesn't fix the problem that we're relying
on die() to not leave shared memory in a bad state.  All it does is
restrict things so that we only try that from the main loop rather than
at any random CHECK_FOR_INTERRUPTS.  That certainly reduces the odds of
hitting a bug of this type, but I don't see that it can be said to make
them zero.

All in all, this patch wasn't ready to apply without review.  I'm
currently looking to see whether it's salvageable or not.
        regards, tom lane


Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> A more interesting question is what makes you think that taking
> >> ProcArrayLock here has any value whatsoever.
> 
> > I took the lock so I would be sure the PGPROC entry was for the matching
> > pid and not some other pid that was created between my check and the
> > setting of the variable.
> 
> Unfortunately, that doesn't work at all, even disregarding the fact that
> you forgot to make proc.c clear the flag when setting up a new process's
> PGPROC.

I assumed the PGPROC was cleared with zeros on start.  I can fix that.

> The race condition would go away if ProcArrayRemove zeroed the pid
> field, but I'm afraid that that might break something else; in
> particular I think we still need to be able to manipulate LWLocks after
> ProcArrayRemove, so I suspect this is not safe either.

Oh, good point.  The PGPROC might be dead already but the pid is still
there.  Don't we mark it dead somehow?  Is it only because it isn't in
ProcArrayStruct that it is dead?

> I think the only really safe way to do it is to make a new procarray.c
> function that searches like BackendPidGetProc, but hands the proc back
> with the ProcArrayLock still held.  Or maybe we should just redefine
> BackendPidGetProc that way.

Yea, we could do that and it would clean up my code. 
BackendPidGetProc() doesn't get called in many places.  I was a little
worried about looping over ProcArrayStruct while holding an exclusive
lock.

> There are other race conditions in this patch too; notably that lots of
> places feel free to clear QueryCancelPending on-the-fly, thus possibly
> letting them disregard a terminate signal.

True.

> Even assuming that we can fix the garden-variety bugs like these,
> there's still a fundamental problem that an uncooperative user function
> (particularly one in plperl/pltcl/plpython) can indefinitely delay
> response to pg_terminate_backend.  Maybe that's okay, seeing that it can
> similarly hold off or disregard QueryCancel, but I'm not sure the people
> asking for this are really gonna be satisfied.  (One thing we should
> consider is making ERRCODE_ADMIN_SHUTDOWN unconditionally untrappable
> by plpgsql exception blocks, which'd at least fix the issue for plpgsql
> functions.)

They are going to be more satisfied than they are now.  This is the
first listed Admin TODO item.  Every other database allows this simple
capability and we should support it.  Cancel is not the same as exit.

> I'm also thinking that this doesn't fix the problem that we're relying
> on die() to not leave shared memory in a bad state.  All it does is
> restrict things so that we only try that from the main loop rather than
> at any random CHECK_FOR_INTERRUPTS.  That certainly reduces the odds of
> hitting a bug of this type, but I don't see that it can be said to make
> them zero.

Well, OK, can we use proc_exit() instead?  That seems fine too and
probably better than die().

> All in all, this patch wasn't ready to apply without review.  I'm
> currently looking to see whether it's salvageable or not.

OK.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


I wrote:
> All in all, this patch wasn't ready to apply without review.  I'm
> currently looking to see whether it's salvageable or not.

After further study, I've concluded that it is in fact not salvageable,
and I respectfully request that it be reverted.

I was able to get things to more or less work most of the time with
three or four additional ugly hacks in postgres.c, but I still don't
have any great confidence that there aren't windows where a terminate
request wouldn't be ignored (even without considering the uncooperative-
function scenario).  Moreover it's entirely unclear that this approach
actually dodges any of the hypothetical bugs in SIGTERM handling.
Given the complexity that it adds, I think it's fair to say that this
approach is probably buggier than the other way.

I think if we want pg_terminate_backend, we have to do it the way that
was originally discussed: have it issue SIGTERM and fix whatever needs
to be fixed in the SIGTERM code path.  As the TODO list used to say
before it got editorialized upon, this is mostly a matter of testing
(something that I can tell this patch was sadly lacking :-().
We do need also to go around and fix any places that think a PG_TRY
block is a sufficient way to clean up state in shared memory --- cf
this AFAIK-still-unfixed bug:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php
        regards, tom lane


pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> I wrote:
> > All in all, this patch wasn't ready to apply without review.  I'm
> > currently looking to see whether it's salvageable or not.
> 
> After further study, I've concluded that it is in fact not salvageable,
> and I respectfully request that it be reverted.

OK, reverted.

> I was able to get things to more or less work most of the time with
> three or four additional ugly hacks in postgres.c, but I still don't
> have any great confidence that there aren't windows where a terminate
> request wouldn't be ignored (even without considering the uncooperative-
> function scenario).  Moreover it's entirely unclear that this approach
> actually dodges any of the hypothetical bugs in SIGTERM handling.

I don't understand.  If we call proc_exit(0) instead, it is the same as
someone exiting the backend via PQfinish().

> Given the complexity that it adds, I think it's fair to say that this
> approach is probably buggier than the other way.

> I think if we want pg_terminate_backend, we have to do it the way that
> was originally discussed: have it issue SIGTERM and fix whatever needs
> to be fixed in the SIGTERM code path.  As the TODO list used to say
> before it got editorialized upon, this is mostly a matter of testing
> (something that I can tell this patch was sadly lacking :-().
> We do need also to go around and fix any places that think a PG_TRY
> block is a sufficient way to clean up state in shared memory --- cf
> this AFAIK-still-unfixed bug:
> http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php

Well, with no movement on this TODO item since it was removed in 8.0, I
am willing to give users something that works most of the time.  As you
said already, cancel isn't going to be 100% reliable anyway because of
places we can't check for cancel.

I will work on a new version of the patch that people can see.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Tom Lane

Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> I was able to get things to more or less work most of the time with
>> three or four additional ugly hacks in postgres.c, but I still don't
>> have any great confidence that there aren't windows where a terminate
>> request wouldn't be ignored (even without considering the uncooperative-
>> function scenario).  Moreover it's entirely unclear that this approach
>> actually dodges any of the hypothetical bugs in SIGTERM handling.

> I don't understand.  If we call proc_exit(0) instead, it is the same as
> someone exiting the backend via PQfinish().

Well, using proc_exit() instead of die() might have alleviated that
particular objection, but there are still the others.

>> I think if we want pg_terminate_backend, we have to do it the way that
>> was originally discussed: have it issue SIGTERM and fix whatever needs
>> to be fixed in the SIGTERM code path.

> Well, with no movement on this TODO item since it was removed in 8.0, I
> am willing to give users something that works most of the time.

If the users want it so bad, why has no one stepped up to do the
testing?
        regards, tom lane


Re: pg_terminate_backend() issues

From: Tom Lane

I did a quick grep for PG_CATCH uses to see what we have along the lines
of this bug:
http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php

In current sources there are three places at risk:

btbulkdelete, as noted in the above message
pg_start_backup    needs to reset forcePageWrites = false
createdb wants to UnlockSharedObject and delete any already-copied files

In createdb, the Unlock actually doesn't need to get cleaned up, since
transaction abort would release the lock anyway.  But leaving a possibly
large mess of orphaned files doesn't seem nice.

ISTM that there will be more cases like this in future, so we need a
general solution anyway.  I propose the following sort of code structure
for these situations:
    on_shmem_exit(cleanup_routine, arg);
    PG_TRY();
    {
        ... do something ...
        cancel_shmem_exit(cleanup_routine, arg);
    }
    PG_CATCH();
    {
        cancel_shmem_exit(cleanup_routine, arg);
        cleanup_routine(arg);
        PG_RE_THROW();
    }
    PG_END_TRY();

where cancel_shmem_exit is defined to pop the latest shmem_exit_list
entry if it matches the passed arguments (I don't see any particular
need to search further than that in the list).  This structure
guarantees that cleanup_routine will be called on the way out of
either a plain ERROR or a FATAL exit.
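
The pop-if-match part is only a few lines, assuming the existing
on_shmem_exit_list array and index in ipc.c; roughly:

    void
    cancel_shmem_exit(pg_on_exit_callback function, Datum arg)
    {
        if (on_shmem_exit_index > 0 &&
            on_shmem_exit_list[on_shmem_exit_index - 1].function == function &&
            on_shmem_exit_list[on_shmem_exit_index - 1].arg == arg)
            --on_shmem_exit_index;
    }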

Thoughts?  We clearly must do something about this before we can even
think of calling retail SIGTERM a supported feature ...
        regards, tom lane


Re: pg_terminate_backend() idea

From: Gregory Stark

"Bruce Momjian" <bruce@momjian.us> writes:

> Gregory Stark wrote:
>
>> It seems to me we could replace all of the above with either SIGINT or USR1
>> and have a bunch of boolean flags in MyProc. I'm not sure of the implication
>> for sinval processing of having to get a whole bunch of LWLocks though.
>
> Keep in mind PGPROC->terminate was easier because once it was set it was
> never cleared.  If you need to clear the flag and keep going, the code is
> going to need to be a little more sophisticated to avoid lost interrupts.

That's kind of the point. I don't think we care about lost interrupts for any
of these things. As long as we know a sinval flush happened I think we don't
care how many more happened. Or a cancel request or terminate request.

For sinval of course it would be important to clear the flags (with MyProc
locked) prior to processing the queue in case one arrives as soon as we're
finished. But I think that's the only detail.
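
That is, something along these lines, just to show the ordering (the
sinvalPending flag and the per-proc flagLock are assumptions for
illustration, not real PGPROC fields):

    static void
    ProcessSinvalIfPending(void)
    {
        for (;;)
        {
            bool    pending;

            SpinLockAcquire(&MyProc->flagLock);     /* hypothetical lock */
            pending = MyProc->sinvalPending;        /* hypothetical flag */
            MyProc->sinvalPending = false;          /* clear *before* processing */
            SpinLockRelease(&MyProc->flagLock);

            if (!pending)
                break;

            /*
             * Drain the sinval queue here.  A flush arriving at this
             * point re-sets the flag and is picked up on the next
             * iteration instead of being lost.
             */
        }
    }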

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> I did a quick grep for PG_CATCH uses to see what we have along the lines
> of this bug:
> http://archives.postgresql.org/pgsql-hackers/2007-04/msg00218.php
> 
> In current sources there are three places at risk:
> 
> btbulkdelete, as noted in the above message
> pg_start_backup    needs to reset forcePageWrites = false
> createdb wants to UnlockSharedObject and delete any already-copied files
> 
> In createdb, the Unlock actually doesn't need to get cleaned up, since
> transaction abort would release the lock anyway.  But leaving a possibly
> large mess of orphaned files doesn't seem nice.
> 
> ISTM that there will be more cases like this in future, so we need a
> general solution anyway.  I propose the following sort of code structure
> for these situations:
> 
>     on_shmem_exit(cleanup_routine, arg);
>     PG_TRY();
>     {
>         ... do something ...
>         cancel_shmem_exit(cleanup_routine, arg);
>     }
>     PG_CATCH();
>     {
>         cancel_shmem_exit(cleanup_routine, arg);
>         cleanup_routine(arg);
>         PG_RE_THROW();
>     }
>     PG_END_TRY();
> 
> where cancel_shmem_exit is defined to pop the latest shmem_exit_list
> entry if it matches the passed arguments (I don't see any particular
> need to search further than that in the list).  This structure
> guarantees that cleanup_routine will be called on the way out of
> either a plain ERROR or a FATAL exit.
> 
> Thoughts?  We clearly must do something about this before we can even
> think of calling retail SIGTERM a supported feature ...

Here is a little different idea.  Seems we want something more like:
    PG_ON_EXIT();
    {
        cleanup_routine(arg);
    }
>     PG_TRY();
>     {
>         ... do something ...
>     }
>     PG_CATCH();
>     {
>         PG_RE_THROW();
>     }
>     PG_END_TRY();

PG_ON_EXIT() would place cleanup_routine() into the shmem_exit_list and
it would be called at the end of PG_CATCH and removed from
shmem_exit_list by PG_END_TRY().  Another idea would be to define the
PG_CATCH() first so it would be put in the shmem_exit_list before the
TRY was tried, but I assume you don't want all those functions to run on
exit.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Tom Lane

Bruce Momjian <bruce@momjian.us> writes:
> Here is a little different idea.  Seems we want something more like:

Yeah, I thought about building a set of macros that would have this
ability folded in, but it didn't seem like much notational advantage
given that you have to write the cleanup routine out-of-line anyway.

Also, until we have a lot more than three use-sites, more macros
don't seem to be worth the trouble.
        regards, tom lane


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Here is a little different idea.  Seems we want something more like:
> 
> Yeah, I thought about building a set of macros that would have this
> ability folded in, but it didn't seem like much notational advantage
> given that you have to write the cleanup routine out-of-line anyway.

Yea, agreed.  I am a little worried about the out-of-line issue because
some cases might become more complex if you have to pass lots of
parameters.

> Also, until we have a lot more than three use-sites, more macros
> don't seem to be worth the trouble.

Oh, only three use sites.  I thought we had three known-wrong sites but
that more would have to be changed.  Yea, go for it!

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> >> I think if we want pg_terminate_backend, we have to do it the way that
> >> was originally discussed: have it issue SIGTERM and fix whatever needs
> >> to be fixed in the SIGTERM code path.
> 
> > Well, with no movement on this TODO item since it was removed in 8.0, I
> > am willing to give users something that works most of the time.
> 
> If the users want it so bad, why has no one stepped up to do the
> testing?

Good question.  Tom and I talked about this on the phone today.

I think the problem is testing to try to prove the lack of a bug.  How
long does someone test to know they have reasonably proven a bug doesn't
exist?  

I think the other problem is what to test.  One SIGTERM problem that was
reported was related to schema changes.  Certainly that has to be
tested, but what else:  prepared statements, queries, HOT updates,
connect/disconnect, notify?  There are lots of places to test and it is
hard to know which ones, and for how long.

I am starting to think we need to analyze the source code rather than
testing, because of what we are testing for.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> I was able to get things to more or less work most of the time with
> >> three or four additional ugly hacks in postgres.c, but I still don't
> >> have any great confidence that there aren't windows where a terminate
> >> request wouldn't be ignored (even without considering the uncooperative-
> >> function scenario).  Moreover it's entirely unclear that this approach
> >> actually dodges any of the hypothetical bugs in SIGTERM handling.
>
> > I don't understand.  If we call proc_exit(0) instead, it is the same as
> > someone exiting the backend via PQfinish().
>
> Well, using proc_exit() instead of die() might have alleviated that
> particular objection, but there are still the others.

Tom, thanks for your feedback.  It was very helpful.

I have adjusted the patch to return a locked PGPROC entry, initialize
the PGPROC->terminate boolean, call proc_exit(), and move the
PGPROC->terminate test out of the cancel loop entirely, so lost cancel
signals are not as big a problem anymore.

I agree it would be best for pg_terminate_backend() and for the
postmaster use of SIGTERM if SIGTERM were more reliable about cleanup, so
hopefully that will work for 8.4.  I have attached my patch in hopes we
can use it as a backup in case we can't get SIGTERM to work for
individual backends for 8.4.

[ I have posted about testing in a separate email.]

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/func.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v
retrieving revision 1.432
diff -c -c -r1.432 func.sgml
*** doc/src/sgml/func.sgml    15 Apr 2008 20:28:46 -0000    1.432
--- doc/src/sgml/func.sgml    16 Apr 2008 03:59:20 -0000
***************
*** 11849,11854 ****
--- 11849,11857 ----
      <primary>pg_cancel_backend</primary>
     </indexterm>
     <indexterm>
+     <primary>pg_terminate_backend</primary>
+    </indexterm>
+    <indexterm>
      <primary>pg_reload_conf</primary>
     </indexterm>
     <indexterm>
***************
*** 11885,11890 ****
--- 11888,11900 ----
        </row>
        <row>
         <entry>
+         <literal><function>pg_terminate_backend</function>(<parameter>pid</parameter> <type>int</>)</literal>
+         </entry>
+        <entry><type>boolean</type></entry>
+        <entry>Terminate a backend</entry>
+       </row>
+       <row>
+        <entry>
          <literal><function>pg_reload_conf</function>()</literal>
          </entry>
         <entry><type>boolean</type></entry>
***************
*** 11907,11915 ****
     </para>

     <para>
!     <function>pg_cancel_backend</> sends a query cancel
!     (<systemitem>SIGINT</>) signal to a backend process identified by
!     process ID.  The process ID of an active backend can be found from
      the <structfield>procpid</structfield> column in the
      <structname>pg_stat_activity</structname> view, or by listing the
      <command>postgres</command> processes on the server with
--- 11917,11926 ----
     </para>

     <para>
!     <function>pg_cancel_backend</> and <function>pg_terminate_backend</>
!     send a query cancel (<systemitem>SIGINT</>) signal to a backend process
!     identified by process ID.  The
!     process ID of an active backend can be found from
      the <structfield>procpid</structfield> column in the
      <structname>pg_stat_activity</structname> view, or by listing the
      <command>postgres</command> processes on the server with
Index: doc/src/sgml/runtime.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/runtime.sgml,v
retrieving revision 1.413
diff -c -c -r1.413 runtime.sgml
*** doc/src/sgml/runtime.sgml    15 Apr 2008 20:28:46 -0000    1.413
--- doc/src/sgml/runtime.sgml    16 Apr 2008 03:59:20 -0000
***************
*** 1372,1377 ****
--- 1372,1384 ----
      well.
     </para>
    </important>
+
+   <para>
+    To terminate a session while allowing other sessions to continue, use
+    <function>pg_terminate_backend()</> (<xref
+    linkend="functions-admin-signal-table">) rather than sending a signal
+    to the child process.
+   </para>
   </sect1>

   <sect1 id="preventing-server-spoofing">
Index: src/backend/access/transam/twophase.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/access/transam/twophase.c,v
retrieving revision 1.41
diff -c -c -r1.41 twophase.c
*** src/backend/access/transam/twophase.c    25 Mar 2008 22:42:42 -0000    1.41
--- src/backend/access/transam/twophase.c    16 Apr 2008 03:59:20 -0000
***************
*** 283,288 ****
--- 283,289 ----
      gxact->proc.databaseId = databaseid;
      gxact->proc.roleId = owner;
      gxact->proc.inCommit = false;
+     gxact->proc.terminate = false;
      gxact->proc.vacuumFlags = 0;
      gxact->proc.lwWaiting = false;
      gxact->proc.lwExclusive = false;
Index: src/backend/storage/ipc/procarray.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/ipc/procarray.c,v
retrieving revision 1.43
diff -c -c -r1.43 procarray.c
*** src/backend/storage/ipc/procarray.c    26 Mar 2008 18:48:59 -0000    1.43
--- src/backend/storage/ipc/procarray.c    16 Apr 2008 03:59:20 -0000
***************
*** 935,941 ****
   * answer to be used ...
   */
  PGPROC *
! BackendPidGetProc(int pid)
  {
      PGPROC       *result = NULL;
      ProcArrayStruct *arrayP = procArray;
--- 935,941 ----
   * answer to be used ...
   */
  PGPROC *
! BackendPidGetProc(int pid, bool want_shlock)
  {
      PGPROC       *result = NULL;
      ProcArrayStruct *arrayP = procArray;
***************
*** 957,964 ****
          }
      }

!     LWLockRelease(ProcArrayLock);
!
      return result;
  }

--- 957,966 ----
          }
      }

!     /* Do they want the proc returned with a shared lock? */
!     if (!result || !want_shlock)
!         LWLockRelease(ProcArrayLock);
!
      return result;
  }

***************
*** 1009,1015 ****
  bool
  IsBackendPid(int pid)
  {
!     return (BackendPidGetProc(pid) != NULL);
  }


--- 1011,1017 ----
  bool
  IsBackendPid(int pid)
  {
!     return (BackendPidGetProc(pid, false) != NULL);
  }


Index: src/backend/storage/lmgr/proc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/storage/lmgr/proc.c,v
retrieving revision 1.199
diff -c -c -r1.199 proc.c
*** src/backend/storage/lmgr/proc.c    26 Jan 2008 19:55:08 -0000    1.199
--- src/backend/storage/lmgr/proc.c    16 Apr 2008 03:59:20 -0000
***************
*** 291,296 ****
--- 291,297 ----
      MyProc->databaseId = InvalidOid;
      MyProc->roleId = InvalidOid;
      MyProc->inCommit = false;
+     MyProc->terminate = false;
      MyProc->vacuumFlags = 0;
      if (IsAutoVacuumWorkerProcess())
          MyProc->vacuumFlags |= PROC_IS_AUTOVACUUM;
***************
*** 430,435 ****
--- 431,437 ----
      MyProc->databaseId = InvalidOid;
      MyProc->roleId = InvalidOid;
      MyProc->inCommit = false;
+     MyProc->terminate = false;
      /* we don't set the "is autovacuum" flag in the launcher */
      MyProc->vacuumFlags = 0;
      MyProc->lwWaiting = false;
***************
*** 1275,1281 ****
  void
  ProcSendSignal(int pid)
  {
!     PGPROC       *proc = BackendPidGetProc(pid);

      if (proc != NULL)
          PGSemaphoreUnlock(&proc->sem);
--- 1277,1283 ----
  void
  ProcSendSignal(int pid)
  {
!     PGPROC       *proc = BackendPidGetProc(pid, false);

      if (proc != NULL)
          PGSemaphoreUnlock(&proc->sem);
Index: src/backend/tcop/postgres.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/tcop/postgres.c,v
retrieving revision 1.550
diff -c -c -r1.550 postgres.c
*** src/backend/tcop/postgres.c    15 Apr 2008 20:28:46 -0000    1.550
--- src/backend/tcop/postgres.c    16 Apr 2008 03:59:23 -0000
***************
*** 2541,2547 ****
           * waiting for input, however.
           */
          if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
!             CritSectionCount == 0 && !DoingCommandRead)
          {
              /* bump holdoff count to make ProcessInterrupts() a no-op */
              /* until we are done getting ready for it */
--- 2541,2548 ----
           * waiting for input, however.
           */
          if (ImmediateInterruptOK && InterruptHoldoffCount == 0 &&
!             CritSectionCount == 0 &&
!             (!DoingCommandRead || MyProc->terminate))
          {
              /* bump holdoff count to make ProcessInterrupts() a no-op */
              /* until we are done getting ready for it */
***************
*** 2621,2626 ****
--- 2622,2631 ----
              ereport(ERROR,
                      (errcode(ERRCODE_QUERY_CANCELED),
                       errmsg("canceling autovacuum task")));
+         else if (MyProc->terminate)
+             ereport(ERROR,
+                     (errcode(ERRCODE_ADMIN_SHUTDOWN),
+                      errmsg("terminating backend due to administrator command")));
          else
              ereport(ERROR,
                      (errcode(ERRCODE_QUERY_CANCELED),
***************
*** 3492,3497 ****
--- 3497,3505 ----

          initStringInfo(&input_message);

+         if (MyProc->terminate)
+             proc_exit(0);
+
          /*
           * (1) If we've reached idle state, tell the frontend we're ready for
           * a new query.
Index: src/backend/utils/adt/misc.c
===================================================================
RCS file: /cvsroot/pgsql/src/backend/utils/adt/misc.c,v
retrieving revision 1.61
diff -c -c -r1.61 misc.c
*** src/backend/utils/adt/misc.c    15 Apr 2008 20:28:46 -0000    1.61
--- src/backend/utils/adt/misc.c    16 Apr 2008 03:59:23 -0000
***************
*** 27,32 ****
--- 27,33 ----
  #include "postmaster/syslogger.h"
  #include "storage/fd.h"
  #include "storage/pmsignal.h"
+ #include "storage/proc.h"
  #include "storage/procarray.h"
  #include "utils/builtins.h"
  #include "tcop/tcopprot.h"
***************
*** 89,95 ****
   * Functions to send signals to other backends.
   */
  static bool
! pg_signal_backend(int pid, int sig)
  {
      if (!superuser())
          ereport(ERROR,
--- 90,96 ----
   * Functions to send signals to other backends.
   */
  static bool
! pg_signal_check(int pid)
  {
      if (!superuser())
          ereport(ERROR,
***************
*** 106,112 ****
--- 107,122 ----
                  (errmsg("PID %d is not a PostgreSQL server process", pid)));
          return false;
      }
+     else
+         return true;
+ }

+ /*
+  * Functions to send signals to other backends.
+  */
+ static bool
+ pg_signal_backend(int pid, int sig)
+ {
      /* If we have setsid(), signal the backend's whole process group */
  #ifdef HAVE_SETSID
      if (kill(-pid, sig))
***************
*** 125,131 ****
  Datum
  pg_cancel_backend(PG_FUNCTION_ARGS)
  {
!     PG_RETURN_BOOL(pg_signal_backend(PG_GETARG_INT32(0), SIGINT));
  }

  Datum
--- 135,168 ----
  Datum
  pg_cancel_backend(PG_FUNCTION_ARGS)
  {
!     int pid = PG_GETARG_INT32(0);
!
!     if (pg_signal_check(pid))
!         PG_RETURN_BOOL(pg_signal_backend(pid, SIGINT));
!     else
!         PG_RETURN_BOOL(false);
! }
!
! /*
!  *    To cleanly terminate a backend, we set PGPROC(pid)->terminate
!  *    then send a cancel signal.  SIGTERM isn't 100% safe for
!  *	cases where other backends will continue to run.
!  */
! Datum
! pg_terminate_backend(PG_FUNCTION_ARGS)
! {
!     int pid = PG_GETARG_INT32(0);
!     volatile PGPROC *term_proc;
!
!     /* Is this the super-user, and can we find the PGPROC entry for the pid? */
!     if (pg_signal_check(pid) && (term_proc = BackendPidGetProc(pid, true)) != NULL)
!     {
!         term_proc->terminate = true;
!         LWLockRelease(ProcArrayLock);
!         PG_RETURN_BOOL(pg_signal_backend(pid, SIGINT));
!     }
!
!     PG_RETURN_BOOL(false);
  }

  Datum
***************
*** 169,185 ****
      PG_RETURN_BOOL(true);
  }

- #ifdef NOT_USED
-
- /* Disabled in 8.0 due to reliability concerns; FIXME someday */
- Datum
- pg_terminate_backend(PG_FUNCTION_ARGS)
- {
-     PG_RETURN_INT32(pg_signal_backend(PG_GETARG_INT32(0), SIGTERM));
- }
- #endif
-
-
  /* Function to find out which databases make use of a tablespace */

  typedef struct
--- 206,211 ----
Index: src/include/catalog/pg_proc.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/catalog/pg_proc.h,v
retrieving revision 1.491
diff -c -c -r1.491 pg_proc.h
*** src/include/catalog/pg_proc.h    15 Apr 2008 20:28:46 -0000    1.491
--- src/include/catalog/pg_proc.h    16 Apr 2008 03:59:23 -0000
***************
*** 3157,3162 ****
--- 3157,3164 ----

  DATA(insert OID = 2171 ( pg_cancel_backend        PGNSP PGUID 12 1 0 f f t f v 1 16 "23" _null_ _null_ _null_ pg_cancel_backend - _null_ _null_ ));
  DESCR("cancel a server process' current query");
+ DATA(insert OID = 2096 ( pg_terminate_backend        PGNSP PGUID 12 1 0 f f t f v 1 16 "23" _null_ _null_ _null_ pg_terminate_backend - _null_ _null_ ));
+ DESCR("terminate a server process");
  DATA(insert OID = 2172 ( pg_start_backup        PGNSP PGUID 12 1 0 f f t f v 1 25 "25" _null_ _null_ _null_ pg_start_backup - _null_ _null_ ));
  DESCR("prepare for taking an online backup");
  DATA(insert OID = 2173 ( pg_stop_backup            PGNSP PGUID 12 1 0 f f t f v 0 25 "" _null_ _null_ _null_ pg_stop_backup - _null_ _null_ ));
Index: src/include/storage/proc.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/storage/proc.h,v
retrieving revision 1.106
diff -c -c -r1.106 proc.h
*** src/include/storage/proc.h    15 Apr 2008 20:28:47 -0000    1.106
--- src/include/storage/proc.h    16 Apr 2008 03:59:25 -0000
***************
*** 91,96 ****
--- 91,98 ----

      bool        inCommit;        /* true if within commit critical section */

+     bool        terminate;        /* admin requested termination */
+
      uint8        vacuumFlags;    /* vacuum-related flags, see above */

      /* Info about LWLock the process is currently waiting for, if any. */
Index: src/include/storage/procarray.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/storage/procarray.h,v
retrieving revision 1.21
diff -c -c -r1.21 procarray.h
*** src/include/storage/procarray.h    26 Mar 2008 16:20:48 -0000    1.21
--- src/include/storage/procarray.h    16 Apr 2008 03:59:25 -0000
***************
*** 35,41 ****
  extern int    GetTransactionsInCommit(TransactionId **xids_p);
  extern bool HaveTransactionsInCommit(TransactionId *xids, int nxids);

! extern PGPROC *BackendPidGetProc(int pid);
  extern int    BackendXidGetPid(TransactionId xid);
  extern bool IsBackendPid(int pid);

--- 35,41 ----
  extern int    GetTransactionsInCommit(TransactionId **xids_p);
  extern bool HaveTransactionsInCommit(TransactionId *xids, int nxids);

! extern PGPROC *BackendPidGetProc(int pid, bool shlock);
  extern int    BackendXidGetPid(TransactionId xid);
  extern bool IsBackendPid(int pid);

Index: src/include/utils/builtins.h
===================================================================
RCS file: /cvsroot/pgsql/src/include/utils/builtins.h,v
retrieving revision 1.314
diff -c -c -r1.314 builtins.h
*** src/include/utils/builtins.h    15 Apr 2008 20:28:47 -0000    1.314
--- src/include/utils/builtins.h    16 Apr 2008 03:59:25 -0000
***************
*** 416,421 ****
--- 416,422 ----
  extern Datum current_database(PG_FUNCTION_ARGS);
  extern Datum current_query(PG_FUNCTION_ARGS);
  extern Datum pg_cancel_backend(PG_FUNCTION_ARGS);
+ extern Datum pg_terminate_backend(PG_FUNCTION_ARGS);
  extern Datum pg_reload_conf(PG_FUNCTION_ARGS);
  extern Datum pg_tablespace_databases(PG_FUNCTION_ARGS);
  extern Datum pg_rotate_logfile(PG_FUNCTION_ARGS);

Re: pg_terminate_backend() issues

From: Tom Lane

Bruce Momjian <bruce@momjian.us> writes:
> I am starting to think we need to analyze the source code rather than
> testing, because of what we are testing for.

I was thinking about that a bit more, in connection with trying to
verbalize why I don't like your patch ;-)

The fundamental assumption here is that we think that proc_exit() from
the main loop of PostgresMain() should be safe, because that's exercised
anytime a client quits or dies unexpectedly, and so it should be a
well-tested path.  What is lacking such test coverage is the many code
paths that start proc_exit() from any random CHECK_FOR_INTERRUPTS()
point in the backend.  Some of those points might be (in fact are known
to be, as of today) holding locks or otherwise in a state that prevents
a clean proc_exit().

Your patch proposes to replace that code path by one that fakes a
QueryCancel error and then does proc_exit() once control reaches the
outer level.  While that certainly *should* work (ignoring the question
of whether control gets to the outer level in any reasonable amount of
time), it's a bit of a stretch to claim that it's a well-tested case.
At the moment it would only arise if a QueryCancel interrupt arrived at
approximately the same time that the client died or sent a disconnect
message, which is surely far less probable than client death by itself.
So I don't buy the argument that this is a better-tested path, and
I definitely don't want to introduce new complexity in order to make it
happen like that.

Now, as to whether a SIGTERM-style exit is safe.  With the exception
of the PG_CATCH issues that we already know about, any other case seems
like a clear coding error: you'd have to be doing something that you
know could throw an error, without having made provision to clean up
whatever unclean state you are in.  It'd be nice to think that we
haven't been that silly anywhere.  Experience however teaches that such
an assumption is hubris.

I think that the way forward is to fix the PG_CATCH issues (which I will
try to get done later this week) and then get someone to mount a serious
effort to break it, as outlined in previous discussions:
http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php

I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
if there is some reasonable amount of testing done during this
development cycle to try to expose any problems.  If no one is willing
to do any such testing, then as far as I'm concerned there is no market
for the feature and we should drop it from TODO.
        regards, tom lane


Re: pg_terminate_backend() issues

From: Magnus Hagander

Bruce Momjian wrote:
> Tom Lane wrote:
> > Bruce Momjian <bruce@momjian.us> writes:
> > >> I think if we want pg_terminate_backend, we have to do it the
> > >> way that was originally discussed: have it issue SIGTERM and fix
> > >> whatever needs to be fixed in the SIGTERM code path.
> > 
> > > Well, with no movement on this TODO item since it was removed in
> > > 8.0, I am willing to give users something that works most of the
> > > time.
> > 
> > If the users want it so bad, why has no one stepped up to do the
> > testing?
> 
> Good question.  Tom and I talked about this on the phone today.
> 
> I think the problem is testing to try to prove the lack of a bug.  How
> long does someone test to know they have reasonably proven a bug
> doesn't exist?  

Right. I think we have *a lot* of users that regularly use SIGTERM to
kill the backends, because they have no idea it's dangerous. They just
use "ps" to find the backend and "kill" to see it go away.

If we had a *big* problem with it, we'd hear about it. So I doubt we
have. But it's quite possible that we have a narrow problem that can
cause trouble in some cases - but I expect most people who run this
stuff without knowing it's dangerous aren't running the world's largest
and most loaded databases...

//Magnus


Re: pg_terminate_backend() issues

From: Magnus Hagander

Tom Lane wrote:
> I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
> if there is some reasonable amount of testing done during this
> development cycle to try to expose any problems.  If no one is willing
> to do any such testing, then as far as I'm concerned there is no
> market for the feature and we should drop it from TODO.

If someone can come up with an automated script to do this kind of
testing, I can commit a VM or three to running this 24/7 for a month,
easily... But I don't trust myself to come up with a test-case that's
good enough :-P

//Magnus


Re: pg_terminate_backend() issues

From: Gregory Stark

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> ISTM that there will be more cases like this in future, so we need a
> general solution anyway.  I propose the following sort of code structure
> for these situations:
>
>     on_shmem_exit(cleanup_routine, arg);
>     PG_TRY();
>     {
>         ... do something ...
>         cancel_shmem_exit(cleanup_routine, arg);
>     }
>     PG_CATCH();
>     {
>         cancel_shmem_exit(cleanup_routine, arg);
>         cleanup_routine(arg);
>         PG_RE_THROW();
>     }
>     PG_END_TRY();

[We would also have to block SIGTERM around the second cancel_shmem_exit and
cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
run them in the reverse order.]

This seems exceptionally fragile. Someone's bound to add another case without
realizing it. I thought about doing something like having a flag
in_unprotected_pg_try which PG_TRY would set to true, and on_shmem_exit would
clear. Then any LWLock or other shmem operation could assert it's cleared.

But that doesn't work for nested PG_TRY blocks. Possibly we could get it to
work by keeping a pair of counters but I'm not sure that's sufficient.

Are all the known cases LWLocks? Is there a reason we couldn't just have a
generic cleanup routine which just releases all LWLocks we hold before
exiting?

Or weakly -- an assert in CHECK_FOR_INTERRUPTS that there are no LWLocks held
or if there are that there's a cleanup routine in place. We wouldn't know for
sure that it's the *right* cleanup routine or enough cleanup routines but
hopefully we would catch the error often enough to notice.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!


Re: pg_terminate_backend() issues

From: Gregory Stark

"Gregory Stark" <stark@enterprisedb.com> writes:


> Or weakly -- an assert in CHECK_FOR_INTERRUPTS 

oops, that's irrelevant of course. 


--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's 24x7 Postgres support!


Re: pg_terminate_backend() issues

From: Tom Lane

Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> ISTM that there will be more cases like this in future, so we need a
>> general solution anyway.  I propose the following sort of code structure
>> for these situations:

> [We would also have to block SIGTERM around the second cancel_shmem_exit and
> cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
> run them in the reverse order.]

No, we wouldn't, because a SIGTERM can only actually fire at a
CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
one in the cleanup code.

> Are all the known cases LWLocks?

*None* of the known cases are LWLocks, nor any other resource that we
have generic cleanup code for.  The problem cases are one-off resources
that it seemed we could avoid having a real cleanup mechanism for.
        regards, tom lane


Re: pg_terminate_backend() issues

From: Gregory Stark

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

>> [We would also have to block SIGTERM around the second cancel_shmem_exit and
>> cleanup_routine, no? Or if it's idempotent (actually, wouldn't it have to be?)
>> run them in the reverse order.]
>
> No, we wouldn't, because a SIGTERM can only actually fire at a
> CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
> one in the cleanup code.

Wait, huh? In that case I don't see what advantage any of this has over
Bruce's patch. And his approach seemed a lot more robust.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's On-Demand Production Tuning


Re: pg_terminate_backend() issues

From: Tom Lane

Gregory Stark <stark@enterprisedb.com> writes:
> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>> No, we wouldn't, because a SIGTERM can only actually fire at a
>> CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
>> one in the cleanup code.

> Wait, huh? In that case I don't see what advantage any of this has over
> Bruce's patch. And his approach seemed a lot more robust.

Maybe I missed something, but I thought he was just proposing some
macro syntactic sugar over the same code that I described.

I'm not against syntactic sugar, but it doesn't seem to be totally
clean; I think you would need two repetitions of the function and
arg anyway:
    PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg);
    {
        ... risky code here ...
    }
    PG_END_ENSURE_ERROR_CLEANUP(cleanup_function, arg);

because both the "before" and "after" parts need to know the function
name and the arg to pass it.  The only way to avoid duplication would be
if the whole controlled code block was actually an argument to the
macro:
    PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg,
        ... risky code here ...);

but there are very good reasons not to do that when the controlled
code can be large; for instance that #if can't be used portably
in a macro argument.
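
For the record, the two-repetition form could be spelled about like
this (a sketch, not committed code):

    #define PG_ENSURE_ERROR_CLEANUP(cleanup_function, arg)  \
        do { \
            on_shmem_exit(cleanup_function, arg); \
            PG_TRY()

    #define PG_END_ENSURE_ERROR_CLEANUP(cleanup_function, arg)  \
            cancel_shmem_exit(cleanup_function, arg); \
            PG_CATCH(); \
            { \
                cancel_shmem_exit(cleanup_function, arg); \
                cleanup_function(arg); \
                PG_RE_THROW(); \
            } \
            PG_END_TRY(); \
        } while (0)

with the risky code sitting in braces between the two macros, as in the
example above.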
        regards, tom lane


Re: pg_terminate_backend() issues

From: Gregory Stark

"Tom Lane" <tgl@sss.pgh.pa.us> writes:

> Gregory Stark <stark@enterprisedb.com> writes:
>> "Tom Lane" <tgl@sss.pgh.pa.us> writes:
>>> No, we wouldn't, because a SIGTERM can only actually fire at a
>>> CHECK_FOR_INTERRUPTS() call.  You'd just need to be sure there wasn't
>>> one in the cleanup code.
>
>> Wait, huh? In that case I don't see what advantage any of this has over
>> Bruce's patch. And his approach seemed a lot more robust.
>
> Maybe I missed something, but I thought he was just proposing some
> macro syntactic sugar over the same code that I described.

No, I meant the earlier patch which you rejected with the flag in MyProc. I
realize there were other issues but the initial concern was that it wouldn't
respond promptly because it would wait for CHECK_FOR_INTERRUPTS. But if
SIGTERM was never handled except at a CHECK_FOR_INTERRUPTS then that was never
a factor.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com
  Ask me about EnterpriseDB's Slony Replication support!


Re: pg_terminate_backend() issues

From: Tom Lane

Gregory Stark <stark@enterprisedb.com> writes:
> No, I meant the earlier patch which you rejected with the flag in MyProc. I
> realize there were other issues but the initial concern was that it wouldn't
> respond promptly because it would wait for CHECK_FOR_INTERRUPTS.

No, that's not the concern in the least.  The concern is that something
could trap the attempted throwing of the error *after*
CHECK_FOR_INTERRUPTS has noticed the signal.
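
That is, the pattern to worry about is simply (a sketch):

    PG_TRY();
    {
        /* CHECK_FOR_INTERRUPTS() in here throws the terminate error ... */
    }
    PG_CATCH();
    {
        /* ... and this swallows it, so the backend never exits */
        FlushErrorState();
    }
    PG_END_TRY();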
        regards, tom lane


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > I am starting to think we need to analyze the source code rather than
> > testing, because of what we are testing for.
> 
> I was thinking about that a bit more, in connection with trying to
> verbalize why I don't like your patch ;-)
> 
> The fundamental assumption here is that we think that proc_exit() from
> the main loop of PostgresMain() should be safe, because that's exercised
> anytime a client quits or dies unexpectedly, and so it should be a
> well-tested path.  What is lacking such test coverage is the many code
> paths that start proc_exit() from any random CHECK_FOR_INTERRUPTS()
> point in the backend.  Some of those points might be (in fact are known
> to be, as of today) holding locks or otherwise in a state that prevents
> a clean proc_exit().
> 
> Your patch proposes to replace that code path by one that fakes a
> QueryCancel error and then does proc_exit() once control reaches the
> outer level.  While that certainly *should* work (ignoring the question
> of whether control gets to the outer level in any reasonable amount of
> time), it's a bit of a stretch to claim that it's a well-tested case.
> At the moment it would only arise if a QueryCancel interrupt arrived at
> approximately the same time that the client died or sent a disconnect
> message, which is surely far less probable than client death by itself.
> So I don't buy the argument that this is a better-tested path, and
> I definitely don't want to introduce new complexity in order to make it
> happen like that.

I would like my use of SIGINT to be like someone pressing ^C and then \q
in psql, but I can see that the SIGINT might be delivered in places
where libpq would not deliver it, like during shutdown.  Hopefully that
could be addressed, but I agree using SIGTERM is better and more useful.

> Now, as to whether a SIGTERM-style exit is safe.  With the exception
> of the PG_CATCH issues that we already know about, any other case seems
> like a clear coding error: you'd have to be doing something that you
> know could throw an error, without having made provision to clean up
> whatever unclean state you are in.  It'd be nice to think that we
> haven't been that silly anywhere.  Experience however teaches that such
> an assumption is hubris.
> 
> I think that the way forward is to fix the PG_CATCH issues (which I will
> try to get done later this week) and then get someone to mount a serious
> effort to break it, as outlined in previous discussions:
> http://archives.postgresql.org/pgsql-hackers/2006-08/msg00174.php

I was thinking of running two parallel regression tests and killing one
at random and seeing if the others complete.  One problem is that the
regression tests issue a lot of server error messages so somehow I would
need to filter out the normal errors, which I think I could do.

Magnus has offered to test too.

> I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
> if there is some reasonable amount of testing done during this
> development cycle to try to expose any problems.  If no one is willing
> to do any such testing, then as far as I'm concerned there is no market
> for the feature and we should drop it from TODO.

Agreed, but the feature is going to be asked for again soon, so we would
probably have to put it on the list of features we don't want, and that
is going to look pretty odd.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Tom Lane

Magnus Hagander <magnus@hagander.net> writes:
> Tom Lane wrote:
>> I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
>> if there is some reasonable amount of testing done during this
>> development cycle to try to expose any problems.

> If someone can come up with an automated script to do this kind of
> testing, I can commit a VM or three to running this 24/7 for a month,
> easily... But I don't trust myself to come up with a test-case that's
> good enough :-P

The closest thing I can think of to an automated test is to run repeated
sets of the parallel regression tests, and each time SIGTERM a randomly
chosen backend at a randomly chosen time.  Then see if anything "funny"
happens.  The hard part here is distinguishing expected from unexpected
regression outputs, especially in view of the fact that some of the
tests depend on database contents set up by earlier tests.

I'm thinking that you could automatically discard the regression diff
for the specific test that got SIGTERM'd, as long as it looked like
the normal output up to the point where the "terminated by
administrator" error appears.  Then what you'd have is the potential for
downstream failures due to things not being created, which *should* fall
into a fairly stylized set of possible diffs.  So get the script to
throw away any diffs that exactly match ones seen previously.  Run it
for awhile, and then hand-validate the set of diffs that it's saved
... or if any of 'em look funny, report.

One gotcha I can think of is that killing the prepared_xacts test
can leave you with open 2PC transactions, which will interfere with
starting the next cycle of the tests (you have to kill them before you
can dropdb).  But you could add a "rollback prepared" to the driver
script to clean out any uncommitted prepared xact.

Whether this is workable or not depends on the size of the set of
"expected" downstream-failure diffs.  My gut feeling from many years of
watching regression test crashes is that it'd be large but not
completely impractical to look through by hand.

I haven't time to write something like that myself, but offhand it seems
like it could be done without more than a day or so's work, especially
if you start from the buildfarm infrastructure.

BTW, don't forget to include autovac workers in the set of SIGTERM
target candidates.
        regards, tom lane


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Magnus Hagander <magnus@hagander.net> writes:
> > Tom Lane wrote:
> >> I'm willing to enable a SIGTERM-based pg_terminate_backend for 8.4
> >> if there is some reasonable amount of testing done during this
> >> development cycle to try to expose any problems.
> 
> > If someone can come up with an automated script to do this kind of
> > testing, I can commit a VM or three to running this 24/7 for a month,
> > easily... But I don't trust myself to come up with a test-case that's
> > good enough :-P
> 
> The closest thing I can think of to an automated test is to run repeated
> sets of the parallel regression tests, and each time SIGTERM a randomly
> chosen backend at a randomly chosen time.  Then see if anything "funny"

Yep, that was my plan, plus with the parallel regression tests you get
the possibility of >2 backends.

> happens.  The hard part here is distinguishing expected from unexpected
> regression outputs, especially in view of the fact that some of the
> tests depend on database contents set up by earlier tests.

Oh, that is a problem.  I was going to get the typical errors, sort them
and put them in a file.  Then when I run the test, I sort the log again
and use diff | grep '<' to see any differences.  If there are cases that
generate new errors because of a previous failure, I will have to add that
output too.

> I'm thinking that you could automatically discard the regression diff
> for the specific test that got SIGTERM'd, as long as it looked like
> the normal output up to the point where the "terminated by
> administrator" error appears.  Then what you'd have is the potential for
> downstream failures due to things not being created, which *should* fall
> into a fairly stylized set of possible diffs.  So get the script to
> throw away any diffs that exactly match ones seen previously.  Run it
> for awhile, and then hand-validate the set of diffs that it's saved
> ... or if any of 'em look funny, report.

Yep, see above.

> One gotcha I can think of is that killing the prepared_xacts test
> can leave you with open 2PC transactions, which will interfere with
> starting the next cycle of the tests (you have to kill them before you
> can dropdb).  But you could add a "rollback prepared" to the driver
> script to clean out any uncommitted prepared xact.

Good idea.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Tom Lane

Bruce Momjian <bruce@momjian.us> writes:
> Tom Lane wrote:
>> The closest thing I can think of to an automated test is to run repeated
>> sets of the parallel regression tests, and each time SIGTERM a randomly
>> chosen backend at a randomly chosen time.  Then see if anything "funny"

> Yep, that was my plan, plus with the parallel regression tests you get
> the possibility of >2 backends.

I was intentionally suggesting only one kill per test cycle.  Multiple
kills will probably create an O(N^2) explosion in the set of possible
downstream-failure deltas.  I doubt you'd really get any improvement
in testing coverage to justify the much larger amount of hand validation
needed.

It also strikes me that you could make some simple alterations to the
regression tests to reduce the set of observable downstream deltas.
For example, anyplace where a test loads a table with successive INSERTs
and that table is used by later tests, wrap the INSERT sequence with
BEGIN/END.  Then there is only one possible downstream delta (empty
table) and not N different possibilities for an N-row table.
        regards, tom lane


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> The closest thing I can think of to an automated test is to run repeated
> >> sets of the parallel regression tests, and each time SIGTERM a randomly
> >> chosen backend at a randomly chosen time.  Then see if anything "funny"
> 
> > Yep, that was my plan, plus with the parallel regression tests you get
> > the possibility of >2 backends.
> 
> I was intentionally suggesting only one kill per test cycle.  Multiple
> kills will probably create an O(N^2) explosion in the set of possible
> downstream-failure deltas.  I doubt you'd really get any improvement
> in testing coverage to justify the much larger amount of hand validation
> needed.

No, the point is that you have >2 backends to choose from to kill.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +


Re: pg_terminate_backend() issues

From: Bruce Momjian

Tom Lane wrote:
> Bruce Momjian <bruce@momjian.us> writes:
> > Tom Lane wrote:
> >> The closest thing I can think of to an automated test is to run repeated
> >> sets of the parallel regression tests, and each time SIGTERM a randomly
> >> chosen backend at a randomly chosen time.  Then see if anything "funny"
> 
> > Yep, that was my plan, plus with the parallel regression tests you get
> > the possibility of >2 backends.
> 
> I was intentionally suggesting only one kill per test cycle.  Multiple
> kills will probably create an O(N^2) explosion in the set of possible
> downstream-failure deltas.  I doubt you'd really get any improvement
> in testing coverage to justify the much larger amount of hand validation
> needed.
> 
> It also strikes me that you could make some simple alterations to the
> regression tests to reduce the set of observable downstream deltas.
> For example, anyplace where a test loads a table with successive INSERTs
> and that table is used by later tests, wrap the INSERT sequence with
> BEGIN/END.  Then there is only one possible downstream delta (empty
> table) and not N different possibilities for an N-row table.

I have added pg_terminate_backend() to use SIGTERM and will start
running tests as discussed with Tom.  I will post my scripts too.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +