Thread: pg_terminate_backend idea

pg_terminate_backend idea

From
"Magnus Hagander"
Date:
I had an idea about how to possibly solve the pg_terminate_backend
issue.

How about we add a field to PGPROC (or is there a better place?) called
shouldExit. pg_terminate_backend will set this field to "true" in the
target backend, and then send the normal "cancel query" signal.

The receiving backend will check for this flag in PostgresMain, and
perform a proc_exit(0) if it is set.

This way, the net effect of doing pg_terminate_backend() will be that of
query cancel followed by the "lost connection to client" path. Both of
which should be well tested by now.

I hacked up a really quick test, and it seems to be working in theory.
But it still requires me to send some data (such as a dummy query) to
the backend before it exits. This is because server side libpq blocks
when reading and ignores signals at this time. I believe the fix for
this would be to pass a flag down to the libpq routines that we want to
be abort in case of signal+flag, set only when doing the "main call" to
recv, so we can kill idle process.

But I naturally wanted to float the idea here before digging in with
further coding. Does this idea have any merit, or does it just seem
stupid? ;-) (hey, if it is, at least it was a try..)


//Magnus


Re: pg_terminate_backend idea

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> But it still requires me to send some data (such as a dummy query) to
> the backend before it exits. This is because server side libpq blocks
> when reading and ignores signals at this time. I believe the fix for
> this would be to pass a flag down to the libpq routines that we want to
> be abort in case of signal+flag, set only when doing the "main call" to
> recv, so we can kill idle process.

Yech!  That code is messy enough already, lets not pile another kluge
atop it in order to handle something that's not even being requested
AFAIR.

In any case the correct way to solve the problem is to find out what's
being left corrupt by SIGTERM, rather than install more messiness in
order to avoid facing the real issue ...
        regards, tom lane


Re: pg_terminate_backend idea

From
Oliver Jowett
Date:
Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> 
>>But it still requires me to send some data (such as a dummy query) to
>>the backend before it exits. This is because server side libpq blocks
>>when reading and ignores signals at this time. I believe the fix for
>>this would be to pass a flag down to the libpq routines that we want to
>>be abort in case of signal+flag, set only when doing the "main call" to
>>recv, so we can kill idle process.
> 
> 
> Yech!  That code is messy enough already, lets not pile another kluge
> atop it in order to handle something that's not even being requested
> AFAIR.

I ran into the same problem back when I was trying to implement an
idle-in-transaction timeout, so solving this might be useful in more
than one place..

-O


Re: pg_terminate_backend idea

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> > But it still requires me to send some data (such as a dummy query) to
> > the backend before it exits. This is because server side libpq blocks
> > when reading and ignores signals at this time. I believe the fix for
> > this would be to pass a flag down to the libpq routines that we want to
> > be abort in case of signal+flag, set only when doing the "main call" to
> > recv, so we can kill idle process.
> 
> Yech!  That code is messy enough already, lets not pile another kluge
> atop it in order to handle something that's not even being requested
> AFAIR.
> 
> In any case the correct way to solve the problem is to find out what's
> being left corrupt by SIGTERM, rather than install more messiness in
> order to avoid facing the real issue ...

I am confused.  Are you talking about the client SIGTERM or the server? 
I thought we agreed that using the cancel functionality, which we know
works and is tested, to do backend terminate was the right approach. 
TODO has:
* Allow administrators to safely terminate individual sessions  Right now, SIGTERM will terminate a session, but it is
treatedas  though the postmaster has paniced and shared memory might not be  cleaned up properly.  A new signal is
neededfor safe termination  because backends must first do a query cancel, then exit once they  have run the query
cancelcleanup routine.
 

I don't see us ever able to handle backend terminate in any other way. 
Are you complaining about the issue with terminating the client?  I had
not considered that.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: pg_terminate_backend idea

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Tom Lane wrote:
>> In any case the correct way to solve the problem is to find out what's
>> being left corrupt by SIGTERM, rather than install more messiness in
>> order to avoid facing the real issue ...

> I am confused.  Are you talking about the client SIGTERM or the server? 

I am talking about Rod Taylor's reports that SIGTERM'ing individual
backends tends to lead to "lock table corrupted" crashes awhile later.
Now, I've been playing the part of Chicken Little on this for awhile,
but seeing an actual report of problems from the field certainly
strengthens my feelings about it.

What I think we need to do is find a way to isolate and fix the behavior
Rod is seeing.  It may be that the bug occurs only for SIGTERM, or it
may be that it's a general problem that a SIGTERM just increases the
probability of seeing.  In any case I think we have to solve it, not
create new mechanisms to try to ignore it.

> TODO has:

>     * Allow administrators to safely terminate individual sessions
>       Right now, SIGTERM will terminate a session, but it is treated as
>       though the postmaster has paniced and shared memory might not be
>       cleaned up properly.

That statement is entirely incorrect.
        regards, tom lane


Re: pg_terminate_backend idea

From
Rod Taylor
Date:
On Tue, 2005-06-21 at 23:34 -0400, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> In any case the correct way to solve the problem is to find out what's
> >> being left corrupt by SIGTERM, rather than install more messiness in
> >> order to avoid facing the real issue ...
> 
> > I am confused.  Are you talking about the client SIGTERM or the server? 
> 
> I am talking about Rod Taylor's reports that SIGTERM'ing individual
> backends tends to lead to "lock table corrupted" crashes awhile later.
> Now, I've been playing the part of Chicken Little on this for awhile,
> but seeing an actual report of problems from the field certainly
> strengthens my feelings about it.
> 
> What I think we need to do is find a way to isolate and fix the behavior
> Rod is seeing.  It may be that the bug occurs only for SIGTERM, or it
> may be that it's a general problem that a SIGTERM just increases the
> probability of seeing.  In any case I think we have to solve it, not
> create new mechanisms to try to ignore it.

If it helps, it seems to occur primarily (perhaps always) when there are
schema changes being performed when the SIGTERM is issued.

I don't remember seeing them on Intel or on v7.2 (we didn't stay on 7.4
very long), but on a fairly well loaded Solaris machine (v880 with
between 100 and 200 connections) it happens enough that we automatically
schedule a server restart during the first opportunity when we need to
kill connections in this way This is generally when the server doesn't
recognize the client has dropped -- pgpool can be clumsy with
connections).

> > TODO has:
> 
> >     * Allow administrators to safely terminate individual sessions
>     
> >       Right now, SIGTERM will terminate a session, but it is treated as
> >       though the postmaster has paniced and shared memory might not be
> >       cleaned up properly.
> 
> That statement is entirely incorrect.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
> 
-- 



Re: pg_terminate_backend idea

From
"Magnus Hagander"
Date:
> > But it still requires me to send some data (such as a dummy
> query) to
> > the backend before it exits. This is because server side
> libpq blocks
> > when reading and ignores signals at this time. I believe
> the fix for
> > this would be to pass a flag down to the libpq routines
> that we want
> > to be abort in case of signal+flag, set only when doing the "main
> > call" to recv, so we can kill idle process.
>
> Yech!  That code is messy enough already, lets not pile
> another kluge atop it in order to handle something that's not
> even being requested AFAIR.

While I agree it'sa  bit of a cludge, saying that it's not requested is
absolutely and totally untrue. It has been requested *a lot*. People
even use a method that is now *known* to be unsafe, simply because we do
not provide another alternative.

Therefor, I prefer a kludge than nothing at all. But a "proper solution"
is of course better.


> In any case the correct way to solve the problem is to find
> out what's being left corrupt by SIGTERM, rather than install
> more messiness in order to avoid facing the real issue ...

That is unfortunatly way over my head. And it doesn't seem like anybody
who actually has what it takes to do the "proper solution" is interested
in doing it.


//Magnus


Re: pg_terminate_backend idea

From
Andreas Pflug
Date:
Bruce Momjian wrote:
> Tom Lane wrote:
> 
>>"Magnus Hagander" <mha@sollentuna.net> writes:
>>
>>>But it still requires me to send some data (such as a dummy query) to
>>>the backend before it exits. This is because server side libpq blocks
>>>when reading and ignores signals at this time. I believe the fix for
>>>this would be to pass a flag down to the libpq routines that we want to
>>>be abort in case of signal+flag, set only when doing the "main call" to
>>>recv, so we can kill idle process.
>>
>>Yech!  That code is messy enough already, lets not pile another kluge
>>atop it in order to handle something that's not even being requested
>>AFAIR.
>>
>>In any case the correct way to solve the problem is to find out what's
>>being left corrupt by SIGTERM, rather than install more messiness in
>>order to avoid facing the real issue ...
> 
> 
> I am confused.  Are you talking about the client SIGTERM or the server? 
> I thought we agreed that using the cancel functionality, which we know
> works and is tested,

I've seen cancel *not* working. In 80 % this was the reason to use 
terminate.

Regards,
Andreas


Re: pg_terminate_backend idea

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
>> In any case the correct way to solve the problem is to find 
>> out what's being left corrupt by SIGTERM, rather than install 
>> more messiness in order to avoid facing the real issue ...

> That is unfortunatly way over my head. And it doesn't seem like anybody
> who actually has what it takes to do the "proper solution" is interested
> in doing it.

A test case --- even one that fails only a small percentage of the time
--- would make things far easier.  So far all I've seen are very vague
reports, and it's impossible to do anything about it without more info.
        regards, tom lane


Re: pg_terminate_backend idea

From
Tom Lane
Date:
Andreas Pflug <pgadmin@pse-consulting.de> writes:
>> I thought we agreed that using the cancel functionality, which we know
>> works and is tested,

> I've seen cancel *not* working. In 80 % this was the reason to use 
> terminate.

Even a moment's perusal of the code will prove that there is no
situation in which a backend will respond to SIGTERM but not SIGINT
--- there is only one InterruptPending flag and both cases are checked
in ProcessInterrupts().  So I don't believe the above argument for
using terminate in the slightest.

I can easily believe that we have missed some places that need a
CHECK_FOR_INTERRUPTS() call added, to ensure the backend can't go too
long without making these checks.  I added one in the planner main
loop just a couple weeks ago, for instance.  If you can identify what
a backend that's ignoring a cancel request is doing, please let us know.
        regards, tom lane


Re: pg_terminate_backend idea

From
Andrew - Supernews
Date:
On 2005-06-22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andreas Pflug <pgadmin@pse-consulting.de> writes:
>>> I thought we agreed that using the cancel functionality, which we know
>>> works and is tested,
>
>> I've seen cancel *not* working. In 80 % this was the reason to use 
>> terminate.
>
> Even a moment's perusal of the code will prove that there is no
> situation in which a backend will respond to SIGTERM but not SIGINT

"idle in transaction". (or "idle" for that matter, but that's usually less
significant.)

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: pg_terminate_backend idea

From
Tom Lane
Date:
Andrew - Supernews <andrew+nonews@supernews.com> writes:
> On 2005-06-22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Andreas Pflug <pgadmin@pse-consulting.de> writes:
>>> I've seen cancel *not* working.
>> 
>> Even a moment's perusal of the code will prove that there is no
>> situation in which a backend will respond to SIGTERM but not SIGINT

> "idle in transaction". (or "idle" for that matter, but that's usually less
> significant.)

In that case there's no query to cancel, so I would dispute the claim
that that constitutes "not working".  QueryCancel is defined to cancel
the current query, not necessarily to abort your whole transaction.
(Before 8.0 there wasn't much of a difference, but now there is:
QueryCancel is an ordinary error that can be trapped by a savepoint.
Are you arguing it should not be so trappable?)
        regards, tom lane


Re: pg_terminate_backend idea

From
Andrew - Supernews
Date:
On 2005-06-22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andrew - Supernews <andrew+nonews@supernews.com> writes:
>> On 2005-06-22, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Andreas Pflug <pgadmin@pse-consulting.de> writes:
>>>> I've seen cancel *not* working.
>>> 
>>> Even a moment's perusal of the code will prove that there is no
>>> situation in which a backend will respond to SIGTERM but not SIGINT
>
>> "idle in transaction". (or "idle" for that matter, but that's usually less
>> significant.)
>
> In that case there's no query to cancel, so I would dispute the claim
> that that constitutes "not working".

You are totally missing the point.

A backend that is "idle in transaction" is holding locks and an open xid
that cannot be cleared by anything short of SIGTERM.

Whether the fact that it ignores SIGINT is intentional or not is irrelevent,
the fact is that this is the classic scenario where SIGTERM is effective and
SIGINT is not.

-- 
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services


Re: pg_terminate_backend idea

From
"Magnus Hagander"
Date:
> >> In any case the correct way to solve the problem is to find out
> >> what's being left corrupt by SIGTERM, rather than install more
> >> messiness in order to avoid facing the real issue ...
>
> > That is unfortunatly way over my head. And it doesn't seem like
> > anybody who actually has what it takes to do the "proper
> solution" is
> > interested in doing it.
>
> A test case --- even one that fails only a small percentage
> of the time
> --- would make things far easier.  So far all I've seen are
> very vague reports, and it's impossible to do anything about
> it without more info.

Very well. Let me try putting it like this, then:

Assuming we don't get such a case, and a chance to fix it, before 8.1
(while still hoping we will get it fixed properly, we can't be sure, can
we? If we were, it'd be fixed already). In this case, will you consider
such a kludgy solution as a temporary fix to resolve a problem that a
lot of users are having? And then plan to have it removed once sending
SIGTERM directly to a backend can be considered safe?


//Magnus


Re: pg_terminate_backend idea

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> Assuming we don't get such a case, and a chance to fix it, before 8.1
> (while still hoping we will get it fixed properly, we can't be sure, can
> we? If we were, it'd be fixed already). In this case, will you consider
> such a kludgy solution as a temporary fix to resolve a problem that a
> lot of users are having? And then plan to have it removed once sending
> SIGTERM directly to a backend can be considered safe?

Kluges tend to become institutionalized, so my reaction is "no".  It's
also worth pointing out that with so little understanding of the problem
Rod is reporting, it's tough to make a convincing case that this kluge
will avoid it.  SIGTERM exit *shouldn't* be leaving any corrupted
locktable entries behind; it's not that much different from the normal
case.  Until we find out what's going on, introducing still another exit
path isn't really going to make me feel more comfortable, no matter how
close it's alleged to be to the normal path.
        regards, tom lane


Re: pg_terminate_backend idea

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> > Assuming we don't get such a case, and a chance to fix it, before 8.1
> > (while still hoping we will get it fixed properly, we can't be sure, can
> > we? If we were, it'd be fixed already). In this case, will you consider
> > such a kludgy solution as a temporary fix to resolve a problem that a
> > lot of users are having? And then plan to have it removed once sending
> > SIGTERM directly to a backend can be considered safe?
>
> Kluges tend to become institutionalized, so my reaction is "no".  It's
> also worth pointing out that with so little understanding of the problem
> Rod is reporting, it's tough to make a convincing case that this kluge
> will avoid it.  SIGTERM exit *shouldn't* be leaving any corrupted
> locktable entries behind; it's not that much different from the normal
> case.  Until we find out what's going on, introducing still another exit
> path isn't really going to make me feel more comfortable, no matter how
> close it's alleged to be to the normal path.

I have been running some tests to try to see the lock table corruption
but I have been unable to reproduce the problem.  I have attached my
crude test scripts.  I would run the scripts and then open another
session as a different user and do UPDATE and LOCK to cause the psql
session to block.

The only functional difference I can see between a SIGTERM exit and a
cancel followed by a normal exit is the call to
AbortCurrentTransaction().  Could that be significant?  Because I can't
reproduce the failure I can't know for sure.

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#!/usr/contrib/bin/expect --
set timeout -1
eval spawn sql test
expect -nocase "test=>"
send "begin;\r"
expect -nocase "test=>"
send "lock pg_class;\r"
expect -nocase "test=>"
send "select * from pg_locks;\r"
expect -nocase "test=>"
send "update test set x=3;\r"
expect -nocase "test=>"
expect eof
exit
while :
do
    XPID=`/letc/ps_sysv -ef | grep 'postgres test'|grep -v grep|awk '{print $2}'`
    if [ "$XPID" != "" ]
    then    kill $XPID
        echo $XPID
        XPID=`/letc/ps_sysv -ef | grep 'psql test'|grep -v execargs|awk '{print $2}'`
        kill $XPID
    fi
    sleep 1
done

Re: pg_terminate_backend idea

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I have been running some tests to try to see the lock table corruption
> but I have been unable to reproduce the problem.

It's possible that what Rod was complaining of is fixed in CVS tip ---
see discussion about RemoveFromWaitQueue() bug.  If so, it would have
been a bug that could be seen in other circumstances too, but maybe
SIGTERM made it more probable for some reason.
        regards, tom lane


Re: pg_terminate_backend idea

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > I have been running some tests to try to see the lock table corruption
> > but I have been unable to reproduce the problem.
> 
> It's possible that what Rod was complaining of is fixed in CVS tip ---
> see discussion about RemoveFromWaitQueue() bug.  If so, it would have
> been a bug that could be seen in other circumstances too, but maybe
> SIGTERM made it more probable for some reason.

Was that backpatched to 8.0.X?  If not, I can test that branch of CVS
for the problem.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: pg_terminate_backend idea

From
Rod Taylor
Date:
On Tue, 2005-06-21 at 23:34 -0400, Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Tom Lane wrote:
> >> In any case the correct way to solve the problem is to find out what's
> >> being left corrupt by SIGTERM, rather than install more messiness in
> >> order to avoid facing the real issue ...
> 
> > I am confused.  Are you talking about the client SIGTERM or the server? 
> 
> I am talking about Rod Taylor's reports that SIGTERM'ing individual
> backends tends to lead to "lock table corrupted" crashes awhile later.
> Now, I've been playing the part of Chicken Little on this for awhile,
> but seeing an actual report of problems from the field certainly
> strengthens my feelings about it.

Bringing this thread back to life.

I have not seen a lock table corruption issue with SIGTERM in 8.1 on
Solaris/Sun IV, Linux/AMD64, or Linux/Intel. I don't recall seeing one
on 8.0.3 either though I'm pretty sure there were several on 8.0.1.

There are times when locks for a process hang around for a few minutes
before getting cleared. I don't recall whether they were ungranted table
locks or entries waiting on a transaction ID lock, but the source was
Slony and a large pg_listener structure with more than 20000 pages (yes,
pages not tuples).

I have also seen processes refusing to acknowledge the signal and exit
during btree index builds, but that's not a data corruption issue.
--