Thread: Function to kill backend

Function to kill backend

From
"Magnus Hagander"
Date:
Hi!

When debugging on win32, I've created myself a little function that I
feel should be added to the "backend proper". While it adds a lot of
vlaue on win32, I think it adds quite a bit of value on non-win32
platforms as well...

The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
course. Which simply sends a signal to the specified backend -
querycancel, terminate, etc.

The advantage over using the kill command from a shell account is, well,
you don't need shell access to the db server. On win32, that's going to
be more common than on Unix - plus, if you want to signal a specific
backend, you need a special tool (can't do from tas kmanager/service
manager etc - service manager can only do the postmaster, and task
manager can only do kill -9).

I also think a function like this could be good to have for e.g.
pgadmin, to implement some more "management functionality". For example,
in MSSQL I can go into a view called "current activity", pick a "bad
user", right-click and cancel query or terminate session. To do this
remote, a funciton like this is required. pg_stat_activity can be used
to get a list of sessions and their pids. The function should probably
be complemented with a pg_get_postmasterpid or something along that way,
to be able to send signals to th epostmaster itself.

So, would such a function be accepted into the backend code? And if so,
any preferences on where you want it put?


//Magnus



Re: Function to kill backend

From
Andrew Dunstan
Date:
Magnus Hagander wrote:

>Hi!
>
>When debugging on win32, I've created myself a little function that I
>feel should be added to the "backend proper". While it adds a lot of
>vlaue on win32, I think it adds quite a bit of value on non-win32
>platforms as well...
>
>The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
>course. Which simply sends a signal to the specified backend -
>querycancel, terminate, etc.
>  
>

If' we're going to have this shouldn't it be a proper command? And maybe 
an internal shutdown command to go with it?

cheers

andrew





Re: Function to kill backend

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Magnus Hagander wrote:
>> The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
>> course. Which simply sends a signal to the specified backend -
>> querycancel, terminate, etc.

> If' we're going to have this shouldn't it be a proper command? And maybe 
> an internal shutdown command to go with it?

I don't like the idea at all, but if we were to have something
it would definitely need to be a lot more constrained than
send-any-signal-to-any-postgres-process ... even for a superuser,
that's a mighty fat-gauge foot-gun.
        regards, tom lane


Re: Function to kill backend

From
"Magnus Hagander"
Date:
>>Hi!
>>
>>When debugging on win32, I've created myself a little function that I
>>feel should be added to the "backend proper". While it adds a lot of
>>vlaue on win32, I think it adds quite a bit of value on non-win32
>>platforms as well...
>>
>>The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
>>course. Which simply sends a signal to the specified backend -
>>querycancel, terminate, etc.
>>
>>
>
>If' we're going to have this shouldn't it be a proper command?
>And maybe
>an internal shutdown command to go with it?

I guess it could be. I guess:

1) I don't know how to do one of those ;-) Which is why I didn't even
think it.

2) Won't that clutter up the namespace more, by introducing more
keywords that you can't use for other things?

3) Will it still be possible to do the kind of things Rod mentioned,
e.g.
SELECT pg_kill_backend(procpid, 'TERM') FROM pg_stat_activity WHERE
current_query LIKE '<IDLE>%';
?

//Magnus


Re: Function to kill backend

From
"Dann Corbit"
Date:
> -----Original Message-----
> From: Magnus Hagander [mailto:mha@sollentuna.net]
> Sent: Friday, April 02, 2004 2:34 PM
> To: Andrew Dunstan; pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Function to kill backend
>
>
> >>Hi!
> >>
> >>When debugging on win32, I've created myself a little
> function that I
> >>feel should be added to the "backend proper". While it adds
> a lot of
> >>vlaue on win32, I think it adds quite a bit of value on non-win32
> >>platforms as well...
> >>
> >>The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
> >>course. Which simply sends a signal to the specified backend -
> >>querycancel, terminate, etc.
> >>
> >>
> >
> >If' we're going to have this shouldn't it be a proper command?
> >And maybe
> >an internal shutdown command to go with it?
>
> I guess it could be. I guess:
>
> 1) I don't know how to do one of those ;-) Which is why I
> didn't even think it.
>
> 2) Won't that clutter up the namespace more, by introducing
> more keywords that you can't use for other things?
>
> 3) Will it still be possible to do the kind of things Rod
> mentioned, e.g. SELECT pg_kill_backend(procpid, 'TERM') FROM
> pg_stat_activity WHERE current_query LIKE '<IDLE>%'; ?

Sybase had something like that:

Syb_kill <pid> to kill a dangling process.

And the undocumented:

Syb_terminate <pid>
to absolutely, positively kill it (Syb_kill only worked sometimes).

In general, I think this approach is a bit worrisome.  It reminds one of
the famous tagline:

"Tip: Don't kill -9 the postmaster."



Re: Function to kill backend

From
Andrew Dunstan
Date:

Tom Lane wrote:

>Andrew Dunstan <andrew@dunslane.net> writes:
>  
>
>>Magnus Hagander wrote:
>>    
>>
>>>The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
>>>course. Which simply sends a signal to the specified backend -
>>>querycancel, terminate, etc.
>>>      
>>>
>
>  
>
>>If' we're going to have this shouldn't it be a proper command? And maybe 
>>an internal shutdown command to go with it?
>>    
>>
>
>I don't like the idea at all, but if we were to have something
>it would definitely need to be a lot more constrained than
>send-any-signal-to-any-postgres-process ... even for a superuser,
>that's a mighty fat-gauge foot-gun.
>
>  
>


What sort of constraints do you have in mind?

cheers

andrew



Re: Function to kill backend

From
Tom Lane
Date:
Andrew Dunstan <andrew@dunslane.net> writes:
> Tom Lane wrote:
>> it would definitely need to be a lot more constrained than
>> send-any-signal-to-any-postgres-process ... even for a superuser,
>> that's a mighty fat-gauge foot-gun.

> What sort of constraints do you have in mind?

I'd limit it to SIGINT (query cancel) and SIGTERM (fast shutdown),
and I'm not even real sure about SIGTERM.  That facility is designed to
work in the case of shutting down all backends together --- I'm not sure
I want to promise that it behaves pleasantly to SIGTERM one backend and
leave the rest going.  Nor do I see a real good use-case for it.

Also, no killing processes that aren't regular backends (eg, the
bgwriter, the stats processes, and most especially the postmaster).

Another point is that killing by PID is not necessarily what you want to
do --- kill by transaction ID might be a better API, especially for
query-cancel cases.
        regards, tom lane


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> Andrew Dunstan <andrew@dunslane.net> writes:
> > Tom Lane wrote:
> >> it would definitely need to be a lot more constrained than
> >> send-any-signal-to-any-postgres-process ... even for a superuser,
> >> that's a mighty fat-gauge foot-gun.
> 
> > What sort of constraints do you have in mind?
> 
> I'd limit it to SIGINT (query cancel) and SIGTERM (fast shutdown),
> and I'm not even real sure about SIGTERM.  That facility is designed to
> work in the case of shutting down all backends together --- I'm not sure
> I want to promise that it behaves pleasantly to SIGTERM one backend and
> leave the rest going.  Nor do I see a real good use-case for it.
> 
> Also, no killing processes that aren't regular backends (eg, the
> bgwriter, the stats processes, and most especially the postmaster).
> 
> Another point is that killing by PID is not necessarily what you want to
> do --- kill by transaction ID might be a better API, especially for
> query-cancel cases.

Seems like useful functionality.  Right now, how does an administrator
kill another backend from psql?  They can't.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Seems like useful functionality.  Right now, how does an administrator
> kill another backend from psql?  They can't.

The question to ask is "should they be able to?"

I think any such facility is inherently a security risk, since it means
that a remote attacker who's managed to break into your superuser
account can randomly zap other backends.  Now admittedly there's plenty
of other mischief he can do with superuser privs, but that doesn't mean
we should hand him a pre-loaded, pre-sighted cannon.

Having to log into the database server locally to execute such
operations doesn't seem that bad to me.
        regards, tom lane


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Seems like useful functionality.  Right now, how does an administrator
> > kill another backend from psql?  They can't.
> 
> The question to ask is "should they be able to?"
> 
> I think any such facility is inherently a security risk, since it means
> that a remote attacker who's managed to break into your superuser
> account can randomly zap other backends.  Now admittedly there's plenty
> of other mischief he can do with superuser privs, but that doesn't mean
> we should hand him a pre-loaded, pre-sighted cannon.
> 
> Having to log into the database server locally to execute such
> operations doesn't seem that bad to me.

If they can read/write your data (as superuser), killing backends is the
least worry.

I can see it as useful as part of pg_stat_activity output.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
"Magnus Hagander"
Date:
> Tom Lane wrote:
>>> it would definitely need to be a lot more constrained than
>>> send-any-signal-to-any-postgres-process ... even for a superuser,
>>> that's a mighty fat-gauge foot-gun.
>
>> What sort of constraints do you have in mind?
>
>I'd limit it to SIGINT (query cancel) and SIGTERM (fast shutdown),

Actually, that is a restriction that's already there - just didn't get
into those details. Since the functino as I wrote it so far just takes
signal name as a string (can't rely on signal numbers being identical
across platforms, right?), and then comparing it with a fixed set of
signals.


>and I'm not even real sure about SIGTERM.  That facility is designed to
>work in the case of shutting down all backends together ---
>I'm not sure
>I want to promise that it behaves pleasantly to SIGTERM one backend and
>leave the rest going.  Nor do I see a real good use-case for it.

Really? Then what is the recommended way of shutting down a backend that
you are not connected to, as an administrator? Even if you are logged in
with shell access?

I may have been doing things wrong for a long time, because I have
certainly killed backends with TERM many times without problems. If
that's not safe, there really ought to be a tip on the mailinglists to
complement the  "don't kill -9 the postmaster" with "and don't ever kill
the backends, period"? I'm sure I'm not the only one who has done
that...

>Also, no killing processes that aren't regular backends (eg, the
>bgwriter, the stats processes, and most especially the postmaster).

That sounds like a reasonable limitation to add. Either by specifically
excluding these processes, or by limiting it to only work on the
backends currently listed in pg_stat_activity.


>Another point is that killing by PID is not necessarily what
>you want to
>do --- kill by transaction ID might be a better API, especially for
>query-cancel cases.

Well, in my scenarios, killing by PID is what I need. But I guess
transaction IDs might be added to the pg_stat_activity, which would give
me the same functionality (I usually check that one first to see what a
backend does, before I do anything) - and then some, because the
transaction id carries other information as well.
Question on that - how will it handle an idle backend (that has not
explicitly opened a transaction, and is not executing a command in an
implicit transaction)?

> I think any such facility is inherently a security risk, since it
means
> that a remote attacker who's managed to break into your superuser
> account can randomly zap other backends.  Now admittedly there's
plenty
> of other mischief he can do with superuser privs, but that doesn't
mean
> we should hand him a pre-loaded, pre-sighted cannon.
> Having to log into the database server locally to execute such
> operations doesn't seem that bad to me.

It does to me. I prefer being able to admin the server without having to
do a separate login. I also much prefer being able to delegate the
capability to terminate a backend, interrupt a long-running query, etc
to someone who does not have to have shell access on the server. I guess
it depends on the environment.

> Bruce Momjian <pgman@candle.pha.pa.us> writes:

>> If they can read/write your data (as superuser), killing backends is
the
>> least worry.

That's pretty much the assumption I was working under.


//Magnus


Re: Function to kill backend

From
Josh Berkus
Date:
Tom,

> > Seems like useful functionality.  Right now, how does an administrator
> > kill another backend from psql?  They can't.
>
> The question to ask is "should they be able to?"

And the answer is, "Yes".

This is a commonly requested feature by DBA's migrating from SQL Server and
Oracle.    In those databases, there is a GUI to monitor database requests,
and potentially kill them to resolve deadlocks or runaway queries (though, in
the case of SQL server, it does not work).  Right now, it is very difficult
for any of our GUI projects to construct such an interface due to the
necessity of root shell access.

> I think any such facility is inherently a security risk, since it means
> that a remote attacker who's managed to break into your superuser
> account can randomly zap other backends.  Now admittedly there's plenty
> of other mischief he can do with superuser privs, but that doesn't mean
> we should hand him a pre-loaded, pre-sighted cannon.

And requiring DBAs to use root shell access whenever they want to stop a
runaway query is somehow a good security approach?     If nothing else, it
exposes lots of DBAs to the temptation to use SIGKILL instead off SIGINT or
SIGTERM, making the database shut down.   And I, personally, worry about the
number of root shells I have to use, becuase every once in a while I forget
and leave one open at the end of the day.

Killing backends with runaway queries is a routine administrative task.   It
should be possible to accomplish it remotely, using tools provided by
PostgreSQL instead of the command shell, because then it is possible for us
to limit what those tools can do.

Further, if an intruder has superuser access, having them kill random backends
is the last thing I'm worried about.  "DROP DATABASE" ranks a lot higher.
In fact, it would be nice if they started killing random backends because
then I'd know something was wrong.

--
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Function to kill backend

From
Christopher Kings-Lynne
Date:
> This is a commonly requested feature by DBA's migrating from SQL Server and 
> Oracle.    In those databases, there is a GUI to monitor database requests, 
> and potentially kill them to resolve deadlocks or runaway queries (though, in 
> the case of SQL server, it does not work).  Right now, it is very difficult 
> for any of our GUI projects to construct such an interface due to the 
> necessity of root shell access.

Yes, MySQL can do it too.

http://www.mysql.com/doc/en/KILL.html

I would love to have a KILL <pid> command in postgres.  I don't know how 
you would restrict it to only being able to kill postgres backends though.

Chris


Re: Function to kill backend

From
Neil Conway
Date:
On 2-Apr-04, at 5:16 PM, Andrew Dunstan wrote:
> If' we're going to have this shouldn't it be a proper command?

Why? What benefit would this offer over implementing this feature as a 
function?

-Neil



Re: Function to kill backend

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> Killing backends with runaway queries is a routine administrative
> task.

Cancelling runaway queries is a routine task.  I'm less convinced that a
remote kill (ie SIGTERM) facility is such a great idea.
        regards, tom lane


Re: Function to kill backend

From
Mike Mascari
Date:
Tom Lane wrote:

> Josh Berkus <josh@agliodbs.com> writes:
> 
>>Killing backends with runaway queries is a routine administrative
>>task.
> 
> 
> Cancelling runaway queries is a routine task.  I'm less convinced that a
> remote kill (ie SIGTERM) facility is such a great idea.

Of course, cancelling runaway queries on Oracle is only a necessity 
if the DBA hasn't made use of resource limits - PROFILEs. ;-)

Mike Mascari





Re: Function to kill backend

From
"Andrew Dunstan"
Date:
Neil Conway said:
> On 2-Apr-04, at 5:16 PM, Andrew Dunstan wrote:
>> If' we're going to have this shouldn't it be a proper command?
>
> Why? What benefit would this offer over implementing this feature as a
> function?
>

psql help

cheers

andrew




Re: Function to kill backend

From
"Magnus Hagander"
Date:
>> Killing backends with runaway queries is a routine administrative
>> task.
>
>Cancelling runaway queries is a routine task.  I'm less
>convinced that a
>remote kill (ie SIGTERM) facility is such a great idea.

Consider a scenario like:
User A starts transaction.
User A issues a LOCK TABLE (or does something to lock it)
User A goes on vacation without commit/rollback

User A might well be Program A instead, of course. Caught in a tight
loop, waiting for user input, or whatever.

In this case, SIGINT (query cancel) will not help, because all locks
held by the transaction will still be held.

If there was a way to "force rollback" a connection, that could be done.
Buf AFAIK there are none? And would those be safer/better than
terminating the backend?

//Magnus


Re: Function to kill backend

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> In this case, SIGINT (query cancel) will not help, because all locks
> held by the transaction will still be held. 

Wrong.
        regards, tom lane


Re: Function to kill backend

From
"Magnus Hagander"
Date:
> > In this case, SIGINT (query cancel) will not help, because
> all locks
> > held by the transaction will still be held.
>
> Wrong.

Really?
Please point out where I am wrong in this:

SESSION A:        BEGIN TRANSACTION
SESSION A:        LOCK TABLE foo IN ACCESS EXCLUSIVE MODE
SESSION B:        SELECT * FROM foo      <-- blocks (expected)!
not-in-session:    KILL -INT <pid_of_session_A>(at this point, <pid_of_session_A> is listed as "idle in transaction")
SESSION B:        <-- still blcoks....
SESSION A:         ROLLBACK
SESSION B:        <-- now completes

If I kill it with TERM instead of INT, SESSION B is released at that
point.

Am I missing something very obvious here?


This is on:PostgreSQL 7.4.2 on i686-pc-linux-gnu, compiled by GCC 2.95.3


//Magnus


Re: Function to kill backend

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
>>> In this case, SIGINT (query cancel) will not help, because 
>>> all locks held by the transaction will still be held.
>> 
>> Wrong.

> Really? 

[ experiments... ]  My apologies --- you are correct about the present
behavior.  If a SIGINT arrives while waiting for client input, it's just
dropped on the floor.  The locks *will* be dropped if the SIGINT arrives
during actual query processing.

It strikes me that this is incorrect behavior, at least for the case
where the client has a transaction block open.  It'd be better to define
the interrupt as "transaction cancel".
        regards, tom lane


Re: Function to kill backend

From
Rod Taylor
Date:
On Fri, 2004-04-02 at 16:52, Magnus Hagander wrote:
> Hi!
>
> When debugging on win32, I've created myself a little function that I
> feel should be added to the "backend proper". While it adds a lot of
> vlaue on win32, I think it adds quite a bit of value on non-win32
> platforms as well...
>
> The function is pg_kill_backend(<pid>,<signal>). superuser-only, of
> course. Which simply sends a signal to the specified backend -
> querycancel, terminate, etc.

Nice.. My new favourite command is going to be:

SELECT pg_kill_backend(procpid, 'TERM') FROM pg_stat_activity WHERE
current_query LIKE '<IDLE>%';

--
Rod Taylor <rbt [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc

Re: Function to kill backend

From
Jan Wieck
Date:
Bruce Momjian wrote:
> Tom Lane wrote:
>> Bruce Momjian <pgman@candle.pha.pa.us> writes:
>> > Seems like useful functionality.  Right now, how does an administrator
>> > kill another backend from psql?  They can't.
>> 
>> The question to ask is "should they be able to?"
>> 
>> I think any such facility is inherently a security risk, since it means
>> that a remote attacker who's managed to break into your superuser
>> account can randomly zap other backends.  Now admittedly there's plenty
>> of other mischief he can do with superuser privs, but that doesn't mean
>> we should hand him a pre-loaded, pre-sighted cannon.
>> 
>> Having to log into the database server locally to execute such
>> operations doesn't seem that bad to me.
> 
> If they can read/write your data (as superuser), killing backends is the
> least worry.

Even as superuser, they still need to get a lock to drop the table. So 
killing other backends will ...

This is so pointless. If an attacker manages to become superuser in the 
compromised database, what good are restrictions against killing 
backends? I agree that it should be restricted to backends, with an 
identification based on Xid and SIGINT. But that's it.


Jan

-- 
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me.                                  #
#================================================== JanWieck@Yahoo.com #



Re: Function to kill backend

From
"Magnus Hagander"
Date:
> >>> In this case, SIGINT (query cancel) will not help, because
> >>> all locks held by the transaction will still be held.
> >>
> >> Wrong.
>
> > Really?
>
> [ experiments... ]  My apologies --- you are correct about
> the present behavior.  If a SIGINT arrives while waiting for
> client input, it's just dropped on the floor.  The locks
> *will* be dropped if the SIGINT arrives during actual query
> processing.
>
> It strikes me that this is incorrect behavior, at least for
> the case where the client has a transaction block open.  It'd
> be better to define the interrupt as "transaction cancel".

That sounds reasonable. That would certainly solve this part.

That leaves justo ne part of the issue - ability to "kill off" idle
backends. Since they do use up backend "slots" (of which there is a
limited number), I still think this wuold be good. Dunno if it might be
possible to use the same signal for that, under the following scheme:
if (query_running)  cancel_query abort transaction
else if (idle in transaction) abort transaction
else if (idle not in transaction) graceful shutdown

or if that is too confusing? Fromt he "idle not in transaction", it
should be possibleo treat it the same way a "connection lost" would be,
no? (Instead of forcibly going down as SIGTERM does).

With that, I would definitly think limiting to SIGINT would be enough.


A question on using xids for identification instead of PIDs: is there a
xid assigned to a process thati s in the state "idle not in
transaction"? Otherwise, there still needs to be a way to send a signal
to those. I don't see why the PID is not a good idea (assuming
only-backends-restriction, but is even that necessary if you just send
SIGINT?)


//Magnus


Re: Function to kill backend

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> if (query_running) 
>   cancel_query
>   abort transaction
> else if (idle in transaction)
>   abort transaction
> else if (idle not in transaction)
>   graceful shutdown

> or if that is too confusing?

Too hazardous.  Say you meant to kill a slow query, only it finishes
just before you press RETURN.  The above definition silently promotes
your query cancel into a session kill, which is not what you intended.

(Silently promoting query cancel to transaction cancel seems okay to me,
since a successful query cancel aborts the xact anyway.  However this
might need closer thought to consider what will happen with nested
xacts.)

> A question on using xids for identification instead of PIDs: is there a
> xid assigned to a process thati s in the state "idle not in
> transaction"?

No, and I assert you do not need one if the facility is limited to query
or transaction cancel; more, an XID-based identification would be less
prone to race conditions of the kind just mentioned (you couldn't
mistakenly zap the transaction after the one you meant to).

If we are going to allow session kill then of course we need PID for
that.
        regards, tom lane


Re: Function to kill backend

From
"Magnus Hagander"
Date:
> > if (query_running)
> >   cancel_query
> >   abort transaction
> > else if (idle in transaction)
> >   abort transaction
> > else if (idle not in transaction)
> >   graceful shutdown
>
> > or if that is too confusing?
>
> Too hazardous.  Say you meant to kill a slow query, only it
> finishes just before you press RETURN.  The above definition
> silently promotes your query cancel into a session kill,
> which is not what you intended.

Ok. That's a good point :-)


> (Silently promoting query cancel to transaction cancel seems
> okay to me, since a successful query cancel aborts the xact
> anyway.  However this might need closer thought to consider
> what will happen with nested
> xacts.)

Sounds good to me.


> > A question on using xids for identification instead of
> PIDs: is there
> > a xid assigned to a process thati s in the state "idle not in
> > transaction"?
>
> No, and I assert you do not need one if the facility is
> limited to query or transaction cancel; more, an XID-based
> identification would be less prone to race conditions of the
> kind just mentioned (you couldn't mistakenly zap the
> transaction after the one you meant to).

Yeah, under that assumption, it certainly is enough.


> If we are going to allow session kill then of course we need
> PID for that.

I still say we need this. Is there some other good mechanism to do this
that you see? Since SIGTERM is not good... (I for one would definitly be
happy enough with being able to drop something that's in "idle without
transaction", since you can always use SIGINT to get it down to that
level)


//Magnus


Re: Function to kill backend

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
>> If we are going to allow session kill then of course we need 
>> PID for that.

> I still say we need this.

Well, that seems to be the consensus, so I won't stand in the way.

If you like the cancel-by-XID idea then I'd suggest providing two
functions: pg_cancel_query(XID) to send SIGINT and pg_kill_session(PID)
to send SIGTERM.  I'm not sure if the other two combinations (SIGINT by
PID and SIGTERM by XID) are worth providing.
        regards, tom lane


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> >> If we are going to allow session kill then of course we need 
> >> PID for that.
> 
> > I still say we need this.
> 
> Well, that seems to be the consensus, so I won't stand in the way.
> 
> If you like the cancel-by-XID idea then I'd suggest providing two
> functions: pg_cancel_query(XID) to send SIGINT and pg_kill_session(PID)
> to send SIGTERM.  I'm not sure if the other two combinations (SIGINT by
> PID and SIGTERM by XID) are worth providing.

Agreed.  Two functions, cancel by xid (skips cancel if xid changes
during command), and kill by pid for remote admin apps, is great.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
"Magnus Hagander"
Date:
> Tom Lane wrote:
> > "Magnus Hagander" <mha@sollentuna.net> writes:
> > >> If we are going to allow session kill then of course we need
> > >> PID for that.
> >
> > > I still say we need this.
> >
> > Well, that seems to be the consensus, so I won't stand in the way.
> >
> > If you like the cancel-by-XID idea then I'd suggest providing two
> > functions: pg_cancel_query(XID) to send SIGINT and
> > pg_kill_session(PID) to send SIGTERM.  I'm not sure if the
> other two
> > combinations (SIGINT by PID and SIGTERM by XID) are worth providing.
>
> Agreed.  Two functions, cancel by xid (skips cancel if xid
> changes during command), and kill by pid for remote admin
> apps, is great.

This sounds good to me, too. (Though I think SIGINT by PID might be
good, I see no need for SIGTERM by XID).

But are you saying it *is* safe with SIGTERM to a backend? I always
thought it was before, but after what you said earlier in this thread, I
changed my mind... (Unfort, I'm not well enough into the deep ends of
the code to say which is right...)

//Magnus


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> "Magnus Hagander" <mha@sollentuna.net> writes:
> > But are you saying it *is* safe with SIGTERM to a backend?
> 
> I'm saying I'm not happy about promoting that to the status of a
> supported feature.  Up to now it's always been "if it breaks you
> get to keep both pieces", but if there's a built-in function to do it
> then people will naturally expect a certain level of reliability.
> 
> Maybe it works fine, maybe it doesn't.  But I foresee having to expend
> work down the road because of this addition.

I assume admins are already doing this (via kill), so whether it is
supported or not, we should give folks a safe way to do this.  This
seems like basic functionality we should be giving admins, even if it
does take some work on our part.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> I assume admins are already doing this (via kill), so whether it is
> supported or not, we should give folks a safe way to do this.

I don't think it's an open-and-shut decision as to whether people
actually *need* to do session kills (as opposed to query/transaction
kills).  The arguments presented so far are not convincing to my mind,
certainly not convincing enough to buy into a commitment to do whatever
it takes to support that.

The fact that people do it doesn't make it a good idea.
        regards, tom lane


Re: Function to kill backend

From
Tom Lane
Date:
"Magnus Hagander" <mha@sollentuna.net> writes:
> But are you saying it *is* safe with SIGTERM to a backend?

I'm saying I'm not happy about promoting that to the status of a
supported feature.  Up to now it's always been "if it breaks you
get to keep both pieces", but if there's a built-in function to do it
then people will naturally expect a certain level of reliability.

Maybe it works fine, maybe it doesn't.  But I foresee having to expend
work down the road because of this addition.
        regards, tom lane


Re: Function to kill backend

From
Bruce Momjian
Date:
Josh Berkus wrote:
> Tom,
> 
> > I don't think it's an open-and-shut decision as to whether people
> > actually *need* to do session kills (as opposed to query/transaction
> > kills).  The arguments presented so far are not convincing to my mind,
> > certainly not convincing enough to buy into a commitment to do whatever
> > it takes to support that.
> 
> Hmmm ... well, I can make a real-world case from my supported apps for 
> transaction/statement kills.   But my support for session kills is just 
> hypothetical; any  time I've had to kill off sessions, it's because I had to 
> shut the database down, and that's better done from the command line.
> 
> My web apps which need to manage the number of connections do it through their 
> connection pool.
> 
> So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
> thinks there will be any significant support & troubleshooting involved for 
> the latter.
> 
> Unless, of course, someone can give us a real business case that they have 
> actually encountered in production.

Someone already posted some pseudocode where they wanted to kill idle
backends, perhaps as part of connection pooling.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Josh Berkus
Date:
Bruce,

> Someone already posted some pseudocode where they wanted to kill idle
> backends, perhaps as part of connection pooling.

I'm not talking about code.  I'm talking about a *reason*.

i.e.:  "I'm administrator of the blah-blah project.   We had a lot of trouble 
managing idle connections to PG because of blah-blah.   A function to kill 
off idle connctions would really help us becuase blah-blah-blah."

So far, all we've heard in favor of SIGTERM-by-PID are *hypothetical* cases.   
Now Tom's telling us that there is a real cost attached to having this 
feature.   Before we do it anyway, I want to be convinced that someone really 
needs it.  It is *not* our practice to add features "just because we can."

Otherwise, I'll stick by my assertion that idle connection management should 
be done in the middleware and NOT by psql.

-- 
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Function to kill backend

From
Josh Berkus
Date:
Tom,

> I don't think it's an open-and-shut decision as to whether people
> actually *need* to do session kills (as opposed to query/transaction
> kills).  The arguments presented so far are not convincing to my mind,
> certainly not convincing enough to buy into a commitment to do whatever
> it takes to support that.

Hmmm ... well, I can make a real-world case from my supported apps for 
transaction/statement kills.   But my support for session kills is just 
hypothetical; any  time I've had to kill off sessions, it's because I had to 
shut the database down, and that's better done from the command line.

My web apps which need to manage the number of connections do it through their 
connection pool.

So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
thinks there will be any significant support & troubleshooting involved for 
the latter.

Unless, of course, someone can give us a real business case that they have 
actually encountered in production.

-- 
Josh Berkus
Aglio Database Solutions
San Francisco


Re: Function to kill backend

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
> thinks there will be any significant support & troubleshooting involved for 
> the latter.

Quite honestly, I don't know.  We know that some people have done manual
SIGTERMs and not been burnt; and I'm not aware of any reason why it
wouldn't work; but I don't think it's well enough tested to be sure that
it will work.  The sort of problem that I fear could arise is analogous
to the problem with kill -9'ing the postmaster: sure, the process is
gone and the database on disk is okay, but there might be resource leaks
or other problems left behind in shared memory.  We would not see such
problems in normal use because SIGTERM is associated with complete
database shutdown and release of shared memory.  With retail SIGTERM and
leaving the database up, though, it's a whole new ballgame and the
cleanup requirements become much stricter.

So like I say, I'm hesitant to buy into supporting this without a fairly
convincing argument that it's really needed.
        regards, tom lane


Re: Function to kill backend

From
Josh Berkus
Date:
Bruce,

> OK, you have a runaway report. You want to stop it.  Query cancel is
> only going to stop the current query, and once you do that the next
> query is fed in so there is no way to actually stop the report,
> especially if the report is not being run from the same machine as the
> server (you can't kill the report process).  How do you stop it without
> SIGTERM?  You don't want to shut down the postmaster.

Hmmm ... but, at least in the case of my apps, killing the PG connection 
wouldn't fix things.   Most apps I work on are designed to detect connection 
failure and reconnect.   I suspect that most platforms that use connection 
pooling are the same.  So your case would only work if you actually blocked 
all connections from that host -- not a capability we'd discussed.

-- 
-Josh BerkusAglio Database SolutionsSan Francisco



Re: Function to kill backend

From
Bruce Momjian
Date:
Josh Berkus wrote:
> Bruce,
> 
> > Someone already posted some pseudocode where they wanted to kill idle
> > backends, perhaps as part of connection pooling.
> 
> I'm not talking about code.  I'm talking about a *reason*.
> 
> i.e.:  "I'm administrator of the blah-blah project.   We had a lot of trouble 
> managing idle connections to PG because of blah-blah.   A function to kill 
> off idle connctions would really help us becuase blah-blah-blah."
> 
> So far, all we've heard in favor of SIGTERM-by-PID are *hypothetical* cases.   
> Now Tom's telling us that there is a real cost attached to having this 
> feature.   Before we do it anyway, I want to be convinced that someone really 
> needs it.  It is *not* our practice to add features "just because we can."
> 
> Otherwise, I'll stick by my assertion that idle connection management should 
> be done in the middleware and NOT by psql.

OK, you have a runaway report. You want to stop it.  Query cancel is
only going to stop the current query, and once you do that the next
query is fed in so there is no way to actually stop the report,
especially if the report is not being run from the same machine as the
server (you can't kill the report process).  How do you stop it without
SIGTERM?  You don't want to shut down the postmaster.

In fact, query cancel is actually more dangerous in this case because
unless the report is designed to handle statement failures, it might
just keep running and produce incorrect results because you canceled
some random queries in the report.  (I am thinking specifically of a
psql batch job here.)

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> If there is a problem, maybe we can fix it, or perhap have the kill
> function use SIGINT, then wait for the query to cancel, then SIGTERM.

Well, if someone could prove that the SIGTERM path is equivalent to
a transaction-aborting error followed by normal client disconnect,
I'd feel a lot better about its reliability.  We know those two code
paths work well.

Right now I do not think they are equivalent, but maybe they could be
made so.
        regards, tom lane


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > If there is a problem, maybe we can fix it, or perhap have the kill
> > function use SIGINT, then wait for the query to cancel, then SIGTERM.
> 
> Well, if someone could prove that the SIGTERM path is equivalent to
> a transaction-aborting error followed by normal client disconnect,
> I'd feel a lot better about its reliability.  We know those two code
> paths work well.

As far as I know, most database systems have the ability to kill an
individual session, and I think we need that functionality too, as I
described in my report example.

> Right now I do not think they are equivalent, but maybe they could be
> made so.

Let me look into it.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Bruce Momjian
Date:
Josh Berkus wrote:
> Bruce,
> 
> > OK, you have a runaway report. You want to stop it.  Query cancel is
> > only going to stop the current query, and once you do that the next
> > query is fed in so there is no way to actually stop the report,
> > especially if the report is not being run from the same machine as the
> > server (you can't kill the report process).  How do you stop it without
> > SIGTERM?  You don't want to shut down the postmaster.
> 
> Hmmm ... but, at least in the case of my apps, killing the PG connection 
> wouldn't fix things.   Most apps I work on are designed to detect connection 
> failure and reconnect.   I suspect that most platforms that use connection 
> pooling are the same.  So your case would only work if you actually blocked 
> all connections from that host -- not a capability we'd discussed.

I don't think most apps reconnect on disconnect, except maybe pooled
connections where you don't expect your state to be stable between
connections.  Certainly most reports can't just reconnect and keep
going.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> OK, you have a runaway report. You want to stop it.  Query cancel is
> only going to stop the current query, and once you do that the next
> query is fed in so there is no way to actually stop the report,
> especially if the report is not being run from the same machine as the
> server (you can't kill the report process).

> I don't think most apps reconnect on disconnect, except maybe pooled
> connections where you don't expect your state to be stable between
> connections.  Certainly most reports can't just reconnect and keep
> going.

You're hypothecating a report generator that can recover from failed
queries, but not a failed connection?  Seems a rather contrived case.
Stupid apps are most likely gonna curl up and die on any unexpected
error (which is what the query cancel would look like to them).  Smart
apps may try harder to recover than you think.
        regards, tom lane


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > OK, you have a runaway report. You want to stop it.  Query cancel is
> > only going to stop the current query, and once you do that the next
> > query is fed in so there is no way to actually stop the report,
> > especially if the report is not being run from the same machine as the
> > server (you can't kill the report process).
> 
> > I don't think most apps reconnect on disconnect, except maybe pooled
> > connections where you don't expect your state to be stable between
> > connections.  Certainly most reports can't just reconnect and keep
> > going.
> 
> You're hypothecating a report generator that can recover from failed
> queries, but not a failed connection?  Seems a rather contrived case.
> Stupid apps are most likely gonna curl up and die on any unexpected
> error (which is what the query cancel would look like to them).  Smart
> apps may try harder to recover than you think.

I figured some reports just continue on failed queries.  Is that
accurate?  I don't know, but I know a psql script will continue, no?
Will psql return an error code on exit from cancel?  I think it does but
am not sure.

First people objected to this on security grounds, now that those were
beat down, we have use-case complaints.  Not having a way to kill
backends is like having no way to kill a process except rebooting the
server.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Bruce Momjian
Date:
pgman wrote:
> Josh Berkus wrote:
> > Tom,
> > 
> > > I don't think it's an open-and-shut decision as to whether people
> > > actually *need* to do session kills (as opposed to query/transaction
> > > kills).  The arguments presented so far are not convincing to my mind,
> > > certainly not convincing enough to buy into a commitment to do whatever
> > > it takes to support that.
> > 
> > Hmmm ... well, I can make a real-world case from my supported apps for 
> > transaction/statement kills.   But my support for session kills is just 
> > hypothetical; any  time I've had to kill off sessions, it's because I had to 
> > shut the database down, and that's better done from the command line.
> > 
> > My web apps which need to manage the number of connections do it through their 
> > connection pool.
> > 
> > So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
> > thinks there will be any significant support & troubleshooting involved for 
> > the latter.
> > 
> > Unless, of course, someone can give us a real business case that they have 
> > actually encountered in production.
> 
> Someone already posted some pseudocode where they wanted to kill idle
> backends, perhaps as part of connection pooling.

Tom, if you have concerns about SIGTERM while other backends keep
running, would you share those.  (Holding locks, shared memory?)  I
looked at die(), and it seemed pretty safe to me.  It just sets some
variables and returns.  It is not like quickdie that calls exit().

If there is a problem, maybe we can fix it, or perhap have the kill
function use SIGINT, then wait for the query to cancel, then SIGTERM.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> Not having a way to kill backends is like having no way to kill a
> process except rebooting the server.

Some people think that making a database hard to kill is a good thing.
        regards, tom lane


Re: Function to kill backend

From
Rod Taylor
Date:
On Tue, 2004-04-06 at 15:23, Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
> > So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
> > thinks there will be any significant support & troubleshooting involved for 
> > the latter.
> 
> So like I say, I'm hesitant to buy into supporting this without a fairly
> convincing argument that it's really needed.

It doesn't necessarily have to be a SIGTERM. The goal is to get rid of
unwanted idlers (connections). Could SIGINT be extended with a command
telling the daemon to shutdown or rollback the transaction as requested?




Re: Function to kill backend

From
Rod Taylor
Date:
On Tue, 2004-04-06 at 15:10, Josh Berkus wrote:
> Bruce,
> 
> > OK, you have a runaway report. You want to stop it.  Query cancel is
> > only going to stop the current query, and once you do that the next
> > query is fed in so there is no way to actually stop the report,
> > especially if the report is not being run from the same machine as the
> > server (you can't kill the report process).  How do you stop it without
> > SIGTERM?  You don't want to shut down the postmaster.
> 
> Hmmm ... but, at least in the case of my apps, killing the PG connection 
> wouldn't fix things.   Most apps I work on are designed to detect connection 
> failure and reconnect.   I suspect that most platforms that use connection 
> pooling are the same.  So your case would only work if you actually blocked 
> all connections from that host -- not a capability we'd discussed.

That would be a likely second step (go into pg_hba to block). But you
still have to get rid of the idlers at some point.



Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Not having a way to kill backends is like having no way to kill a
> > process except rebooting the server.
> 
> Some people think that making a database hard to kill is a good thing.

I can't argue with that.  :-)

I am researching the SIGTERM processing right now and will post in a few
minutes.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Bruce Momjian
Date:
Rod Taylor wrote:
> On Tue, 2004-04-06 at 15:23, Tom Lane wrote:
> > Josh Berkus <josh@agliodbs.com> writes:
> > > So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
> > > thinks there will be any significant support & troubleshooting involved for 
> > > the latter.
> > 
> > So like I say, I'm hesitant to buy into supporting this without a fairly
> > convincing argument that it's really needed.
> 
> It doesn't necessarily have to be a SIGTERM. The goal is to get rid of
> unwanted idlers (connections). Could SIGINT be extended with a command
> telling the daemon to shutdown or rollback the transaction as requested?

Nope, a signal is just a signal with no other info passed.  We could add
it, but it would be more code.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Bruce Momjian
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > If there is a problem, maybe we can fix it, or perhap have the kill
> > function use SIGINT, then wait for the query to cancel, then SIGTERM.
> 
> Well, if someone could prove that the SIGTERM path is equivalent to
> a transaction-aborting error followed by normal client disconnect,
> I'd feel a lot better about its reliability.  We know those two code
> paths work well.
> 
> Right now I do not think they are equivalent, but maybe they could be
> made so.

I just did a whitespace-ignore diff of the SIGINT (which is cancel and
we know works), and SIGTERM (which is under study).  Here is a diff of
SIGINT vs. SIGTERM functions:
1,2c1,2< static void< StatementCancelHandler(SIGNAL_ARGS)---> void> die(SIGNAL_ARGS)6,10c6,7<   /*<    * Don't joggle
theelbow of proc_exit, nor an already-in-progress<    * abort<    */<   if (!proc_exit_inprogress && !InError)--->   /*
Don'tjoggle the elbow of proc_exit */>   if (!proc_exit_inprogress)13c10<       QueryCancelPending = true;--->
ProcDiePending= true;16,18c13,14<        * If it's safe to interrupt, and we're waiting for a lock,<        * service
theinterrupt immediately.  No point in interrupting if<        * we're waiting for input, however.--->        * If it's
safeto interrupt, and we're waiting for input or a>        * lock, service the interrupt immediately26,33c22,26<
  if (LockWaitCancel())<           {<               DisableNotifyInterrupt();<               InterruptHoldoffCount--;<
            ProcessInterrupts();<           }<           else<               InterruptHoldoffCount--;--->
DisableNotifyInterrupt();>          /* Make sure CheckDeadLock won't run while shutting down... */>
LockWaitCancel();>          InterruptHoldoffCount--;>           ProcessInterrupts();
 

The big difference seems to be the InError test, and the test in SIGINT
whether LockWaitCancel() actually returns true or false.  Also,
DisableNotifyInterrupt() is called before LockWaitCancel().

Also, one sets QueryCancelPending and the other ProcDiePending.  Those
are handled by:voidProcessInterrupts(void){    /* OK to accept interrupt now? */    if (InterruptHoldoffCount != 0 ||
CritSectionCount!= 0)        return;    InterruptPending = false;    if (ProcDiePending)    {        ProcDiePending =
false;       QueryCancelPending = false;     /* ProcDie trumps QueryCancel */        ImmediateInterruptOK = false;   /*
notidle anymore */        DisableNotifyInterrupt();        ereport(FATAL,
(errcode(ERRCODE_ADMIN_SHUTDOWN),        errmsg("terminating connection due to administrator command")));    }    if
(QueryCancelPending)   {        QueryCancelPending = false;        ImmediateInterruptOK = false;   /* not idle anymore
*/       DisableNotifyInterrupt();        ereport(ERROR,                (errcode(ERRCODE_QUERY_CANCELED),
 errmsg("canceling query due to user request")));    }    /* If we get here, do nothing (probably, QueryCancelPending
wasreset) */}
 

and the only difference here seems to be the elog(SHUTDOWN) call.

On first glance, I don't see anything dangerous about SIGTERM.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
"Simon Riggs"
Date:
The "function to kill backend" seems no longer in doubt, but the
*reason* is still blurred, other than we don't want to bring down the
postmaster to do this.
So far, reasons given have been:
1. to kill idlers
2. to kill runaway queries/statements, which has transaction
implications

I'd like to be able to do both of those, though they are fairly
different situations, perhaps with different implementations.

>Tom Lane writes
> Some people think that making a database hard to kill is a good thing.

I'm with Tom on this. I'm working on PITR...what happens when you shut
down backends uncleanly...how will we test all the possible event/states
to ensure xlog history doesn't get broken? Please lets work on a
"humane" approach to handling offending sessions...IMHO the various
signals mentioned are merely blunt instruments; we have been exhorted
daily not to "kill the postmaster", now we seem to be suggesting that we
allow that to happen to backends.  Isn't there a better way...?

Other RDBMSs support the ability to terminate specific sessions, but
none of them to my knowledge offer this facility by means of an external
executable, which has many implications - how would you know (prove) it
had been used? what security protection does the DBA have over its use?

The way forward seems safest if this is a command, not an external
executable. e.g. ALTER SYSTEM STOP BACKEND <x>. That way we have control
over the implementation/porting, security, logging/audit. Anybody that
wants to can then wrap that in a script if they choose.

Internally, we could then implement it however we chose.

Best Regards, Simon Riggs



Re: Function to kill backend

From
Bruce Momjian
Date:
Simon Riggs wrote:
> 
> The "function to kill backend" seems no longer in doubt, but the
> *reason* is still blurred, other than we don't want to bring down the
> postmaster to do this.
> So far, reasons given have been:
> 1. to kill idlers
> 2. to kill runaway queries/statements, which has transaction
> implications
> 
> I'd like to be able to do both of those, though they are fairly
> different situations, perhaps with different implementations.
> 
> >Tom Lane writes
> > Some people think that making a database hard to kill is a good thing.
> 
> I'm with Tom on this. I'm working on PITR...what happens when you shut
> down backends uncleanly...how will we test all the possible event/states
> to ensure xlog history doesn't get broken? Please lets work on a
> "humane" approach to handling offending sessions...IMHO the various
> signals mentioned are merely blunt instruments; we have been exhorted
> daily not to "kill the postmaster", now we seem to be suggesting that we
> allow that to happen to backends.  Isn't there a better way...?
> 
> Other RDBMSs support the ability to terminate specific sessions, but
> none of them to my knowledge offer this facility by means of an external
> executable, which has many implications - how would you know (prove) it
> had been used? what security protection does the DBA have over its use?
> 
> The way forward seems safest if this is a command, not an external
> executable. e.g. ALTER SYSTEM STOP BACKEND <x>. That way we have control
> over the implementation/porting, security, logging/audit. Anybody that
> wants to can then wrap that in a script if they choose.
> 
> Internally, we could then implement it however we chose.

The current plan is to create server-side functions to do this.  We
believe sending a SIGTERM to a backend via kill already has this effect.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
"Simon Riggs"
Date:
>Bruce Momjian
> Simon Riggs wrote:
> > The "function to kill backend" seems no longer in doubt, but the
> > *reason* is still blurred, other than we don't want to
> bring down the
> > postmaster to do this.
> > So far, reasons given have been:
> > 1. to kill idlers
> > 2. to kill runaway queries/statements, which has transaction
> > implications
> >
> > I'd like to be able to do both of those, though they are fairly
> > different situations, perhaps with different implementations.
> >
> > >Tom Lane writes
> > > Some people think that making a database hard to kill is
> a good thing.
> >
> > I'm with Tom on this. I'm working on PITR...what happens
> when you shut
> > down backends uncleanly...how will we test all the possible
> event/states
> > to ensure xlog history doesn't get broken? Please lets work on a
> > "humane" approach to handling offending sessions...IMHO the various
> > signals mentioned are merely blunt instruments; we have
> been exhorted
> > daily not to "kill the postmaster", now we seem to be
> suggesting that we
> > allow that to happen to backends.  Isn't there a better way...?
> >
> > Other RDBMSs support the ability to terminate specific sessions, but
> > none of them to my knowledge offer this facility by means
> of an external
> > executable, which has many implications - how would you
> know (prove) it
> > had been used? what security protection does the DBA have
> over its use?
> >
> > The way forward seems safest if this is a command, not an external
> > executable. e.g. ALTER SYSTEM STOP BACKEND <x>. That way we
> have control
> > over the implementation/porting, security, logging/audit.
> Anybody that
> > wants to can then wrap that in a script if they choose.
> >
> > Internally, we could then implement it however we chose.
>
> The current plan is to create server-side functions to do this.  We
> believe sending a SIGTERM to a backend via kill already has
> this effect.

If I understand you then, you believe you have found a way to short cut
implementing the server-side function.

Even if the primary behaviour is indeed exactly similar, are the
secondary and subsequent behaviours similar also?

Does your plan allow for:
i) logging the activity (both the send and the effect of receiving it)
ii) full security control of the facility
other secondary behaviours...

Forgive me if I become over-protective on robustness issues...

Best Regards, Simon



Re: Function to kill backend

From
Thomas Swan
Date:
Bruce Momjian wrote:

>pgman wrote:
>  
>
>>Josh Berkus wrote:
>>    
>>
>>>Tom,
>>>
>>>      
>>>
>>>>I don't think it's an open-and-shut decision as to whether people
>>>>actually *need* to do session kills (as opposed to query/transaction
>>>>kills).  The arguments presented so far are not convincing to my mind,
>>>>certainly not convincing enough to buy into a commitment to do whatever
>>>>it takes to support that.
>>>>        
>>>>
>>>Hmmm ... well, I can make a real-world case from my supported apps for 
>>>transaction/statement kills.   But my support for session kills is just 
>>>hypothetical; any  time I've had to kill off sessions, it's because I had to 
>>>shut the database down, and that's better done from the command line.
>>>
>>>My web apps which need to manage the number of connections do it through their 
>>>connection pool.
>>>
>>>So I would vote for Yes on SIGINT by XID, but No on SIGTERM by PID, if Tom 
>>>thinks there will be any significant support & troubleshooting involved for 
>>>the latter.
>>>
>>>Unless, of course, someone can give us a real business case that they have 
>>>actually encountered in production.
>>>      
>>>
>>Someone already posted some pseudocode where they wanted to kill idle
>>backends, perhaps as part of connection pooling.
>>    
>>
>
>Tom, if you have concerns about SIGTERM while other backends keep
>running, would you share those.  (Holding locks, shared memory?)  I
>looked at die(), and it seemed pretty safe to me.  It just sets some
>variables and returns.  It is not like quickdie that calls exit().
>
>If there is a problem, maybe we can fix it, or perhap have the kill
>function use SIGINT, then wait for the query to cancel, then SIGTERM.
>
>  
>
One other option to consider while your looking at killing off children
as an admin is the ability to switch the database to single user mode
for the admin via psql or other connection interface.   In essence
shutdown all other transactions and session and limit it to only one so
that maintenance issues (reindexing, etc) can be done remotely and do
not require shell access on the host machine.

I think, ultimately, this may be what is being asked for:  a way to
manage postgresql internally via SQL or other functions rather than
relying on user access to the machine (ie sending signals to
processes).   This is what some people have wanted to do with managing
the connection settings in pg_hba.conf via a system table rather than or
in addition to a file.



Re: Function to kill backend

From
Bruce Momjian
Date:
Simon Riggs wrote:
> > > The way forward seems safest if this is a command, not an external
> > > executable. e.g. ALTER SYSTEM STOP BACKEND <x>. That way we
> > have control
> > > over the implementation/porting, security, logging/audit.
> > Anybody that
> > > wants to can then wrap that in a script if they choose.
> > >
> > > Internally, we could then implement it however we chose.
> >
> > The current plan is to create server-side functions to do this.  We
> > believe sending a SIGTERM to a backend via kill already has
> > this effect.
> 
> If I understand you then, you believe you have found a way to short cut
> implementing the server-side function.

No, we are going to create server-side function that mimick SIGINT
(though using xid, we can ignore if the xid change), and SIGTERM. 
Server-side functions are required for clients not running on the
server.


> Even if the primary behaviour is indeed exactly similar, are the
> secondary and subsequent behaviours similar also?
> 
You mean calling them several times?  Well, SIGTERM will terminate, and
SIGINT will keep canceling xid's, though you have to update the xid to
the newer xid because you canceled the old one.

> Does your plan allow for:
> i) logging the activity (both the send and the effect of receiving it)
> ii) full security control of the facility
> other secondary behaviours...
> 
> Forgive me if I become over-protective on robustness issues...

No, these are super-user-only commands.  Not much more we can do on that
front.

--  Bruce Momjian                        |  http://candle.pha.pa.us pgman@candle.pha.pa.us               |  (610)
359-1001+  If your life is a hard drive,     |  13 Roberts Road +  Christ can be your backup.        |  Newtown Square,
Pennsylvania19073
 


Re: Function to kill backend

From
Tom Lane
Date:
Bruce Momjian <pgman@candle.pha.pa.us> writes:
> On first glance, I don't see anything dangerous about SIGTERM.

You haven't thought about it very hard :-(

The major difference I see is that elog(FATAL) will call proc_exit
directly from elog, rather than longjmp'ing back to PostgresMain.
The case that we have confidence in involves elog(ERROR) returning to
PostgresMain and then calling proc_exit from there (in the path where
we get EOF from the client).

This leaves me with a couple of concerns:

* Notice all that cleanup/reset stuff in the "if (sigsetjmp())" block
in PostgresMain.  SIGTERM will cause proc_exit to be entered without
any of that being done first.  Does it work reliably?  Shouldn't this be
refactored to ensure the same things happen in both cases?

* There are various places, especially in the PLs, that try to hook into
error recovery by manipulating Warn_restart.  Will any of them have
problems if their error recovery code doesn't get called during SIGTERM
exit?

One possible refactoring is for elog(FATAL) to go ahead and longjmp back
to PostgresMain, and at the end of the error recovery block check a flag
and do proc_exit() if we're fataling.  However I am not sure that this
doesn't break the design for coping with elog's during proc_exit.

Alvaro's nested-transaction work is another thing that's got to be
thought about before touching this code.  I have not yet seen any design
for error recovery in the nested xact case, but I am sure it's going to
need some changes right around here.
        regards, tom lane


Re: Function to kill backend

From
Rod Taylor
Date:
> Otherwise, I'll stick by my assertion that idle connection management should
> be done in the middleware and NOT by psql.

Perhaps it should be, but as PostgreSQL picks up more and more vendor
applications this is difficult for the person administrating the
database.

Consider a 3rd party application which has the issue but the support
contract is such that you cannot affect the application itself (their
support staff deals with it).

If you need connections in the database for other applications and this
3rd party program is idling on several slots...

--
Rod Taylor <rbt [at] rbt [dot] ca>

Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
PGP Key: http://www.rbt.ca/signature.asc

Re: Function to kill backend

From
Kevin Brown
Date:
Tom Lane wrote:
> Bruce Momjian <pgman@candle.pha.pa.us> writes:
> > Not having a way to kill backends is like having no way to kill a
> > process except rebooting the server.
> 
> Some people think that making a database hard to kill is a good thing.

Sure.  But we're not talking about taking down the whole database, we're
talking about taking down a connection.

Killing the database is the equivalent of killing the OS.  It should be
hard to do either.  But it should be easy to kill a process on an OS if
you have the right permissions, and similiarly it should be easy to kill
a connection to the database if you have the right permissions.


With respect to nested transactions and other things that might make
properly shutting down difficult, it seems to me that the SIGINT case is
actually a harder case to deal with.  Why?  Because for the SIGTERM case,
you basically have to do whatever is done whenever the connection itself
drops.  If we can't handle the connection itself dropping out arbitrarily
then we have more serious problems than just how to handle SIGTERM.  :-)
But for SIGINT you have to decide whether to just abort the innermost
transaction or the outermost one, and if it's the outermost one you have
to abort then you have to provide the mechanism for it -- something that
you might not have to deal with otherwise.

So it seems that handling SIGTERM might actually be easy: you have the
signal handler close the backend's side of the connection and let the
connection-dropping logic kick in automatically, no?


Thoughts?  Am I completely off my rocker here?  :-)



-- 
Kevin Brown                          kevin@sysexperts.com