Re: pg_terminate_backend for same-role - Mailing list pgsql-hackers

From Noah Misch
Subject Re: pg_terminate_backend for same-role
Date
Msg-id 20120317213222.GA15562@tornado.leadboat.com
Whole thread Raw
In response to Re: pg_terminate_backend for same-role  (Daniel Farina <daniel@heroku.com>)
List pgsql-hackers
On Fri, Mar 16, 2012 at 04:42:07PM -0700, Daniel Farina wrote:
> On Fri, Mar 16, 2012 at 3:42 PM, Noah Misch <noah@leadboat.com> wrote:
> > On Thu, Mar 15, 2012 at 04:14:03PM -0700, Daniel Farina wrote:
> >> I imagine the problem is a race condition whereby a pid might be
> >> reused by another process owned by another user (doesn't that also
> >> affect pg_cancel_backend?). ?Shall we just do everything using the
> >> MyCancelKey (which I think could just be called "SessionKey",
> >> "SessionSecret", or even just "Session") as to ensure we have no case
> >> of mistaken identity? Or does that end up being problematic?
> >
> > No, I think the hazard you identify here is orthogonal to the question of when
> > to authorize pg_terminate_backend(). ?As you note downthread, protocol-level
> > cancellations available in released versions already exhibit this hazard. ?I
> > wouldn't mind a clean fix for this, but it's an independent subject.
> 
> Hmm. Well, here's a patch that implements exactly that, I think,
> similar to the one posted to this thread, but not using BackendIds,
> but rather the newly-introduced "SessionId".  Would appreciate
> comments.  Because an out-of-band signaling method has been integrated
> more complex behaviors -- such as closing the
> TERM-against-SECURITY-DEFINER-FUNCTION hazard -- can be addressed.
> For now I've only attempted to solve the problem of backend ambiguity,
> which basically necessitated out-of-line information transfer as per
> the usual means.

This patch still changes the policy for pg_terminate_backend(), and it does
not fix other SIGINT senders like processCancelRequest() and ProcSleep().  If
you're concerned about PID-reuse races, audit all backend signalling.  Either
fix all such problems or propose a plan to get there eventually.  Any further
discussion of this topic needs a new subject line; mixing its consideration
with proposals to change the policy behind pg_terminate_backend() reduces the
chances of the right people commenting on these distinct questions.

Currently, when pg_terminate_backend() follows a pg_cancel_backend() on which
the target has yet to act, the eventual outcome is a terminated process.  With
this patch, the pg_terminate_backend() becomes a no-op with this warning:

> !              ereport(WARNING,
> !                      (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
> !                       (errmsg("process is busy responding to administrative "
> !                               "request")),
> !                       (errhint("This is temporary, and may be retried."))));

That's less useful than the current behavior.


That being said, I can't get too excited about closing PID-reuse races.  I've
yet to see another program do so.  I've never seen a trouble report around
this race for any software.  Every OS I have used assigns PIDs so as to
maximize the reuse interval, which seems like an important POLA measure given
typical admin formulae like "kill `ps | grep ...`".  This defense can only
matter in fork-bomb conditions, at which point a stray signal is minor.

I do think it's worth keeping this idea in a back pocket for achieving those
"more complex behaviors," should we ever desire them.

> > Here I discussed a hazard specific to allowing pg_terminate_backend():
> > http://archives.postgresql.org/message-id/20110602045955.GC8246@tornado.gateway.2wire.net
> >
> > To summarize, user code can trap SIGINT cancellations, but it cannot trap
> > SIGTERM terminations. ?If a backend is executing a SECURITY DEFINER function
> > when another backend of the same role calls pg_terminate_backend() thereon,
> > the pg_terminate_backend() caller could achieve something he cannot achieve in
> > PostgreSQL 9.1. ?I vote that this is an acceptable loss.
> 
> I'll throw out a patch that just lets this hazard exist and see what
> happens, although it is obsoleted/incompatible with the one already
> attached.

+1.  Has anyone actually said that the PID-reuse race or the thing I mention
above should block such a patch?  I poked back through the threads I could
remember and found nothing.


pgsql-hackers by date:

Previous
From: Andrew Dunstan
Date:
Subject: Re: Regarding column reordering project for GSoc 2012
Next
From: Tom Lane
Date:
Subject: Re: Command Triggers, patch v11