Re: Hot Standy introduced problem with query cancel behavior - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Hot Standy introduced problem with query cancel behavior
Date
Msg-id 1262216632.3252.4.camel@localhost.localdomain
Whole thread Raw
In response to Re: Hot Standy introduced problem with query cancel behavior  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Hot Standy introduced problem with query cancel behavior
List pgsql-hackers
----- Ursprüngliche Mitteilung -----
> On Wed, Dec 30, 2009 at 2:06 PM, Andres Freund <andres@anarazel.de> wrote:
> > On Wednesday 30 December 2009 01:13:01 Simon Riggs wrote:
> > > On Tue, 2009-12-29 at 11:13 -0500, Tom Lane wrote:
> > > > Andres Freund <andres@anarazel.de> writes:
> > > > > On Tuesday 29 December 2009 16:22:54 Tom Lane wrote:
> > > > > > This seems like a fairly bad idea.  One of the intended use-cases is
> > > > > > to be able to manually "kill -INT" a misbehaving backend.  Assuming
> > > > > > that there will be valid info about the signal in shared memory will
> > > > > > break that.
> > > > >
> > > > > Well. That already is the case now. MyProc->recoveryConflictMode is
> > > > > checked to recognize what kind of conflict is being resolved...
> > > >
> > > > In that case, HS has already broken it, and we need to fix it not make
> > > > it worse.
> > > >
> > > > My humble opinion is that SIGINT should not be overloaded with multiple
> > > > meanings.  We already have a multiplexed signal mechanism, which is what
> > > > should be used for any additional signal reasons HS may need to
> > > > introduce.
> > >
> > > It's a revelation to me, but yes, I see it now and agree.
> > >
> > > I'm looking at Fujii-san's multiplexing patch from Jul 31 to rewrite
> > > this code using that mechanism. It sounds like it's a neat fit and it
> > > should get around the bug report from Kris also if it all works.
> > Hm. I just read a bit of that multiplexing facility (out of a different reason)
> > and I have some doubt about it being used unmodified for canceling backends:
> >
> > procsignal.c:
> > /*
> >  * Note: Since there's no locking, it's possible that the target
> >  * process detaches from shared memory and exits right after this
> >  * test, before we set the flag and send signal. And the signal slot
> >  * might even be recycled by a new process, so it's remotely possible
> >  * that we set a flag for a wrong process. That's OK, all the signals
> >  * are such that no harm is done if they're mistakenly fired.
> >  */
> > procsignal.h:
> > ...
> >  * Also, because of race conditions, it's important that all the signals be
> >  * defined so that no harm is done if a process mistakenly receives one.
> >  */
> >
> > When cancelling a backend that behaviour could be a bit annoying ;-)
> >
> > I guess locking procarray during sending the signal should be enough?
>
> I think the idea is that you define the behavior of the signal to be
> "look at this other piece of state to see whether you should cancel
> yourself" rather than just "cancel yourself".  Then if a signal is
> delivered by mistake, it's no big deal - you just look at the other
> piece of state and decide that you don't need to do anything.
I dont see an easy way to pass enough information right now. You cant regenerate enough of it in the to be killed
backendas most of the relevant information is only available in the startup process.
 
Inventing yet another segment in shm just for this seems overcomplicated to me.
Thats why I suggested locking the procarray for this - without having looked at the code that should prevent a backend
slotfrom beimg reused.
 

Andres


pgsql-hackers by date:

Previous
From: Magnus Hagander
Date:
Subject: Re: [PATCH] Windows x64 [repost]
Next
From: Hiroshi Inoue
Date:
Subject: Re: krb_server_keyfile setting doesn't work on Windows