Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.) - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
Date
Msg-id 20150131111118.GM24213@alap3.anarazel.de
Whole thread Raw
In response to Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Responses Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
List pgsql-hackers
On 2015-01-31 00:52:03 +0100, Heikki Linnakangas wrote:
> On 01/15/2015 03:03 AM, Andres Freund wrote:
> >0004: Process 'die' interrupts while reading/writing from the client socket.
> 
> >+             * Check for interrupts here, in addition to secure_write(),
> >+             * because a interrupted write in secure_raw_write() can possibly
> >+             * only return to here, not to secure_write().
> >              */
> >+            ProcessClientWriteInterrupt(true);
> 
> Took me a while to parse that sentence. How about:
> 
> Check for interrupts here, in addition to secure_write(), because an
> interrupted write in secure_raw_write() will return here, and we cannot
> return to secure_write() until we've written something.

Yep, that's better.

> >0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.
> >
> >       I'm surprised how comparatively simple this turned out to be. For
> >       robustness and understanding I think this is a great improvement.
> >
> >       New and not reviewed at all. Needs significant review. But works
> >       in my simple testcases.
> 
> >@@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
> >     proc = MyProc;
> >     MyProc = NULL;
> >     DisownLatch(&proc->procLatch);
> >+    SetLatch(MyLatch);
> >
> >     SpinLockAcquire(ProcStructLock);
> >
> >@@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
> >     proc = MyProc;
> >     MyProc = NULL;
> >     DisownLatch(&proc->procLatch);
> >+    SetLatch(MyLatch);
> >
> >     SpinLockAcquire(ProcStructLock);
> 
> These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already sets
> the latch. (According to the comment in SwitchToSharedLatch() even that
> should not be necessary.)

Ick, that's some debugging leftovers where I was trying to find a bug
that was fundamentally caused by the missing barriers in the latch
code...

> >0006: Don't allow immediate interupts during authentication anymore.
> 
> In commit message: s/interupts/interrupts/.
> 
> >@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
> >     if (*shadow_pass == '\0')
> >         return STATUS_ERROR;    /* empty password */
> >
> >-    /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
> >-    ImmediateInterruptOK = true;
> >-    /* And don't forget to detect one that already arrived */
> >     CHECK_FOR_INTERRUPTS();
> >
> >     /*
> 
> That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly placed
> now that the "ImmediateInterruptOK = true" is gone. I would just remove it.
> Add one to the caller if it's needed, but it probably isn't.

There's some method to the madness here actually - we just did some
database stuff that could in theory take a while... Whether it's
worthwhile to leave it here I'm not sure.

> >2) Does anybody see a significant problem with the reduction of cases
> >    where we immediately react to authentication_timeout? Since we still
> >    react immediately when we're waiting for the client (except maybe in
> >    sspi/kerberos?) I think that's ok. The waiting cases are slow
> >    ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
> >    out of the relevant code anyway.
> 
> Yeah, I think that's OK. Doing those things with ImmediateInterruptOK=true
> was quite scary, and we need to stop doing that. It would be nice to have a
> wholesale solution for those lookups but I don't see one. So all the
> ident/radius/kerberos/ldap lookups will need to be done in a non-blocking
> way to have them react to the timeout. That requires some work, but we can
> leave that to a followup patch, if someone wants to write it.

Ok, cool.

> Overall, 1-8 look good to me, except for that one incomplete comment in
> ProcessClientWriteInterrupt() I mentioned in a separate email.

Thanks for the review! I plan to push the fixed up versions sometime
next week.

> I haven't reviewed patches 9 and 10 yet.

Yea, those aren't as close to being ready (and thus marked WIP).

I'd like to do the lwlock stuff for 9.5 because then any sane setup
(i.e. unless spinlocks are emulated) doesn't need any semaphores
anymore, which makes setup easier. Right now users need to adjust system
settings when changing max_connctions upwards or run multiple
clusters. But it's not really related to getting rid of
ImmediateInterruptOK.

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Marc Mamin
Date:
Subject: Re: tablespaces inside $PGDATA considered harmful
Next
From: Michael Paquier
Date:
Subject: Re: Safe memory allocation functions