Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.) - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.) |
Date | |
Msg-id | 20150131111118.GM24213@alap3.anarazel.de Whole thread Raw |
In response to | Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.) (Heikki Linnakangas <hlinnakangas@vmware.com>) |
Responses |
Re: Re: Overhauling our interrupt handling (was Escaping
from blocked send() reprised.)
|
List | pgsql-hackers |
On 2015-01-31 00:52:03 +0100, Heikki Linnakangas wrote: > On 01/15/2015 03:03 AM, Andres Freund wrote: > >0004: Process 'die' interrupts while reading/writing from the client socket. > > >+ * Check for interrupts here, in addition to secure_write(), > >+ * because a interrupted write in secure_raw_write() can possibly > >+ * only return to here, not to secure_write(). > > */ > >+ ProcessClientWriteInterrupt(true); > > Took me a while to parse that sentence. How about: > > Check for interrupts here, in addition to secure_write(), because an > interrupted write in secure_raw_write() will return here, and we cannot > return to secure_write() until we've written something. Yep, that's better. > >0005: Move deadlock and other interrupt handling in proc.c out of signal handlers. > > > > I'm surprised how comparatively simple this turned out to be. For > > robustness and understanding I think this is a great improvement. > > > > New and not reviewed at all. Needs significant review. But works > > in my simple testcases. > > >@@ -799,6 +791,7 @@ ProcKill(int code, Datum arg) > > proc = MyProc; > > MyProc = NULL; > > DisownLatch(&proc->procLatch); > >+ SetLatch(MyLatch); > > > > SpinLockAcquire(ProcStructLock); > > > >@@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg) > > proc = MyProc; > > MyProc = NULL; > > DisownLatch(&proc->procLatch); > >+ SetLatch(MyLatch); > > > > SpinLockAcquire(ProcStructLock); > > These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already sets > the latch. (According to the comment in SwitchToSharedLatch() even that > should not be necessary.) Ick, that's some debugging leftovers where I was trying to find a bug that was fundamentally caused by the missing barriers in the latch code... > >0006: Don't allow immediate interupts during authentication anymore. > > In commit message: s/interupts/interrupts/. > > >@@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass, > > if (*shadow_pass == '\0') > > return STATUS_ERROR; /* empty password */ > > > >- /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */ > >- ImmediateInterruptOK = true; > >- /* And don't forget to detect one that already arrived */ > > CHECK_FOR_INTERRUPTS(); > > > > /* > > That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly placed > now that the "ImmediateInterruptOK = true" is gone. I would just remove it. > Add one to the caller if it's needed, but it probably isn't. There's some method to the madness here actually - we just did some database stuff that could in theory take a while... Whether it's worthwhile to leave it here I'm not sure. > >2) Does anybody see a significant problem with the reduction of cases > > where we immediately react to authentication_timeout? Since we still > > react immediately when we're waiting for the client (except maybe in > > sspi/kerberos?) I think that's ok. The waiting cases are slow > > ident/radius/kerberos/ldap/... servers. But we really shouldn't jump > > out of the relevant code anyway. > > Yeah, I think that's OK. Doing those things with ImmediateInterruptOK=true > was quite scary, and we need to stop doing that. It would be nice to have a > wholesale solution for those lookups but I don't see one. So all the > ident/radius/kerberos/ldap lookups will need to be done in a non-blocking > way to have them react to the timeout. That requires some work, but we can > leave that to a followup patch, if someone wants to write it. Ok, cool. > Overall, 1-8 look good to me, except for that one incomplete comment in > ProcessClientWriteInterrupt() I mentioned in a separate email. Thanks for the review! I plan to push the fixed up versions sometime next week. > I haven't reviewed patches 9 and 10 yet. Yea, those aren't as close to being ready (and thus marked WIP). I'd like to do the lwlock stuff for 9.5 because then any sane setup (i.e. unless spinlocks are emulated) doesn't need any semaphores anymore, which makes setup easier. Right now users need to adjust system settings when changing max_connctions upwards or run multiple clusters. But it's not really related to getting rid of ImmediateInterruptOK. Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services
pgsql-hackers by date: