Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
Date
Msg-id 54CC1923.7040204@vmware.com
Whole thread Raw
In response to Overhauling our interrupt handling (was Escaping from blocked send() reprised.)  (Andres Freund <andres@2ndquadrant.com>)
Responses Re: Re: Overhauling our interrupt handling (was Escaping from blocked send() reprised.)
List pgsql-hackers
On 01/15/2015 03:03 AM, Andres Freund wrote:
> 0002: Use a nonblocking socket for FE/BE communication and block using
>        latches.

s/suceeds/succeeds/

> 0004: Process 'die' interrupts while reading/writing from the client socket.

> +             * Check for interrupts here, in addition to secure_write(),
> +             * because a interrupted write in secure_raw_write() can possibly
> +             * only return to here, not to secure_write().
>               */
> +            ProcessClientWriteInterrupt(true);

Took me a while to parse that sentence. How about:

Check for interrupts here, in addition to secure_write(), because an 
interrupted write in secure_raw_write() will return here, and we cannot 
return to secure_write() until we've written something.

> 0005: Move deadlock and other interrupt handling in proc.c out of signal handlers.
>
>        I'm surprised how comparatively simple this turned out to be. For
>        robustness and understanding I think this is a great improvement.
>
>        New and not reviewed at all. Needs significant review. But works
>        in my simple testcases.

> @@ -799,6 +791,7 @@ ProcKill(int code, Datum arg)
>      proc = MyProc;
>      MyProc = NULL;
>      DisownLatch(&proc->procLatch);
> +    SetLatch(MyLatch);
>
>      SpinLockAcquire(ProcStructLock);
>
> @@ -868,6 +861,7 @@ AuxiliaryProcKill(int code, Datum arg)
>      proc = MyProc;
>      MyProc = NULL;
>      DisownLatch(&proc->procLatch);
> +    SetLatch(MyLatch);
>
>      SpinLockAcquire(ProcStructLock);

These SetLatch calls are unnecessary. SwitchBackToLocalLatch() already 
sets the latch. (According to the comment in SwitchToSharedLatch() even 
that should not be necessary.)

> 0006: Don't allow immediate interupts during authentication anymore.

In commit message: s/interupts/interrupts/.

> @@ -80,9 +73,6 @@ md5_crypt_verify(const Port *port, const char *role, char *client_pass,
>      if (*shadow_pass == '\0')
>          return STATUS_ERROR;    /* empty password */
>
> -    /* Re-enable immediate response to SIGTERM/SIGINT/timeout interrupts */
> -    ImmediateInterruptOK = true;
> -    /* And don't forget to detect one that already arrived */
>      CHECK_FOR_INTERRUPTS();
>
>      /*

That lone CHECK_FOR_INTERRUPTS() that remains looks pretty randomly 
placed now that the "ImmediateInterruptOK = true" is gone. I would just 
remove it. Add one to the caller if it's needed, but it probably isn't.

> 0007: Remove the option to service interrupts during PGSemaphoreLock().

s/don't/doesn't/ in commit message.

> Questions I have about the series right now:
>
> 1) Do others agree that getting rid of ImmediateInterruptOK is
>     worthwile. I personally think that we really should pursue that
>     aggressively as a goal.

Yes.

> 2) Does anybody see a significant problem with the reduction of cases
>     where we immediately react to authentication_timeout? Since we still
>     react immediately when we're waiting for the client (except maybe in
>     sspi/kerberos?) I think that's ok. The waiting cases are slow
>     ident/radius/kerberos/ldap/... servers. But we really shouldn't jump
>     out of the relevant code anyway.

Yeah, I think that's OK. Doing those things with 
ImmediateInterruptOK=true was quite scary, and we need to stop doing 
that. It would be nice to have a wholesale solution for those lookups 
but I don't see one. So all the ident/radius/kerberos/ldap lookups will 
need to be done in a non-blocking way to have them react to the timeout. 
That requires some work, but we can leave that to a followup patch, if 
someone wants to write it.

Overall, 1-8 look good to me, except for that one incomplete comment in 
ProcessClientWriteInterrupt() I mentioned in a separate email. I haven't 
reviewed patches 9 and 10 yet.

- Heikki




pgsql-hackers by date:

Previous
From: Peter Geoghegan
Date:
Subject: Re: PageRepairFragmentation performance
Next
From: Sawada Masahiko
Date:
Subject: Re: Proposal: knowing detail of config files via SQL