Re: Escaping from blocked send() reprised. - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Escaping from blocked send() reprised.
Date
Msg-id 53FB2C3D.6030303@vmware.com
In response to Re: Escaping from blocked send() reprised.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Escaping from blocked send() reprised.
List pgsql-hackers
On 07/01/2014 06:26 AM, Kyotaro HORIGUCHI wrote:
> At Mon, 30 Jun 2014 11:27:47 -0400, Robert Haas <robertmhaas@gmail.com> wrote in
> <CA+TgmoZfcGzAEmtbyoCe6VdHnq085x+ox752zuJ2AKN=Wc8PnQ@mail.gmail.com>
>> 1. I think it's the case that there are platforms around where a
>> signal won't cause send() to return EINTR.... and I'd be entirely
>> unsurprised if SSL_write() doesn't necessarily return EINTR in that
>> case.  I'm not sure what, if anything, we can do about that.

We use a custom "write" routine with SSL_write, where we call send() 
ourselves, so that's not a problem as long as we put the check in the 
right place (in secure_raw_write(), after my recent SSL refactoring - 
the patch needs to be rebased).

> man 2 send on FreeBSD has not description about EINTR.. And even
> on linux, send won't return EINTR for most cases, at least I
> haven't seen that. So send()=-1,EINTR seems to me as only an
> equivalent of send() = 0. I have no idea about what the
> implementer thought the difference is.

As the patch stands, there's a race condition: if the SIGTERM arrives
*before* the send() call, the send() won't return EINTR anyway. So
there's a chance that you still block. Calling pg_terminate_backend()
again will dislodge it (assuming send() returns EINTR on a signal),
but I don't think we want to define the behavior as "usually,
pg_terminate_backend() will kill a backend that's blocked on sending to
the client, but sometimes you have to call it twice (or more!) to really
kill it".

A more robust approach is to set ImmediateInterruptOK before calling
send(). That wouldn't let you send the data that could still be sent
without blocking, though. For that, you could put the socket into
non-blocking mode and sleep with select(), waiting on the process's
latch at the same time (die() sets the latch, so a termination request
will wake up the select()).
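
To make the non-blocking variant concrete, here is a minimal standalone
sketch. It is not the patch and it does not use the backend's latch code:
a self-pipe written from the signal handler stands in for the latch, and
the names die_pending, wakeup_pipe, die_handler(), init_wakeup() and
send_interruptible() are all made up for illustration.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the process latch: a self-pipe. */
static int wakeup_pipe[2] = {-1, -1};
static volatile sig_atomic_t die_pending = 0;

/* Signal handler: record the request and wake up any select() sleeper. */
static void
die_handler(int signo)
{
    int save_errno = errno;

    (void) signo;
    die_pending = 1;
    if (write(wakeup_pipe[1], "x", 1) < 0)
    {
        /* pipe full or broken: the flag is set, nothing more to do here */
    }
    errno = save_errno;
}

static int
set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

static int
init_wakeup(void)
{
    if (pipe(wakeup_pipe) < 0)
        return -1;
    if (set_nonblocking(wakeup_pipe[0]) < 0 ||
        set_nonblocking(wakeup_pipe[1]) < 0)
        return -1;
    return 0;
}

/*
 * Send as much of buf as possible on a non-blocking socket.  Returns the
 * number of bytes sent, which is short if a termination request arrived,
 * or -1 with errno set on a socket error.
 */
static ssize_t
send_interruptible(int sock, const char *buf, size_t len)
{
    size_t sent = 0;

    while (sent < len)
    {
        ssize_t n;
        fd_set  rfds, wfds;
        int     maxfd;

        if (die_pending)
            break;              /* stop instead of blocking */

        n = send(sock, buf + sent, len - sent, 0);
        if (n >= 0)
        {
            sent += (size_t) n;
            continue;
        }
        if (errno == EINTR)
            continue;
        if (errno != EAGAIN && errno != EWOULDBLOCK)
            return -1;

        /* Would block: sleep until the socket is writable or we are woken. */
        FD_ZERO(&rfds);
        FD_SET(wakeup_pipe[0], &rfds);
        FD_ZERO(&wfds);
        FD_SET(sock, &wfds);
        maxfd = (sock > wakeup_pipe[0]) ? sock : wakeup_pipe[0];
        if (select(maxfd + 1, &rfds, &wfds, NULL, NULL) < 0 && errno != EINTR)
            return -1;
    }
    return (ssize_t) sent;
}
```

Because the handler writes to the pipe, a signal that arrives between the
send() attempt and the select() still wakes the select() up, so the race
described above does not apply to this loop. The caller is expected to
have put the socket into non-blocking mode and installed die_handler()
for the relevant signal beforehand.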

Is it actually safe to process the die-interrupt where send() is called? 
ProcessInterrupts() does "ereport(FATAL, ...)", which will attempt to 
send a message to the client. If that happens in the middle of 
constructing some other message, that will violate the protocol.
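
To illustrate the concern, here is a toy model, not backend code. It
assumes one possible policy: if the termination request is handled while
a message is only partially written, close the connection without sending
anything further. The Conn struct, the "E:FATAL" string and all function
names are hypothetical, and the framing is deliberately simplified; it is
not the real frontend/backend protocol format.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical connection model; "wire" is what the client would see. */
typedef struct Conn
{
    char    wire[256];
    size_t  wire_len;
    bool    mid_message;    /* true while a message is partially written */
    bool    closed;
} Conn;

/* Append bytes to the wire, truncating if the toy buffer is full. */
static void
wire_put(Conn *c, const char *data, size_t len)
{
    if (c->wire_len + len > sizeof(c->wire))
        len = sizeof(c->wire) - c->wire_len;
    memcpy(c->wire + c->wire_len, data, len);
    c->wire_len += len;
}

/* Start a message: write its type byte and mark the stream as mid-message. */
static void
begin_message(Conn *c, char type)
{
    wire_put(c, &type, 1);
    c->mid_message = true;
}

/* Finish a message: the stream is at a message boundary again. */
static void
end_message(Conn *c)
{
    c->mid_message = false;
}

/*
 * Handle a termination request.  At a message boundary it is safe to send
 * a final error message.  In the middle of a message, any extra bytes
 * would be parsed by the client as part of that message, so just close.
 */
static void
terminate_connection(Conn *c)
{
    static const char fatal[] = "E:FATAL";

    if (!c->mid_message)
        wire_put(c, fatal, sizeof(fatal) - 1);
    c->closed = true;
}
```

With this model, terminating after begin_message() but before
end_message() leaves only the partial message on the wire, whereas
terminating at a message boundary appends the final error.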

>> 2. I think it would be reasonable to try to kill off the connection
>> without notifying the client if we're unable to send the data to the
>> client in a reasonable period of time.  But I'm unsure what "a
>> reasonable period of time" means.  This patch would basically do it
>> after no delay at all, which seems like it might be too aggressive.
>> However, I'm not sure.
>
> I think there's no such a reasonable time.

I agree it's pretty hard to define any reasonable timeout here. I think
it would be fine to just cut the connection; even if you don't block
while sending, you'll probably reach a CHECK_FOR_INTERRUPTS() somewhere
higher in the stack and kill the connection almost as abruptly anyway.
(You must not violate the protocol, however.)

- Heikki



