Re: Escaping from blocked send() reprised. - Mailing list pgsql-hackers
From | Kyotaro HORIGUCHI |
---|---|
Subject | Re: Escaping from blocked send() reprised. |
Date | |
Msg-id | 20140930.153539.49495386.horiguchi.kyotaro@lab.ntt.co.jp Whole thread Raw |
In response to | Re: Escaping from blocked send() reprised. (Heikki Linnakangas <hlinnakangas@vmware.com>) |
List | pgsql-hackers |
Thank you for reviewing. I'll look close to the patch tomorrow. > I must say this scares the heck out of me. The current code goes > through some trouble to not throw an error while in a recv() > send(). For example, you removed the DoingCommandRead check from > prepare_for_client_read(). There's an comment in postgres.c that says > this: > > > /* > > * (2) Allow asynchronous signals to be executed immediately > > * if they > > * come in while we are waiting for client input. (This must > > * be > > * conditional since we don't want, say, reads on behalf of > > * COPY FROM > > * STDIN doing the same thing.) > > */ > > DoingCommandRead = true; Hmm. Sorry. That's my fault that I skipped over the issues about "COPY FROM STDIN". > With the patch, we do allow asynchronous signals to be processed while > blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel > comfortable just changing it. (the comment is now wrong, of course) I don't see actual problem but I agree that the behavior should not be chenged. > This patch also enables processing query cancel signals while blocked, > not just SIGTERM. That's not good; we might be in the middle of > sending a message, and we cannot just error out of that or we might > violate the fe/be protocol. That's OK with a SIGTERM as you're > terminating the connection anyway, and we have the PqCommBusy > safeguard in place that prevents us from sending broken messages to > the client, but that's not good enough if we wanted to keep the > backend alive, as we won't be able to send anything to the client > anymore. Ok, since what I want is escaping from blocked send() only by SIGTERM, it needs another mechanism from current prepare_for_client_read(). > BTW, we've been talking about blocking in send(), but this patch also > let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's > probably a good thing; surely you have exactly the same issues with > that as with send(). But I didn't realize we had a problem with that > too. I see. (But it is mere a side-effect of my carelessness, as you know:) > In summary, this patch is not ready as it is, but I think we can fix > it. The key question is: is it safe to handle SIGTERM in the signal > handler, calling the exit-handlers and exiting the backend, when > blocked in a recv() or send()? It's a change in the pqcomm.c API; most > pqcomm.c functions have not thrown errors or processed interrupts > before. But looking at the callers, I think it's safe, and there isn't > actually any comments explicitly saying that pqcomm.c will never throw > errors. > > I propose the attached patch. It adds a new flag ImmediateDieOK, which > is a weaker form of ImmediateInterruptOK that only allows handling a > pending die-signal in the signal handler. > > Robert, others, do you see a problem with this? The patch seems excluding all problems menthioned in the message, I have no objection to it. > Over IM, Robert pointed out that it's not safe to jump out of a signal > handler with siglongjmp, when we're inside library calls, like in a > callback called by OpenSSL. But even with current master branch, > that's exactly what we do. In secure_raw_read(), we set > ImmediateInterruptOK = true, which means that any incoming signal will > be handled directly in the signal handler, which can mean > elog(ERROR). Should we be worried? OpenSSL might get confused if > control never returns to the SSL_read() or SSL_write() function that > called secure_raw_read(). IMHO, it will soon die even if OpenSSL is confused. It seems a bit brute that sudden cutoff occurs even when the socket is *not* blocked, but the backend will soon die and frontend will immediately get ECONNRESET (..hmm it is not seen in manpages of recv/read(2)) and should safely exit from OpenSSL. I cannot run this patch right now, but it seems to be no problem. regards, -- Kyotaro Horiguchi NTT Open Source Software Center
pgsql-hackers by date: