Re: Escaping from blocked send() reprised. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Escaping from blocked send() reprised.
Date
Msg-id 20140930.153539.49495386.horiguchi.kyotaro@lab.ntt.co.jp
Whole thread Raw
In response to Re: Escaping from blocked send() reprised.  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
Thank you for reviewing. I'll look close to the patch tomorrow.

> I must say this scares the heck out of me. The current code goes
> through some trouble to not throw an error while in a recv()
> send(). For example, you removed the DoingCommandRead check from
> prepare_for_client_read(). There's an comment in postgres.c that says
> this:
> 
> >         /*
> >          * (2) Allow asynchronous signals to be executed immediately
> >          * if they
> >          * come in while we are waiting for client input. (This must
> >          * be
> >          * conditional since we don't want, say, reads on behalf of
> >          * COPY FROM
> >          * STDIN doing the same thing.)
> >          */
> >         DoingCommandRead = true;

Hmm. Sorry. That's my fault that I skipped over the issues about
"COPY FROM STDIN".

> With the patch, we do allow asynchronous signals to be processed while
> blocked in COPY FROM STDIN. Maybe that's OK, but I don't feel
> comfortable just changing it. (the comment is now wrong, of course)

I don't see actual problem but I agree that the behavior should
not be chenged.

> This patch also enables processing query cancel signals while blocked,
> not just SIGTERM. That's not good; we might be in the middle of
> sending a message, and we cannot just error out of that or we might
> violate the fe/be protocol. That's OK with a SIGTERM as you're
> terminating the connection anyway, and we have the PqCommBusy
> safeguard in place that prevents us from sending broken messages to
> the client, but that's not good enough if we wanted to keep the
> backend alive, as we won't be able to send anything to the client
> anymore.

Ok, since what I want is escaping from blocked send() only by
SIGTERM, it needs another mechanism from current
prepare_for_client_read().

> BTW, we've been talking about blocking in send(), but this patch also
> let's a recv() in e.g. COPY FROM STDIN to be interrupted. That's
> probably a good thing; surely you have exactly the same issues with
> that as with send(). But I didn't realize we had a problem with that
> too.

I see. (But it is mere a side-effect of my carelessness, as you know:)

> In summary, this patch is not ready as it is, but I think we can fix
> it. The key question is: is it safe to handle SIGTERM in the signal
> handler, calling the exit-handlers and exiting the backend, when
> blocked in a recv() or send()? It's a change in the pqcomm.c API; most
> pqcomm.c functions have not thrown errors or processed interrupts
> before. But looking at the callers, I think it's safe, and there isn't
> actually any comments explicitly saying that pqcomm.c will never throw
> errors.
> 
> I propose the attached patch. It adds a new flag ImmediateDieOK, which
> is a weaker form of ImmediateInterruptOK that only allows handling a
> pending die-signal in the signal handler.
> 
> Robert, others, do you see a problem with this?

The patch seems excluding all problems menthioned in the message,
I have no objection to it.

> Over IM, Robert pointed out that it's not safe to jump out of a signal
> handler with siglongjmp, when we're inside library calls, like in a
> callback called by OpenSSL. But even with current master branch,
> that's exactly what we do. In secure_raw_read(), we set
> ImmediateInterruptOK = true, which means that any incoming signal will
> be handled directly in the signal handler, which can mean
> elog(ERROR). Should we be worried? OpenSSL might get confused if
> control never returns to the SSL_read() or SSL_write() function that
> called secure_raw_read().

IMHO, it will soon die even if OpenSSL is confused. It seems a
bit brute that sudden cutoff occurs even when the socket is *not*
blocked, but the backend will soon die and frontend will
immediately get ECONNRESET (..hmm it is not seen in manpages of
recv/read(2)) and should safely exit from OpenSSL.

I cannot run this patch right now, but it seems to be no problem.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




pgsql-hackers by date:

Previous
From: Kouhei Kaigai
Date:
Subject: Re: [v9.5] Custom Plan API
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Escaping from blocked send() reprised.