Re: Escaping from blocked send() reprised. - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Escaping from blocked send() reprised.
Date
Msg-id 20141002090250.GF7158@awork2.anarazel.de
Whole thread Raw
In response to Re: Escaping from blocked send() reprised.  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: Escaping from blocked send() reprised.
List pgsql-hackers
On 2014-10-02 17:47:39 +0900, Kyotaro HORIGUCHI wrote:
> Hello,
> 
> > > I propose the attached patch. It adds a new flag ImmediateDieOK, which is a
> > > weaker form of ImmediateInterruptOK that only allows handling a pending
> > > die-signal in the signal handler.
> > > 
> > > Robert, others, do you see a problem with this?
> > 
> > Per se I don't have a problem with it. There does exist the problem that
> > the user doesn't get a error message in more cases though. On the other
> > hand it's bad if any user can prevent the database from restarting.
> > 
> > > Over IM, Robert pointed out that it's not safe to jump out of a signal
> > > handler with siglongjmp, when we're inside library calls, like in a callback
> > > called by OpenSSL. But even with current master branch, that's exactly what
> > > we do. In secure_raw_read(), we set ImmediateInterruptOK = true, which means
> > > that any incoming signal will be handled directly in the signal handler,
> > > which can mean elog(ERROR). Should we be worried? OpenSSL might get confused
> > > if control never returns to the SSL_read() or SSL_write() function that
> > > called secure_raw_read().
> > 
> > But this is imo prohibitive. Yes, we're doing it for a long while. But
> > no, that's not ok. It actually prompoted me into prototyping the latch
> > thing (in some other thread). I don't think existing practice justifies
> > expanding it further.
> 
> I see, in that case, this approach seems basically
> applicable. But if I understand correctly, this patch seems not
> to return out of the openssl code even when latch was found to be
> set in secure_raw_write/read.

Correct. That's why I think it's the way forward. There's several
problems now where the inability to do real things while reading/writing
is a problem.

>  I tried setting errno = ECONNRESET
> and it went well but seems a bad deed.

Where and why did you do that?

> secure_raw_write(Port *port, const void *ptr, size_t len)
> {
>   n = send(port->sock, ptr, len, 0);
>   
>   if (!port->noblock && n < 0 && (errno == EWOULDBLOCK || errno == EAGAIN))
>   {
>     w = WaitLatchOrSocket(&MyProc->procLatch, ...
> 
>     if (w & WL_LATCH_SET)
>     {
>     ResetLatch(&MyProc->procLatch);
>         /*
>         * Force a return, so interrupts can be processed when not
>         * (possibly) underneath a ssl library.
>         */
>         errno = EINTR;
>         (return n;  // n is negative)
> 
> 
> my_sock_write(BIO *h, const char *buf, int size)
> {
>   res = secure_raw_write(((Port *) h->ptr), buf, size);
>   BIO_clear_retry_flags(h);
>   if (res <= 0)
>   {
>     if (errno == EINTR || errno == EWOULDBLOCK || errno == EAGAIN)
>     {
>       BIO_set_retry_write(h);

Hm, this seems, besides one comment, the code from the last patch in my
series. Do you have a particular question about it?

Greetings,

Andres Freund

-- Andres Freund                       http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training &
Services



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Replication identifiers, take 3
Next
From: Andres Freund
Date:
Subject: Re: Replication identifiers, take 3