Re: Hot Standby conflict resolution handling - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Hot Standby conflict resolution handling
Date
Msg-id 8463.1358435963@sss.pgh.pa.us
Whole thread Raw
In response to Re: Hot Standby conflict resolution handling  (Pavan Deolasee <pavan.deolasee@gmail.com>)
Responses Re: Hot Standby conflict resolution handling
List pgsql-hackers
Pavan Deolasee <pavan.deolasee@gmail.com> writes:
> On Thu, Jan 17, 2013 at 12:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> ISTM that if we dare not interrupt for fear of confusing OpenSSL, we
>> cannot safely attempt to send an error message to the client either;
>> but ereport(FATAL) will try exactly that.

> I thought since FATAL will force the backend to exit, we don't care much
> about corrupted OpenSSL state. I even thought that's why we raise ERROR to
> FATAL so that the backend can start in a clean state. But clearly I'm
> missing a point here because you don't think that way.

If we were to simply exit(1), leaving the kernel to close the client
socket, it'd be safe enough because control would never have returned to
OpenSSL.  But this code doesn't do that.  What we're looking at is that
we've interrupted OpenSSL at some arbitrary point, and now we're going
to make fresh calls to it to try to pump the FATAL error message out to
the client.  It seems fairly unlikely that that's safe.  I'm not sure
I credit Andres' worry of arbitrary code execution, but I do fear that
OpenSSL could get confused to the point of freezing up, or even more
likely that it would transmit garbage to the client, which rather
defeats the purpose.

Don't see a nice fix.  The COMMERROR approach (ie, don't try to send
anything to the client, only the log) is not nice at all since the
client would get the impression that the server crashed.  On the other
hand, anything else requires waiting till we get control back from
OpenSSL, which might be a long time, and meanwhile we're still holding
locks that prevent WAL recovery from proceeding.
        regards, tom lane



pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave
Next
From: Andres Freund
Date:
Subject: Re: Slave enters in recovery and promotes when WAL stream with master is cut + delay master/slave