Home > mailing lists

Re: Replication server timeout patch - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Replication server timeout patch
Date	March 30, 2011 04:25:09
Msg-id	4D92DAC5.20605@enterprisedb.com Whole thread Raw
In response to	Re: Replication server timeout patch (Fujii Masao <masao.fujii@gmail.com>)
Responses	Re: Replication server timeout patch
List	pgsql-hackers

Tree view

On 29.03.2011 07:55, Fujii Masao wrote:
> On Mon, Mar 28, 2011 at 7:49 PM, Heikki Linnakangas
> <heikki.linnakangas@enterprisedb.com>  wrote:
>>> pq_flush_if_writable() calls internal_flush() without using PG_TRY block.
>>> This seems unsafe because for example pgwin32_waitforsinglesocket()
>>> called by secure_write() can throw ERROR.
>>
>> Perhaps it's time to give up on the assumption that the socket is in
>> blocking mode except within those two functions. Attached patch adds the
>> pq_set_nonblocking() function from your patch, and adds calls to it before
>> all secure_read/write operations to put the socket in the right mode.
>> There's only a few of those operations.
>
> Sounds good.
>
> +        pq_set_nonblocking(false); /* XXX: Is this required? */
>
> No. Since secure_close and close_SSL don't use MyProcPort->sock and
> MyProcPort->noblock which can be changed in pq_set_nonblocking,
> I don't think that is required.

Ok, I took that out.

> +    pq_putmessage_noblock('d', msgbuf, 1 + sizeof(WalDataMessageHeader) + nbytes);
>
> Don't we need to check the return value of pq_putmessage_noblock? That
> can return EOF when trouble happens (for example the send system call fails).

No, pq_putmessage_noblock doesn't call send() because it enlarges the 
buffer to make sure the message fits, and it doesn't anything else that 
could fail else. I changed its return type to void, and added an 
Assert() to check that the pq_putmessage() call it does internally 
indeed doesn't fail.

>> Should we use COMMERROR instead of ERROR if we fail to put the socket in the
>> right mode?
>
> Maybe.

I made it COMMERROR. ERRORs are sent to the client, and you could get 
into infinite recursion if sending the ERROR requires setting the 
blocking mode again.

Committed with those changes. I also reworded the docs a bit.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

pgsql-hackers by date:

From: Fujii Masao
Date: 30 March 2011, 03:25:42
Subject: Re: Problem with streaming replication, backups, and recovery (9.0.x)

From: Tatsuo Ishii
Date: 30 March 2011, 04:53:57
Subject: Re: pg_last_xlog_receive_location()

Re: Replication server timeout patch - Mailing list pgsql-hackers

Previous

Next