Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown
Msg-id CAHGQGwGWMOc-hKTyjjswzYLykAsA2+t+xpcybi6c+DkN+5dA+A@mail.gmail.com
In response to Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown  (Amit kapila <amit.kapila@huawei.com>)
Responses Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown  (Amit kapila <amit.kapila@huawei.com>)
Re: [BUGS] BUG #7534: walreceiver takes long time to detect n/w breakdown  (Amit Kapila <amit.kapila@huawei.com>)
List pgsql-hackers
On Sat, Sep 15, 2012 at 4:26 PM, Amit kapila <amit.kapila@huawei.com> wrote:
> On Saturday, September 15, 2012 11:27 AM Fujii Masao wrote:
> On Fri, Sep 14, 2012 at 10:01 PM, Amit kapila <amit.kapila@huawei.com> wrote:
>>
>> On Thursday, September 13, 2012 10:57 PM Fujii Masao
>> On Thu, Sep 13, 2012 at 1:22 PM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>> On Wednesday, September 12, 2012 10:15 PM Fujii Masao
>>> On Wed, Sep 12, 2012 at 8:54 PM,  <amit.kapila@huawei.com> wrote:
>>>>>> The following bug has been logged on the website:
>
>>>>>  I would like to implement such a feature for walreceiver, but I am
>>>>> unsure whether to use the
>>>>>  same configuration parameter (replication_timeout) for walreceiver as for
>>>>> the master, or to introduce a new
>>>>>  configuration parameter (receiver_replication_timeout).
>>
>>>>I like the latter. I believe some users want to set different
>>>>timeout values,
>>>>for example, in the case where the master and standby servers are placed in
>>>>the same room, but a cascaded standby is placed on another continent.
>>
>>> Thank you for your suggestion. I have implemented a separate timeout parameter for walreceiver, as you suggested.
>>> The main changes are:
>>> 1. Introduce a new configuration parameter wal_receiver_replication_timeout for walreceiver.
>>> 2. In function WalReceiverMain(), if there is no communication for wal_receiver_replication_timeout, exit the walreceiver.
>>>     This is the same as the walsender functionality.
>>
>>> As this is a feature, I am submitting the attached patch to the upcoming CommitFest.
>>
>>> Suggestions/Comments?
>
>> You also need to change walsender so that it periodically sends a heartbeat
>> message, like walreceiver does every wal_receiver_status_interval. Otherwise,
>> walreceiver will wrongly detect a timeout whenever there is no traffic on the
>> master.
>
> Won't the current keepalive message from walsender suffice for that need?

No. Though the keepalive interval should be smaller than the timeout,
IIRC there is no way to specify the keepalive interval now.
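
To illustrate why the interval matters, here is a minimal sketch. It is not
code from the patch or from PostgreSQL: every function and parameter name
below is hypothetical, times are plain millisecond counters, and delivery is
assumed to be instant and lossless. It only shows that an idle but healthy
link trips the receiver-side timeout unless the sender's keepalive interval
is shorter than that timeout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names throughout; all times are in milliseconds. */

/* Receiver side: true if nothing has arrived for at least timeout_ms.
 * A timeout of 0 (or less) disables the check. */
static bool
receiver_timed_out(int64_t now_ms, int64_t last_recv_ms, int64_t timeout_ms)
{
    if (timeout_ms <= 0)
        return false;
    return now_ms - last_recv_ms >= timeout_ms;
}

/* Sender side: true if nothing has been sent for keepalive_interval_ms,
 * so a keepalive message is due. */
static bool
sender_keepalive_due(int64_t now_ms, int64_t last_send_ms,
                     int64_t keepalive_interval_ms)
{
    return now_ms - last_send_ms >= keepalive_interval_ms;
}

/* Simulate an idle link (no WAL traffic) for duration_ms, one tick per
 * millisecond. Returns true if the receiver declares a timeout even
 * though the connection is healthy. */
static bool
idle_link_times_out(int64_t duration_ms, int64_t keepalive_interval_ms,
                    int64_t timeout_ms)
{
    int64_t last_send = 0;
    int64_t last_recv = 0;

    for (int64_t now = 1; now <= duration_ms; now++)
    {
        if (sender_keepalive_due(now, last_send, keepalive_interval_ms))
        {
            last_send = now;
            last_recv = now;    /* assumption: instant, lossless delivery */
        }
        if (receiver_timed_out(now, last_recv, timeout_ms))
            return true;
    }
    return false;
}
```

With a 30 second timeout, idle_link_times_out(60000, 10000, 30000) reports
no timeout, while idle_link_times_out(60000, 40000, 30000) reports a false
one. That is the case the receiver cannot rule out as long as the keepalive
interval is not configurable.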

Regards,

-- 
Fujii Masao


