Re: Replication server timeout patch - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Replication server timeout patch
Date
Msg-id AANLkTin0YGmAo5cFfEi5v9WX5rGshpZuCZdufeqxoNmz@mail.gmail.com
In response to Re: Replication server timeout patch  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Replication server timeout patch  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Fri, Feb 18, 2011 at 7:55 AM, Josh Berkus <josh@agliodbs.com> wrote:
>> So, in summary, the position is that we have a timeout, but that timeout
>> doesn't work in all cases. But it does work in some, so that seems
>> enough for me to say "let's commit". Not committing gives us nothing at
>> all, which is as much use as a chocolate teapot.
>
> Can someone summarize the cases where it does and doesn't work?
> There's been a longish gap in this thread.

The timeout doesn't work when walsender gets blocked while sending WAL
because the send buffer has filled up, I'm afraid. IOW, it doesn't work
when the standby becomes unresponsive while WAL is being generated
continuously on the master. Since walsender keeps trying to send WAL
while the standby is unresponsive, the send buffer fills up and the
blocking send function (e.g., pq_flush) blocks the walsender.

OTOH, if the standby becomes unresponsive while there is no workload
generating WAL, the timeout would work.
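
The blocking case above can be reproduced outside PostgreSQL with a
plain socket pair. This is only a rough sketch, not walsender code: the
function name standby_stalls_sender and its parameters are made up for
illustration, and a local socketpair stands in for the replication
connection. One end plays the unresponsive standby and never reads; the
other keeps sending until the send buffer is full, after which a further
send cannot make progress.

```python
import socket


def standby_stalls_sender(send_timeout=0.2, chunk=b"\0" * 8192,
                          max_bytes=256 * 1024 * 1024):
    """Fill the send buffer toward a peer that never reads.

    Returns (queued_bytes, stalled), where stalled is True if one more
    send could not complete within send_timeout seconds.
    Hypothetical helper for illustration only.
    """
    sender, standby = socket.socketpair()
    try:
        # Phase 1: queue data without blocking until the kernel refuses
        # to accept more, i.e. the send buffer is full.
        sender.setblocking(False)
        queued = 0
        while queued < max_bytes:  # safety cap, normally never reached
            try:
                queued += sender.send(chunk)
            except BlockingIOError:
                break

        # Phase 2: try one more send. With a plain blocking socket this
        # call would hang for as long as the peer stays unresponsive;
        # a socket-level timeout is used here only so the sketch returns.
        sender.settimeout(send_timeout)
        try:
            sender.send(chunk)
            stalled = False
        except socket.timeout:
            stalled = True
        return queued, stalled
    finally:
        sender.close()
        standby.close()


if __name__ == "__main__":
    queued, stalled = standby_stalls_sender()
    print("bytes queued before the buffer filled:", queued)
    print("next send stalled:", stalled)
```

Any timeout check that runs only between sends never gets a chance to
fire once the sender is stuck inside the second phase, which is the
situation described for walsender above.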

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

