Re: loss of transactions in streaming replication - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: loss of transactions in streaming replication
Msg-id CAHGQGwEQ9qq8Rx83RAfVCtpiAM5Uguf_KHmcMzCwzLtzvxm3Uw@mail.gmail.com
In response to Re: loss of transactions in streaming replication  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: loss of transactions in streaming replication
List pgsql-hackers
On Thu, Oct 13, 2011 at 10:08 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Wed, Oct 12, 2011 at 10:29 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Wed, Oct 12, 2011 at 5:45 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
>>> In 9.2dev and 9.1, when walreceiver detects an error while sending data to
>>> the WAL stream, it always emits ERROR even if there is data available in the
>>> receive buffer. This might lead to loss of transactions because the
>>> remaining data is never received by walreceiver :(
>>
>> Won't it just reconnect?
>
> Yes, if the master is running normally. OTOH, if the master is not running (i.e.,
> the failover case), the standby can never receive the data that it failed to
> receive.
>
> I found this issue when I shut down the master. When the master shuts down,
> it sends the shutdown checkpoint record, but I found that the standby failed
> to receive it.

Patch attached.

The patch changes walreceiver so that it does not emit ERROR immediately when
it fails to send data to the WAL stream. Instead, it emits ERROR only after all
available data has been received and flushed to disk.
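
To illustrate the idea only (this is not the attached patch), here is a
minimal sketch of the "remember the send failure, drain the receive buffer,
then report" pattern. Everything in it is a hypothetical stand-in: FakeConn,
fake_receive(), fake_send_reply() and drain_then_report() do not exist in
PostgreSQL, and real walreceiver code would use libpq and ereport() instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the connection to the master. */
typedef struct FakeConn
{
    const char **pending;     /* WAL chunks already in the receive buffer */
    size_t       npending;    /* number of buffered chunks */
    size_t       next;        /* index of the next chunk to hand out */
    bool         send_fails;  /* true once the master has gone away */
} FakeConn;

/* Hypothetical: return the next buffered chunk, or NULL when none is left. */
const char *fake_receive(FakeConn *conn)
{
    if (conn->next < conn->npending)
        return conn->pending[conn->next++];
    return NULL;
}

/* Hypothetical: sending a reply fails once the master is gone. */
bool fake_send_reply(const FakeConn *conn)
{
    return !conn->send_fails;
}

/*
 * Consume every buffered chunk before acting on a send failure.
 * Returns the number of chunks consumed ("flushed" in this toy model) and
 * sets *send_failed so the caller can raise the error afterwards.
 */
size_t drain_then_report(FakeConn *conn, bool *send_failed)
{
    size_t flushed = 0;

    assert(conn != NULL && send_failed != NULL);
    *send_failed = false;

    while (fake_receive(conn) != NULL)
    {
        flushed++;                 /* pretend the chunk was written and flushed */
        if (!fake_send_reply(conn))
            *send_failed = true;   /* remember the failure, do not bail out yet */
    }
    return flushed;
}
```

With two buffered chunks and a failing send, drain_then_report() consumes both
chunks and sets *send_failed; the caller would emit the error only at that
point, so a record such as the shutdown checkpoint is not dropped.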

If the patch is OK, it should be backported to v9.1.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
