Re: streaming replication breaks horribly if master crashes - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: streaming replication breaks horribly if master crashes
Date
Msg-id AANLkTil01eBZVtqOWQqp2ZjAd1-JpY5l9PW3Lwn5P96o@mail.gmail.com
In response to Re: streaming replication breaks horribly if master crashes  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Jun 16, 2010 at 22:26, Robert Haas <robertmhaas@gmail.com> wrote:
>>> and this just
>>> makes it more likely.  After the most recent crash, the master thought
>>> pg_current_xlog_location() was 1/86CD4000; the slave thought
>>> pg_last_xlog_receive_location() was 1/8733C000.  After reconnecting to
>>> the master, the slave then thought that
>>> pg_last_xlog_receive_location() was 1/87000000.
>>
>> So, *in this case*, detecting out-of-sequence xlogs (and PANICing) would
>> have actually prevented the slave from being corrupted.
>>
>> My question, though, is detecting out-of-sequence xlogs *enough*?  Are
>> there any crash conditions on the master which would cause the master to
>> reuse the same locations for different records, for example?  I don't
>> think so, but I'd like to be certain.
>
> The real problem here is that we're sending records to the slave which
> might cease to exist on the master if it unexpectedly reboots.  I
> believe that what we need to do is make sure that the master only
> sends WAL it has already fsync'd (Tom suggested on another thread that
> this might be necessary, and I think it's now clear that it is 100%
> necessary).  But I'm not sure how this will play with fsync=off - if
> we never fsync, then we can't ever really send any WAL without risking

Well, given how late we are in the cycle, if we can't think of an easy
fix we can just prevent streaming replication with fsync=off for now,
and then design a "proper fix" for 9.1.
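
To make the two rules discussed above concrete (only send WAL that has
been fsync'd, and refuse to stream when fsync=off), here is a rough
sketch in Python. It is not PostgreSQL code: parse_lsn, sendable_upto
and standby_is_ahead are hypothetical names, and combining the two
halves of a location as (high << 32) | low is an assumption that is
only meant to preserve ordering, not to compute byte distances.

```python
def parse_lsn(text):
    """Turn an 'X/Y' WAL location (two hex fields) into one integer.

    Assumption: the combined value is used only to compare positions.
    """
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


def sendable_upto(write_ptr, flush_ptr, fsync_enabled):
    """Return the highest WAL position that is safe to stream.

    Hypothetical policy: refuse to stream when fsync is off, and
    otherwise never send past what has been flushed to disk.
    """
    if not fsync_enabled:
        raise RuntimeError("streaming replication refused: fsync is off")
    return min(write_ptr, flush_ptr)


def standby_is_ahead(master_lsn, standby_lsn):
    """True if the standby has received WAL beyond the master's position."""
    return parse_lsn(standby_lsn) > parse_lsn(master_lsn)
```

With the locations quoted above, standby_is_ahead("1/86CD4000",
"1/8733C000") is true, which is the out-of-sequence situation. Capping
what gets sent at the flushed position is meant to keep the standby
from receiving records the master could lose in a crash.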


--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

