Thread: Re: streaming replication breaks horribly if master crashes

From: "Kevin Grittner"
Date:
Robert Haas <robertmhaas@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> Robert Haas <robertmhaas@gmail.com> wrote:
>>> So, obviously at this point my slave database is corrupted
>>> beyond repair due to nothing more than an unexpected crash on
>>> the master.
>>
>> Certainly that's true for resuming replication.  From your
>> description it sounds as though the slave would be usable for
>> purposes of taking over for an unrecoverable master.  Or am I
>> misunderstanding?
> 
> It depends on what you mean.  If you can prevent the slave from
> ever reconnecting to the master, then it's still safe to promote
> it.

Yeah, that's what I meant.

> But if the master comes up and starts generating WAL again, and
> the slave ever sees any of that WAL (either via SR or via the
> archive) then you're toast.

Well, if it *applies* what it sees, yes.  Effectively you've got
transactions from two alternative timelines applied in the same
database, which is not going to work.  At a minimum we need some
way to reliably detect that the incoming WAL stream is starting
before some applied WAL record and isn't a match.
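
A rough sketch of that kind of check, in Python rather than anything
resembling the actual server code.  Everything here is hypothetical:
WalRecord, Standby, DivergedStream, the integer LSNs, and the use of a
CRC over the payload as the "is it a match" test are assumptions made
purely for illustration.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class WalRecord:
    """Hypothetical, heavily simplified WAL record."""
    lsn: int          # position in the stream (an assumption: plain integers)
    payload: bytes    # record contents

    @property
    def crc(self) -> int:
        # Stand-in for whatever identity check a real system would use.
        return zlib.crc32(self.payload)


class DivergedStream(Exception):
    """Incoming WAL overlaps applied WAL but does not match it."""


class Standby:
    def __init__(self) -> None:
        self.applied = {}    # lsn -> crc of each record already applied
        self.last_lsn = -1   # highest LSN applied so far

    def receive(self, records) -> None:
        """Apply records in LSN order, refusing a mismatched overlap."""
        # Pass 1: validate every record that falls at or before the point
        # we have already applied to. Nothing is applied during this pass.
        for rec in records:
            if rec.lsn <= self.last_lsn:
                if self.applied.get(rec.lsn) != rec.crc:
                    raise DivergedStream(
                        f"record at LSN {rec.lsn} differs from applied WAL"
                    )
        # Pass 2: the overlap matched, so apply only the new records.
        for rec in records:
            if rec.lsn > self.last_lsn:
                self.applied[rec.lsn] = rec.crc
                self.last_lsn = rec.lsn
```

With this, a stream that restarts at or before the last applied LSN is
accepted only if every overlapping record matches what was already
applied; otherwise nothing from it is applied.
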
-Kevin