On Tue, 2010-09-07 at 11:17 -0400, Tom Lane wrote:
> Markus Wanner <markus@bluegap.ch> writes:
> > On 09/07/2010 04:15 PM, Robert Haas wrote:
> >> In theory, that's true, but if we do that, then there's an even bigger
> >> problem: the slave might have replayed WAL ahead of the master
> >> location; therefore the slave is now corrupt and a new base backup
> >> must be taken.
>
> > The slave isn't corrupt. It would suffice to "late abort" committed
> > transactions the master doesn't know about.
>
> Oh yes it is. If the slave replays WAL that didn't happen on the
> master, it might for instance have heap tuples in TID slots that are
> empty on the master, or index pages laid out differently from the
> master. Trying to apply additional WAL from the master will fail badly.
>
> We can *not* allow the slave to replay WAL ahead of what is known
> committed to disk on the master. The only way to make that safe
> is the compare-notes-and-ship-WAL-back approach that Robert mentioned.
>
> If you feel that decoupling WAL application is absolutely essential
> to have a credible feature, then you'd better bite the bullet and
> start working on the ship-WAL-back code.
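
For anyone following along, the constraint Tom states above (the slave must
not replay WAL past what is known flushed to disk on the master) can be
sketched as a cap on the replay position. This is only a toy illustration,
not PostgreSQL code: the Standby class, its field names, and the use of
plain integers for WAL positions are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Standby:
    """Toy model of a standby; WAL positions are plain integers (assumption)."""
    received_lsn: int = 0   # WAL received from the master so far
    replayed_lsn: int = 0   # WAL already applied on the standby

    def replay_limit(self, master_flushed_lsn: int) -> int:
        # Never replay beyond what the master reports as flushed to disk,
        # and never beyond what has actually been received.
        return min(self.received_lsn, master_flushed_lsn)

    def replay(self, master_flushed_lsn: int) -> int:
        limit = self.replay_limit(master_flushed_lsn)
        # Replay only moves forward; an older flush report changes nothing.
        if limit > self.replayed_lsn:
            self.replayed_lsn = limit
        return self.replayed_lsn


if __name__ == "__main__":
    standby = Standby(received_lsn=500)
    # The master has flushed only up to 300, so replay stops there even
    # though the standby already holds WAL up to 500.
    print(standby.replay(master_flushed_lsn=300))
```

Under this model the standby can hold WAL it has not yet applied, which is
the decoupling being discussed, while never applying WAL the master could
lose in a crash.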

Why not just failover?

--
Simon Riggs  www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Training and Services