On Wed, Sep 8, 2010 at 10:22 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On Wed, 2010-09-08 at 09:50 -0400, Robert Haas wrote:
>
>> So that means we have to make sure that none of the effects of a
>> transaction can be seen until WAL is fsync'd on the master AND the
>> slave has acked.
>
> Yes, that's right. And I like your example; one for the docs.
>
> There is a slight complexity there: An application might connect to the
> standby and see the changes made by the transaction, even though the
> master has not yet been notified, but will be in a moment. I don't see
> that as an issue though, but worth mentioning cos it's just the
> "Byzantine Generals" problem.

I think that's OK too, because there's no way we can guarantee that
the transaction becomes visible exactly simultaneously on both nodes.
What we do need to guarantee is that it is known committed on both
nodes before it becomes visible on either, so that even if there is a
crash or failover it can't uncommit itself. So the order of events
must be:

- fsync WAL on master
- send WAL to slave
- wait for ack from slave
- allow transaction's effects to become visible on master
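
To make that ordering concrete, here is a toy Python sketch. It is not
PostgreSQL code: the Master and Standby classes, their methods, and the
events list are all invented for illustration, and the fsync and the
network hop are only simulated.

```python
class Standby:
    """Hypothetical standby that just records the WAL it is sent."""

    def __init__(self):
        self.received = []

    def receive(self, record):
        self.received.append(record)
        return True  # stands in for the ack sent back to the master


class Master:
    """Hypothetical master; 'events' logs the order in which steps ran."""

    def __init__(self, standby):
        self.standby = standby
        self.wal = []
        self.visible = set()
        self.events = []

    def commit(self, xid):
        record = ("commit", xid)

        # 1. Write the commit record locally (the fsync is simulated).
        self.wal.append(record)
        self.events.append("fsync WAL on master")

        # 2. Ship the record to the standby.
        self.events.append("send WAL to slave")
        acked = self.standby.receive(record)

        # 3. Block until the standby acknowledges; without an ack the
        #    transaction must stay invisible.
        if not acked:
            raise RuntimeError("no ack from slave")
        self.events.append("ack from slave")

        # 4. Only now may other sessions see the transaction's effects.
        self.visible.add(xid)
        self.events.append("visible on master")
```

The point of the sketch is only that visible.add() is the last step: by
the time any session can see the transaction, both nodes already hold
its commit record.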

If the slave is only guaranteeing *receipt* of the WAL rather than
fsync or replay of the WAL, then there is still a possibility of a
disappearing transaction if the master and standby fail simultaneously
AND a failover then occurs. So don't pick that mode if a disappearing
transaction will result in someone dying or your $20B company going
bankrupt or ...
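
A rough way to see the difference between the modes, again as a toy
Python sketch rather than anything real: the mode names "receipt",
"fsync" and "replay" and the function below are made up for the example,
and it assumes that only fsync'd WAL survives a crash and that "replay"
implies the WAL was fsync'd first.

```python
def survives_double_failure(ack_mode):
    """Return True if a transaction acked under ack_mode is still present
    on the standby after both nodes crash and the standby is promoted."""
    if ack_mode not in ("receipt", "fsync", "replay"):
        raise ValueError("unknown ack mode: %r" % (ack_mode,))

    record = ("commit", 42)
    standby_memory = []  # lost on crash
    standby_disk = []    # survives a crash (assumption of this sketch)

    # The standby receives the WAL record over the network.
    standby_memory.append(record)

    # In the stronger modes it also fsyncs before sending the ack.
    if ack_mode in ("fsync", "replay"):
        standby_disk.append(record)

    # The ack goes out here and the master makes the transaction visible.

    # Both nodes fail at once: anything held only in memory is gone.
    standby_memory.clear()

    # Failover: the promoted standby has only what reached its disk.
    return record in standby_disk
```

With these assumptions, survives_double_failure("receipt") is False
while the "fsync" and "replay" cases are True, which is the
disappearing-transaction case described above.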

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company