On Mon, Dec 26, 2011 at 5:08 AM, Alexander Björnhagen
<alex.bjornhagen@gmail.com> wrote:
> I’m new here so maybe someone else already has this in the works ?
No, as far as I know.
> And so on ... any comments are welcome :)
Basically I like this whole idea, but I'd like to know why you think
this functionality is required.
When is the replication mode switched from "standalone" to "sync"?
Does that happen as soon as a sync standby appears, or only once it has
caught up with the master? The former might block transactions for a
long time until the standby has caught up with the master, even though
synchronous_standalone_master is enabled and the user wants to avoid
such downtime.
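
To make the two options concrete, here is a rough sketch in Python (not
PostgreSQL code). The names SwitchPolicy and should_switch_to_sync are
made up for illustration, and WAL positions are simplified to plain
integers:

```python
from enum import Enum


class SwitchPolicy(Enum):
    # Hypothetical policy names, used only for this illustration.
    ON_CONNECT = "on_connect"      # switch as soon as a sync standby appears
    ON_CAUGHT_UP = "on_caught_up"  # switch only once the standby has caught up


def should_switch_to_sync(policy, standby_connected, standby_lsn, master_lsn):
    """Return True if the master should go from "standalone" back to "sync".

    LSNs are modelled as integers here, which is a simplification.
    """
    if not standby_connected:
        return False
    if policy is SwitchPolicy.ON_CONNECT:
        return True
    # ON_CAUGHT_UP: stay standalone until the standby has all of the WAL.
    return standby_lsn >= master_lsn


# A standby that has just reconnected and is still far behind the master.
print(should_switch_to_sync(SwitchPolicy.ON_CONNECT, True, 100, 5000))
print(should_switch_to_sync(SwitchPolicy.ON_CAUGHT_UP, True, 100, 5000))
```

With ON_CONNECT the master goes back to "sync" while the standby is
still behind, so commits would wait for the catch-up; with ON_CAUGHT_UP
it stays standalone until then.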
When standalone master is enabled, you might lose some committed
transactions at failover, as follows:
1. While synchronous replication is running normally, the replication
connection is closed because of a network outage.
2. The master works standalone because
synchronous_standalone_master=on, and some new transactions are
committed though their WAL records are not replicated to the standby.
3. The master crashes for some reason; the clusterware detects it and
triggers a failover.
4. The standby, which doesn't have the recently committed transactions,
becomes the master at failover...
Is this scenario acceptable?
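
The scenario can be reproduced with a toy model. This is only a sketch
in Python: the Cluster class and the transaction names T1..T4 are
invented for illustration, and a commit that would wait for the standby
is simply modelled as "not acknowledged":

```python
class Cluster:
    """Toy model of one master and one sync standby (illustration only)."""

    def __init__(self, standalone_master):
        self.standalone_master = standalone_master
        self.connected = True
        self.master = []   # transactions durable on the master
        self.standby = []  # transactions replicated to the standby

    def commit(self, txn):
        """Return True if the commit is acknowledged to the client."""
        if self.connected:
            self.master.append(txn)
            self.standby.append(txn)
            return True
        if self.standalone_master:
            # Step 2: the master commits without replicating.
            self.master.append(txn)
            return True
        # Plain sync rep: the commit would wait for the standby.
        return False


def lost_after_failover(standalone_master):
    cluster = Cluster(standalone_master)
    acked = [t for t in ("T1", "T2") if cluster.commit(t)]
    cluster.connected = False                                # step 1
    acked += [t for t in ("T3", "T4") if cluster.commit(t)]  # step 2
    # Steps 3-4: the master crashes and the standby is promoted, so
    # anything acknowledged but not on the standby is lost.
    return [t for t in acked if t not in cluster.standby]


print(lost_after_failover(True))   # ['T3', 'T4']
print(lost_after_failover(False))  # []
```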
To avoid such a loss of transactions, I'm thinking of introducing a new
GUC parameter specifying the shell command which is executed when the
replication mode is switched from "sync" to "standalone". If we set it
to something like a STONITH command, we can forcibly shut down the
standby before the master resumes transactions, and avoid a failover
to the obsolete standby when the master crashes.
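
Roughly, the control flow I have in mind looks like the following
sketch. It is Python rather than server code, and the function name
switch_to_standalone, the timeout, and the rule that a failed command
keeps the master blocked are all my assumptions, not a settled design:

```python
import subprocess


def switch_to_standalone(fence_command, timeout=30):
    """Run the configured command before leaving "sync" mode.

    Returns True if the master may resume commits in "standalone" mode.
    Assumption: a non-zero exit status or a timeout means the standby
    could not be fenced, so the master keeps waiting.
    """
    if not fence_command:
        # Assumption: no command configured means no fencing is wanted.
        return True
    try:
        result = subprocess.run(fence_command, shell=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


# "exit 0" and "exit 1" stand in for a real STONITH command here.
print(switch_to_standalone("exit 0"))  # True
print(switch_to_standalone("exit 1"))  # False
```

Because the standby is shut down before any unreplicated commit is
acknowledged, the clusterware has no stale standby left to promote.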
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center