Fujii Masao <masao.fujii@gmail.com> writes:
> On the other hand, the primary postgres might *not* restart automatically.
> So, it's difficult for clusterware to choose whether to do failover when it
> detects the death of the primary postgres, I think.
I think the accepted way to handle this kind of situation is called STONITH --
"Shoot The Other Node In The Head".
When the cluster software decides to initiate failover, it needs some way to
ensure that the first node *cannot* come back up. That could mean cutting the
power to it at the PDU, disabling its network connection at the switch, or
any of various other options.
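
To make the ordering concrete, here is a minimal Python sketch of "fence first,
then promote". It is not the API of any real clusterware: fence_primary and
promote_standby are hypothetical callables that you would supply, e.g. one that
talks to your PDU or switch and one that promotes the standby. The only point
is that promotion never happens unless fencing has been confirmed.

```python
from typing import Callable


def failover(fence_primary: Callable[[], bool],
             promote_standby: Callable[[], None]) -> bool:
    """Promote the standby only after the old primary is confirmed fenced.

    fence_primary (hypothetical) must return True only when the old primary
    is known to be unable to come back, e.g. its power is off at the PDU.
    promote_standby (hypothetical) performs the actual promotion.
    """
    if not fence_primary():
        # Fencing was not confirmed: do nothing rather than risk
        # ending up with two primaries.
        return False
    promote_standby()
    return True
```

If fencing cannot be confirmed, the sketch refuses to fail over at all, on the
assumption that a stalled failover is easier to recover from than two nodes
that both believe they are the primary.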
Gregory Stark http://mit.edu/~gsstark/resume.pdf