On Wed, 2006-10-11 at 16:12 -0500, Jim C. Nasby wrote:
> On Wed, Oct 11, 2006 at 10:28:44AM -0400, Andrew Sullivan wrote:
> > On Thu, Oct 05, 2006 at 08:43:21PM -0500, Jim Nasby wrote:
> > > Isn't it entirely possible that if the master gets trashed it would
> > > start sending garbage to the Slony slave as well?
> >
> > Well, maybe, but unlikely. What happens in a shared-disc failover is
> > that the second machine re-mounts the same partition as the old
> > machine had open. The risk is the case where your to-be-removed
> > machine hasn't actually stopped writing on the partition yet, but
> > your failover software thinks it's dead, and can fail over. Two
> > processes have the same Postgres data and WAL files mounted at the
> > same time, and blammo. As nearly as I can tell, it takes
> > approximately zero time for this arrangement to make such a mess that
> > you're not committing any transactions. Slony will only get the data
> > on COMMIT, so the risk is very small.
>
> Hrm... I guess it depends on how quickly the Slony master would stop
> processing if it was talking to a shared-disk that had become corrupt
> from another postmaster.

That doesn't depend on Slony; it depends on Postgres. If transactions
are committing on the master, Slony will replicate them. You could have
a situation where your HA failover trashes some of your database, but the
database still starts up. It then starts accepting and replicating
transactions before the corruption is discovered.
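
To make that window concrete, here is a toy sketch in Python. It is not
Slony or Postgres code; every name in it (Master, replicate, the
corrupted/detected flags) is made up for illustration, and it assumes
the replicator simply forwards whatever has committed.

```python
# Toy model only: none of these names come from Slony or Postgres.
from dataclasses import dataclass, field


@dataclass
class Master:
    corrupted: bool = False   # damage exists on disk
    detected: bool = False    # the database has noticed the damage
    committed: list = field(default_factory=list)

    def commit(self, row):
        # Once corruption is detected, nothing commits any more.
        if self.detected:
            raise RuntimeError("database is down, nothing commits")
        # Undetected damage silently taints what gets committed.
        self.committed.append(("garbage" if self.corrupted else "ok", row))


def replicate(master, slave, position):
    # Forward every committed row not yet sent. The replicator cannot
    # judge whether a committed row's contents are sane.
    slave.extend(master.committed[position:])
    return len(master.committed)


master, slave = Master(), []
master.commit("row 1")
pos = replicate(master, slave, 0)

master.corrupted = True       # botched failover trashes some data
master.commit("row 2")        # database is still up and still committing
pos = replicate(master, slave, pos)

master.detected = True        # corruption is finally discovered
print(slave)                  # [('ok', 'row 1'), ('garbage', 'row 2')]
```

Anything committed between "corrupted" and "detected" reaches the slave.
How long that window is depends on how soon Postgres notices, not on the
replicator.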
Brad.