Re: postgres in HA constellation - Mailing list pgsql-admin

From	Jim C. Nasby
Subject	Re: postgres in HA constellation
Date	October 11, 2006 21:12:54
Msg-id	20061011211246.GV13487@nasby.net
In response to	Re: postgres in HA constellation (Andrew Sullivan <ajs@crankycanuck.ca>)
Responses	Re: postgres in HA constellation (Brad Nicholson <bnichols@ca.afilias.info>)
List	pgsql-admin

On Wed, Oct 11, 2006 at 10:28:44AM -0400, Andrew Sullivan wrote:
> On Thu, Oct 05, 2006 at 08:43:21PM -0500, Jim Nasby wrote:
> > Isn't it entirely possible that if the master gets trashed it would
> > start sending garbage to the Slony slave as well?
>
> Well, maybe, but unlikely.  What happens in a shared-disc failover is
> that the second machine re-mounts the same partition as the old
> machine had open.  The risk is the case where your to-be-removed
> machine hasn't actually stopped writing on the partition yet, but
> your failover software thinks it's dead, and can fail over.  Two
> processes have the same Postgres data and WAL files mounted at the
> same time, and blammo.  As nearly as I can tell, it takes
> approximately zero time for this arrangement to make such a mess that
> you're not committing any transactions.  Slony will only get the data
> on COMMIT, so the risk is very small.

Hrm... I guess it depends on how quickly the Slony master would stop
processing if it was talking to a shared disk that had been corrupted
by another postmaster.

> > I think PITR would be a much better option to protect against this,
> > since you could probably recover up to the exact point of failover.
>
> That oughta work too, except that your remounted WAL gets corrupted
> under the imagined scenario, and then you copy the next updates to
> the WAL.  So you have to save all the incremental copies of the WAL
> you make, so that you don't have a garbage file to read.
>
> As I said, I don't think that it's a bad idea to use this sort of
> trick.  I just think it's a poor single line of defence, because when
> it fails, it fails hard.
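
The point above about saving every incremental WAL copy, rather than
overwriting the previous one, could look roughly like the sketch below.
It is only an illustration under assumptions: the function name
save_wal_copy, the ".NNNNNN" suffix scheme, and the directory layout are
hypothetical and not from this thread. The idea is that a later copy of
a segment, which may already be garbage, never replaces an earlier one.

```python
import os
import shutil


def save_wal_copy(src_path, archive_dir):
    """Store a new copy of a WAL segment without overwriting older copies.

    Hypothetical helper: each call writes <segment>.<sequence number>,
    so every earlier copy stays available for recovery.
    """
    base = os.path.basename(src_path)
    seq = 0
    while True:
        dest = os.path.join(archive_dir, "%s.%06d" % (base, seq))
        try:
            # O_EXCL makes creation fail if the name already exists,
            # so an existing copy can never be clobbered.
            fd = os.open(dest, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o600)
            break
        except FileExistsError:
            seq += 1
    with os.fdopen(fd, "wb") as out, open(src_path, "rb") as src:
        shutil.copyfileobj(src, out)
        out.flush()
        os.fsync(out.fileno())  # make sure the copy is on disk before returning
    return dest
```

Which of the saved copies is the last good one still has to be decided
at recovery time; the sketch only guarantees that the candidates exist.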

Yeah, STONITH is *critical* for shared-disk.
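
To make the ordering concrete, here is a minimal sketch of "fence
first, then take over". It is not how any particular cluster manager
does it. The names take_over and FencingFailed are hypothetical, and
the actual fence and takeover commands are left to the caller, since
they depend entirely on the hardware and failover software in use.

```python
import subprocess


class FencingFailed(Exception):
    """Raised when the old node could not be confirmed powered off."""


def take_over(fence_cmd, takeover_cmds, run=subprocess.run):
    """Fence the old node, and only then run the takeover steps.

    fence_cmd and takeover_cmds are argument lists supplied by the
    caller (assumed here: a fence command that exits 0 only once the
    old node is really off). `run` is injectable so the ordering can
    be tested without touching real hardware.
    """
    result = run(fence_cmd)
    if result.returncode != 0:
        # If the other node cannot be confirmed dead, do NOT mount the
        # shared partition: two postmasters on one data directory is
        # exactly the failure described above.
        raise FencingFailed("fencing failed, refusing to take over")
    for cmd in takeover_cmds:
        # e.g. mount the shared partition, then start the postmaster
        run(cmd, check=True)
```

The design choice is that a fencing failure stops everything: staying
down is recoverable, while a double mount generally is not.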
--
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)
