On Tue, Jul 27, 2010 at 10:53:45PM +0900, Fujii Masao wrote:
> On Tue, Jul 27, 2010 at 10:12 PM, Joshua Tolley <eggyknap@gmail.com> wrote:
> > My concern is that in a quorum system, if the quorum number is less than the
> > total number of replicas, there's no way to know *which* replicas composed the
> > quorum for any given transaction, so we can't know which servers to fail to if
> > the master dies.
>
> What about checking the current WAL receive location of each standby by
> using pg_last_xlog_receive_location()? The standby which has the newest
> location should be failed over to.
That makes sense. Thanks.
> > This isn't different from Oracle, where it looks like
> > essentially the "quorum" value is always 1. Your scenario shows that all
> > replicas are not created equal, and that sometimes we'll be interested in WAL
> > getting committed on a specific subset of the available servers. If I had two
> > nearby replicas called X and Y, and one at a remote site called Z, for
> > instance, I'd set quorum to 2, but really I'd want to say "wait for server X
> > and Y before committing, but don't worry about Z".
> >
> > I have no idea how to set up our GUCs to encode a situation like that :)
>
> Yeah, quorum commit alone cannot cover that situation. I think that
> current approach (i.e., quorum commit plus replication mode per standby)
> would cover that. In your example, you can choose "recv", "fsync" or
> "replay" as replication_mode in X and Y, and choose "async" in Z.
Clearly I need to read through the GUCs and docs better. I'll try to keep
quiet until that's finished :)
--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com