Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Support for N synchronous standby servers - take 2
Date
Msg-id CAHGQGwE_-HCzw687B4SdMWqAkkPcu-uxmF3MKyDB9mu38cJ7Jg@mail.gmail.com
In response to Re: Support for N synchronous standby servers - take 2  (Josh Berkus <josh@agliodbs.com>)
Responses Re: Support for N synchronous standby servers - take 2
List pgsql-hackers
On Tue, Jun 30, 2015 at 2:40 AM, Josh Berkus <josh@agliodbs.com> wrote:
> On 06/29/2015 01:01 AM, Michael Paquier wrote:
>> On Mon, Jun 29, 2015 at 4:20 AM, Josh Berkus <josh@agliodbs.com> wrote:
>
>>> Right.  Well, another reason we should be using a system catalog and not
>>> a single GUC ...

The problem with using a system catalog to configure synchronous replication
is that even a configuration change needs to wait for its WAL record (i.e., the
one generated by the catalog update) to be replicated. Imagine the case where
you have one synchronous standby but it goes down. To keep the system up, you'd
like to switch the replication mode to asynchronous by changing the
corresponding system catalog. But that change may need to wait until the
synchronous standby starts up again and its WAL record is successfully
replicated. This means that you may need to wait forever...
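
To make the circular dependency concrete, here is a toy model in Python. It is
only a sketch, not PostgreSQL code: the Primary class and its method names are
hypothetical, and a blocked commit is modeled as returning False rather than
actually waiting. It assumes that a catalog update is an ordinary WAL-logged
commit, while a configuration-file change plus reload writes no WAL record.

```python
# Toy model (not PostgreSQL code): under synchronous replication a commit
# returns only after the standby acknowledges its WAL record.

class Primary:
    def __init__(self):
        self.sync_mode = True     # stand-in for the replication setting
        self.standby_up = True

    def commit(self):
        """Return True if the commit completes, False if it would block."""
        if self.sync_mode and not self.standby_up:
            return False          # would wait for an ack that never arrives
        return True

    def set_async_via_catalog(self):
        # Assumption: the catalog update is itself a WAL-logged commit, so
        # it is subject to the same wait it is trying to switch off.
        if self.commit():
            self.sync_mode = False
            return True
        return False

    def set_async_via_config_reload(self):
        # Assumption: a config-file change plus reload generates no WAL,
        # so nothing has to be replicated before it takes effect.
        self.sync_mode = False
        return True


def demo():
    p = Primary()
    p.standby_up = False                      # the only sync standby goes down
    catalog_ok = p.set_async_via_catalog()    # blocked by the missing standby
    still_sync = p.sync_mode                  # setting was never changed
    reload_ok = p.set_async_via_config_reload()
    return catalog_ok, still_sync, reload_ok, p.commit()
```

In this model the catalog route never completes while the standby is down,
whereas the reload route takes effect immediately and later commits go through.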

One approach to address this problem is to introduce something like an unlogged
system catalog. I'm not sure whether that would cause another big problem,
though...

> You're confusing two separate things.  The primary manageability problem
> has nothing to do with altering the parameter.  The main problem is: if
> there is more than one synch candidate, how do we determine *after the
> master dies* which candidate replica was in synch at the time of
> failure?  Currently there is no way to do that.  This proposal plans to,
> effectively, add more synch candidate configurations without addressing
> that core design failure *at all*.  That's why I say that this patch
> decreases overall reliability of the system instead of increasing it.

I agree this is a problem even today, but it's basically independent of
the proposed feature *itself*. So I think that it's better to discuss and
work on the problem separately. That way, we might be able to provide
a good way to find the new master even if the proposed feature ultimately
fails to be adopted.

Regards,

-- 
Fujii Masao


