Re: Support for N synchronous standby servers - Mailing list pgsql-hackers

From Michael Paquier
Subject Re: Support for N synchronous standby servers
Date
Msg-id CAB7nPqS7znbSEoYP=KWd+gsJBQf06Pezt_H=XJbsiPCy24fWsw@mail.gmail.com
In response to Re: Support for N synchronous standby servers  (Fujii Masao <masao.fujii@gmail.com>)
List pgsql-hackers
On Fri, Aug 15, 2014 at 9:28 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> You added check_synchronous_standby_num() as the GUC check function for
> synchronous_standby_num, and checked that there. But that seems to be wrong.
> You can easily see the following error messages even if synchronous_standby_num
> is smaller than max_wal_senders. The point is that synchronous_standby_num
> should be located before max_wal_senders in postgresql.conf.
>
> LOG:  invalid value for parameter "synchronous_standby_num": 0
> DETAIL:  synchronous_standby_num cannot be higher than max_wal_senders.
I am not sure what I can do here, so I am removing this check from the code and simply adding a note in the docs that a value of _num higher than max_wal_senders does not have much meaning.
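
For illustration, the ordering problem can be reproduced with a toy model of a check hook that reads another setting while the file is still being processed. This is only a sketch under my own assumptions: `Setting` and `apply_settings` are hypothetical names, both parameters are assumed to start at 0, and none of this is taken from guc.c or from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the two parameters (assumed default: 0). */
static int max_wal_senders = 0;
static int synchronous_standby_num = 0;

/* One "name = value" line of a toy configuration file. */
typedef struct Setting
{
    const char *name;
    int         value;
} Setting;

/* Toy check hook: compares against the *current* max_wal_senders. */
static bool
check_synchronous_standby_num(int newval)
{
    return newval <= max_wal_senders;
}

/*
 * Apply the settings in file order and return how many were rejected
 * by the check hook.
 */
static int
apply_settings(const Setting *settings, size_t n)
{
    int     rejected = 0;
    size_t  i;

    assert(settings != NULL || n == 0);

    /* Start every run from the assumed defaults. */
    max_wal_senders = 0;
    synchronous_standby_num = 0;

    for (i = 0; i < n; i++)
    {
        if (strcmp(settings[i].name, "max_wal_senders") == 0)
            max_wal_senders = settings[i].value;
        else if (strcmp(settings[i].name, "synchronous_standby_num") == 0)
        {
            /* max_wal_senders may not have been read yet at this point. */
            if (check_synchronous_standby_num(settings[i].value))
                synchronous_standby_num = settings[i].value;
            else
                rejected++;
        }
    }
    return rejected;
}
```

In this model, listing synchronous_standby_num = 2 before max_wal_senders = 5 gets the value rejected, while the reverse order accepts it, which is why a cross-parameter check inside the hook is unreliable.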

> I still think that it's strange that replication can be async even when
> s_s_num is larger than zero. That is, I think that the transaction must
> wait for s_s_num sync standbys whether s_s_names is empty or not.
> OTOH, if s_s_num is zero, replication must be async whether s_s_names
> is empty or not. At least for me, it's intuitive to use s_s_num primarily
> to control the sync mode. Of course, other hackers may have different
> thoughts, so we need to keep our ear open for them.
Sure, the compromise looks to be what you propose, and I am fine with that.

> In the above design, one problem is that the number of parameters
> that those who want to set up only one sync replication need to change is
> incremented by one. That is, they need to change s_s_num additionally.
> If we are really concerned about this, we can treat a value of -1 in
> s_s_num as the special value, which allows us to control sync replication
> only by s_s_names as we do now. That is, if s_s_names is empty,
> replication would be async. Otherwise, only one standby with
> high-priority in s_s_names becomes sync one. Probably the default of
> s_s_num should be -1. Thought?

Taking those comments into account, the attached patch does the following, depending on the values of _num and _names:
- If _num = -1 and _names is empty, all the standbys are considered as async (same behavior as 9.1~, and default).
- If _num = -1 and _names has at least one item, wait for one standby, even if it is not connected at the time of commit. If one node is found to be sync, the other standbys listed in _names with a higher priority value than the sync one are in potential state (same as existing behavior).
- If _num = 0, all the standbys are async, whatever the values in _names. Priority is forced to 0 for all the standbys. SyncStandbysDefined is set to false in this case.
- If _num > 0, commits must wait for _num standbys, whatever the values in _names.
The default value of _num is -1. The documentation has been updated accordingly.
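
As a rough summary of the four cases above, here is a minimal sketch of the decision logic. It is not code from the patch: `SyncRepPlan` and `sync_rep_plan` are hypothetical names, and `names_count` stands for the number of items in _names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical summary of what a commit waits for; not a struct from the patch. */
typedef struct SyncRepPlan
{
    bool    sync_enabled;   /* false: all standbys are treated as async */
    int     num_to_wait;    /* number of standbys a commit waits for */
} SyncRepPlan;

/*
 * Map (_num, number of items in _names) to the behavior listed above.
 * num must be -1 or greater.
 */
static SyncRepPlan
sync_rep_plan(int num, int names_count)
{
    SyncRepPlan plan = {false, 0};

    assert(num >= -1 && names_count >= 0);

    if (num == -1)
    {
        /* Default: _names alone decides, with at most one sync standby. */
        if (names_count > 0)
        {
            plan.sync_enabled = true;
            plan.num_to_wait = 1;
        }
    }
    else if (num > 0)
    {
        /* Wait for _num standbys, whatever _names contains. */
        plan.sync_enabled = true;
        plan.num_to_wait = num;
    }
    /* num == 0: everything stays async, whatever _names contains. */

    return plan;
}
```

For example, `sync_rep_plan(-1, 2)` waits for one standby, `sync_rep_plan(0, 3)` waits for none, and `sync_rep_plan(3, 0)` waits for three.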

> The source code comments at the top of syncrep.c need to be updated.
> It's worth checking whether there are other comments to be updated.
Done. I have also updated some comments in places other than the header.
Regards,
--
Michael
Attachment
