Re: Quorum commit for multiple synchronous replication. - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Quorum commit for multiple synchronous replication.
Date
Msg-id CAD21AoA8ptbF75cnmQnaqQwyFj+JkxVN0vbtaKF+dujBrsL=Fg@mail.gmail.com
Whole thread Raw
In response to Re: Quorum commit for multiple synchronous replication.  (Fujii Masao <masao.fujii@gmail.com>)
Responses Re: Quorum commit for multiple synchronous replication.
Re: Quorum commit for multiple synchronous replication.
List pgsql-hackers
On Tue, Dec 6, 2016 at 1:11 PM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Mon, Nov 28, 2016 at 8:03 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>> On Sat, Nov 26, 2016 at 10:27 PM, Michael Paquier
>> <michael.paquier@gmail.com> wrote:
>>> On Tue, Nov 15, 2016 at 7:08 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>>>> Attached latest version patch incorporated review comments. After more
>>>> thought, I agree and changed the value of standby priority in quorum
>>>> method so that it's not set 1 forcibly. The all standby priorities are
>>>> 1 If s_s_names = 'ANY(*)'.
>>>> Please review this patch.
>>>
>>> Sorry for my late reply. Here is my final lookup.
>>
>> Thank you for reviewing!
>>
>>>  <synopsis>
>>> -<replaceable class="parameter">num_sync</replaceable> ( <replaceable
>>> class="parameter">standby_name</replaceable> [, ...] )
>>> +[ANY] <replaceable class="parameter">num_sync</replaceable> (
>>> <replaceable class="parameter">standby_name</replaceable> [, ...] )
>>> +FIRST <replaceable class="parameter">num_sync</replaceable> (
>>> <replaceable class="parameter">standby_name</replaceable> [, ...] )
>>>  <replaceable class="parameter">standby_name</replaceable> [, ...
>>> This can just be replaced with [ ANY | FIRST ]. There is no need for
>>> braces as the keyword is not mandatory.
>>>
>>> +        is the name of a standby server.
>>> +        <literal>FIRST</> and <literal>ANY</> specify the method used by
>>> +        the master to control the standby servres.
>>>          </para>
>>> s/servres/servers/.
>>>
>>>             if (priority == 0)
>>>                 values[7] = CStringGetTextDatum("async");
>>>             else if (list_member_int(sync_standbys, i))
>>> -               values[7] = CStringGetTextDatum("sync");
>>> +               values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY ?
>>> +                   CStringGetTextDatum("sync") : CStringGetTextDatum("quorum");
>>>             else
>>>                 values[7] = CStringGetTextDatum("potential");
>>> This can be simplified a bit as "quorum" is the state value for all
>>> standbys with a non-zero priority when the method is set to
>>> SYNC_REP_QUORUM:
>>>             if (priority == 0)
>>>                 values[7] = CStringGetTextDatum("async");
>>> +           else if (SyncRepConfig->sync_method == SYNC_REP_QUORUM)
>>> +               values[7] = CStringGetTextDatum("quorum");
>>>             else if (list_member_int(sync_standbys, i))
>>>                 values[7] = CStringGetTextDatum("sync");
>>>             else
>>>
>>> SyncRepConfig data is made external to syncrep.c with this patch as
>>> walsender.c needs to look at the sync method in place, no complain
>>> about that after considering if there could be a more elegant way to
>>> do things without this change.
>>
>> Agreed.
>>
>>> While reviewing the patch, I have found a couple of incorrectly shaped
>>> sentences, both in the docs and some comments. Attached is a new
>>> version with this word-smithing. The patch is now switched as ready
>>> for committer.
>>
>> Thanks. I found a typo in v7 patch, so attached latest v8 patch.
>
> +        qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn);
> +        qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn);
> +        qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn);
>
> In quorum commit, we need to calculate the N-th largest LSN from
> M quorum synchronous standbys' LSN. N would be usually 1 - 3 and
> M would be 1 - 10, I guess. You used the algorithm using qsort for
> that calculation. But I'm not sure if that's enough effective algorithm
> or not.
>
> If M (i.e., number of quorum sync standbys) is enough large,
> your choice would be good. But usually M seems not so large.
>

Thank you for the comment.

One another possible idea is to use the partial selection sort[1],
which takes O(MN) time. Since this is more efficient if N is small
this would be better than qsort for this case. But I'm not sure that
we can see such a difference by result of performance measurement.

[1] https://en.wikipedia.org/wiki/Selection_algorithm#Partial_selection_sort

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: pg_dump, pg_dumpall and data durability
Next
From: Michael Paquier
Date:
Subject: Re: Quorum commit for multiple synchronous replication.