Re: Quorum commit for multiple synchronous replication. - Mailing list pgsql-hackers
From | Fujii Masao |
---|---|
Subject | Re: Quorum commit for multiple synchronous replication. |
Date | |
Msg-id | CAHGQGwG5OCPKtQQ9QB6SVBNAfnPfPyGJfY2dzYSjfjNyoSgtLA@mail.gmail.com Whole thread Raw |
In response to | Re: Quorum commit for multiple synchronous replication. (Masahiko Sawada <sawada.mshk@gmail.com>) |
Responses |
Re: Quorum commit for multiple synchronous replication.
|
List | pgsql-hackers |
On Tue, Dec 6, 2016 at 6:57 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: > On Tue, Dec 6, 2016 at 1:11 PM, Fujii Masao <masao.fujii@gmail.com> wrote: >> On Mon, Nov 28, 2016 at 8:03 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >>> On Sat, Nov 26, 2016 at 10:27 PM, Michael Paquier >>> <michael.paquier@gmail.com> wrote: >>>> On Tue, Nov 15, 2016 at 7:08 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote: >>>>> Attached latest version patch incorporated review comments. After more >>>>> thought, I agree and changed the value of standby priority in quorum >>>>> method so that it's not set 1 forcibly. The all standby priorities are >>>>> 1 If s_s_names = 'ANY(*)'. >>>>> Please review this patch. >>>> >>>> Sorry for my late reply. Here is my final lookup. >>> >>> Thank you for reviewing! >>> >>>> <synopsis> >>>> -<replaceable class="parameter">num_sync</replaceable> ( <replaceable >>>> class="parameter">standby_name</replaceable> [, ...] ) >>>> +[ANY] <replaceable class="parameter">num_sync</replaceable> ( >>>> <replaceable class="parameter">standby_name</replaceable> [, ...] ) >>>> +FIRST <replaceable class="parameter">num_sync</replaceable> ( >>>> <replaceable class="parameter">standby_name</replaceable> [, ...] ) >>>> <replaceable class="parameter">standby_name</replaceable> [, ... >>>> This can just be replaced with [ ANY | FIRST ]. There is no need for >>>> braces as the keyword is not mandatory. >>>> >>>> + is the name of a standby server. >>>> + <literal>FIRST</> and <literal>ANY</> specify the method used by >>>> + the master to control the standby servres. >>>> </para> >>>> s/servres/servers/. >>>> >>>> if (priority == 0) >>>> values[7] = CStringGetTextDatum("async"); >>>> else if (list_member_int(sync_standbys, i)) >>>> - values[7] = CStringGetTextDatum("sync"); >>>> + values[7] = SyncRepConfig->sync_method == SYNC_REP_PRIORITY ? >>>> + CStringGetTextDatum("sync") : CStringGetTextDatum("quorum"); >>>> else >>>> values[7] = CStringGetTextDatum("potential"); >>>> This can be simplified a bit as "quorum" is the state value for all >>>> standbys with a non-zero priority when the method is set to >>>> SYNC_REP_QUORUM: >>>> if (priority == 0) >>>> values[7] = CStringGetTextDatum("async"); >>>> + else if (SyncRepConfig->sync_method == SYNC_REP_QUORUM) >>>> + values[7] = CStringGetTextDatum("quorum"); >>>> else if (list_member_int(sync_standbys, i)) >>>> values[7] = CStringGetTextDatum("sync"); >>>> else >>>> >>>> SyncRepConfig data is made external to syncrep.c with this patch as >>>> walsender.c needs to look at the sync method in place, no complain >>>> about that after considering if there could be a more elegant way to >>>> do things without this change. >>> >>> Agreed. >>> >>>> While reviewing the patch, I have found a couple of incorrectly shaped >>>> sentences, both in the docs and some comments. Attached is a new >>>> version with this word-smithing. The patch is now switched as ready >>>> for committer. >>> >>> Thanks. I found a typo in v7 patch, so attached latest v8 patch. >> >> + qsort(write_array, len, sizeof(XLogRecPtr), cmp_lsn); >> + qsort(flush_array, len, sizeof(XLogRecPtr), cmp_lsn); >> + qsort(apply_array, len, sizeof(XLogRecPtr), cmp_lsn); >> >> In quorum commit, we need to calculate the N-th largest LSN from >> M quorum synchronous standbys' LSN. N would be usually 1 - 3 and >> M would be 1 - 10, I guess. You used the algorithm using qsort for >> that calculation. But I'm not sure if that's enough effective algorithm >> or not. >> >> If M (i.e., number of quorum sync standbys) is enough large, >> your choice would be good. But usually M seems not so large. >> > > Thank you for the comment. > > One another possible idea is to use the partial selection sort[1], > which takes O(MN) time. Since this is more efficient if N is small > this would be better than qsort for this case. But I'm not sure that > we can see such a difference by result of performance measurement. So, isn't it better to compare the performance of some algorithms and confirm which is the best for quorum commit? Since this code is hot, i.e., can be very frequently executed, I'd like to avoid waste of cycle as much as possible. Regards, -- Fujii Masao
pgsql-hackers by date: