Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Support for N synchronous standby servers - take 2 |
Date | |
Msg-id | CAD21AoCfSX_MbOze8C--zoDJatUfaJcF5PzZ-ELqpigNBQ_ERw@mail.gmail.com Whole thread Raw |
In response to | Re: Support for N synchronous standby servers - take 2 (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
List | pgsql-hackers |
Reply to multiple hackers. Thank you for reviewing this patch. > + used. Priority is given to servers in the order that the appear > in the list. > > s/the appear/they appear/ > > - The minimum wait time is the roundtrip time between primary to standby. > + The minimum wait time is the roundtrip time between the primary and the > + almost synchronous standby. > > s/almost/slowest/ Will fix this typo. Thanks! On Fri, Mar 4, 2016 at 5:22 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > Hello, > > Sorry for long, hard-to-read writings in advance.. > > At Thu, 3 Mar 2016 23:30:49 +0900, Masahiko Sawada <sawada.mshk@gmail.com> wrote in <CAD21AoD3XGZtuvgc5uKJdvcoJP5S0rvGQQCJLRL4rLsruRch5Q@mail.gmail.com> >> Hi, >> >> Thank you so much for reviewing this patch! >> >> All review comments regarding document and comment are fixed. >> Attached latest v14 patch. >> >> > This accepts 'abc^Id' as a name, which is wrong behavior (but >> > such appliction names are not allowed anyway. If you assume so, >> > I'd like to see a comment for that.). >> >> 'abc^Id' is accepted as application_name, no? >> postgres(1)=# set application_name to 'abc^Id'; >> SET >> postgres(1)=# show application_name ; >> application_name >> ------------------ >> abc^Id >> (1 row) > > Sorry, I implicitly used "^" in the meaning of "ctrl key". So > "^I" is so-called Ctrl-I, that is horizontal tab or 0x09. So the > following in psql shows that. > > =# set application_name to E'abc\td'; > =# show application_name ; > application_name > ------------------ > ab?d > (1 row) > > The <tab> is replaced with '?' (literally) at the time of > guc assinment. Oh, I see. I will comment for that. >> > addlit_xd_string(char *ytext) and addlitchar_xd_string(unsigned >> > char ychar) requires differnt character types. Is there any reason >> > for that? >> >> Because addlit_xd_string() is for adding string(char *) to xd_string, >> OTOH addlit_xd_char() is for adding just one character to xd_string. > > Umm. My qustion might have been a bit out of the point. > > The addlitchar_xd_string(str,unsigned char c) does > appendStringInfoChar(, c). On the other hand, the signature of > the function of stringinfo is the following. > > AppendStringInfoChar(StringInfo str, char ch); > > Of course "char" is equivalent of "signed char" as > default. addlitchar_xd_string assigns the given character in > "unsigned char" to the parameter of AppendStringInfoChar of > "signed char". > > These two are incompatible types. Imagine the > following codelet, > > #include <stdio.h> > > void hoge(signed char c){ > int ch = c; > fprintf(stderr, "char = %d\n", ch); > } > > int main(void) > { > unsigned char u; > > u = 200; > hoge(u); > return 0; > } > > The result is -56. So we generally should get rid of such type of > mixture of signedness for no particular reason. > > In this case, the domain of the variable is 0x20-0x7e so no > problem won't be actualized but also there's no reason for the > signedness mixture. Thank you for explanation. I will fix this. >> > I personally don't like addlit*string() things for such simple >> > syntax but itself is acceptble enough for me. However it uses >> > StringInfo to hold double-quoted names, which pallocs 1024 bytes >> > of memory chunk for every double-quoted name. The chunks are >> > finally stacked up left uncollected until the current >> > memorycontext is deleted or reset (It is deleted just after >> > finishing config file processing). Addition to that, setting >> > s_s_names runs the parser twice. It seems to me too greedy and >> > seems that static char [NAMEDATALEN] is enough using the v12 way >> > without palloc/repalloc. >> >> I though that length of group name could be more than NAMEDATALEN, so >> I use StringInfo. >> Is it not necessary? > > Such long names doesn't seem to necessary. Too long identifiers > no longer act as identifier for human eyeballs. We are limiting > the length of identifiers of the whole database system to > NAMEDATALEN-1, which seems to have been enough so I don't see any > reason to have a group name longer than that. > I see. I will fix this. >> > I found that the name SyncGroupName.wait_num is not >> > instinctive. How about sync_num, sync_member_num or >> > sync_standby_num? If the last is preferable, .members also should >> > be .standbys . >> >> Thanks, sync_num is preferable to me. >> >> === >> > I am quite uncomfortable with the existence of >> > WanSnd.sync_standby_priority. It represented the pirority in the >> > old linear s_s_names format but nested groups or even >> > single-level quarum list obviously doesn't fit it. Can we get rid >> > of sync_standby_priority, even though we realize atmost >> > n-priority for now? >> >> We could get rid of sync_standby_priority. >> But if so, we will not be able to see the next sync standby in >> pg_stat_replication system view. >> Regarding each node priority, I was thinking that standbys in quorum >> list have same priority, and in nested group each standbys are given >> the priority starting from 1. > > As far as I can see the varialbe is referred to as a boolean to > indicate whether a walsernder is connected to a candidate > synchronous standby. So the value is totally useless, at least > for now. However, SyncRepRelaseWaiters uses the value to check if > the synced LSNs can be advaned by a walsender so the variable is > useful as a boolean. > > In the previous versions, the reason why WanSnd had the priority > value is that a pair of synchronized LSNs is determined only by > one wansender, which has the highest priority among active > wansenders. So even if a walsender receives a response from > walreceiver, it doesn't need to do nothing if it is not at the > highest priority. It's a simple world. > > In the quorum commit word, in contrast, what > SyncRepGetSyncStandbysFn shoud do is returning certain private > information to be used to calculate a pair of safe/synched LSNs > in SyncRepGetSYncedLsnsFn looking into WalSndCtl->wansnds > list. The latter passes a pair of safe/synced LSNs to the upper > level list or SyncRepSyncedLsnAdvancedTo as the topmost > caller. There's no room for sync_standby_priority to work as the > original objective. > > Even if we assign the value in the explained way, the values are > always 1 for quorum method and duplicate values for multiple > priority method. What do you want to show by the value to users? I agree with you. When we implement nested style of multiple sync replication, it would tough to show to users using by sync_standby_priority. But in current our first goal (implementing 1-nest style), it doesn't seem to need to get rid of sync_standby_priority from WalSnd so far, no? Towards multiple nested style, I'm roughly planning to have new system view is defined like follows. - New system view shows all groups and nodes informations. - Move sync_state from pg_stat_replication to new system view. - Get rid of sync_priority from pg_stat_replication. - Add new sync_state 'quorum' that indicates candidate sync standbys of its group using quorum method. - If parent group state is potential, 'potential:' prefix is added to the child standby's sync_state. * s_s_names = '2[a, 1(b,c):group1, 1[d,e]:gorup2]' name | sync_method | member | sync_num | sync_state | parant_group -----------+--------------------+---------------------------+---------------+--------------------------+--------------main | priority | {a,group1,group2} | 2 | |a | | | | sync | maingroup1 | quorum | {b,c} | 1 | sync | mainb | | | | sync | group1c | | | | potential | group1group2| priority | {d,e} | 1| potential | maind | | | | potential:sync | group2e | | | | potential:potential | group2 (8 rows) * s_s_names = '2(a, 1[b,c]:group1, 1(d,e):group2)' name | sync_method | member | sync_num | sync_state | parant_group -----------+--------------------+--------------------------+----------------+--------------------------+--------------main | quorum | {a,group1,group2} | 2 | |a | | | | quorum | maingroup1 | priority | {b,c} | 1 | quorum | mainb | | | | sync |group1c | | | | potential | group1group2 | quorum | {d,e} | 1 | quorum | maind | | | | quorum | group2e | | | | quorum | group2 (8 rows) >> > SyncRepStandbys are to be in multilevel and the struct is >> > naturally allowed to be so but SyncRepClearStandbyGroupList >> > assumes it in single level. >> >> Because I think that we don't need to implement to fully support >> nested style at first version. >> We have to carefully design this feature while considering >> expandability, but overkill implementation could be cause of crash. >> Consider remaining time for 9.6, I feel we could implement quorum >> method at best. > > Yes, so I proposed to ass Aseert() in the function. Will add it. Regards, -- Masahiko Sawada
pgsql-hackers by date: