Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers
From | Masahiko Sawada |
---|---|
Subject | Re: Support for N synchronous standby servers - take 2 |
Date | |
Msg-id | CAD21AoDoq1ubY4KkKhrA9jzaVXekwAT7gV5pQJbS+wj98b9-3A@mail.gmail.com Whole thread Raw |
In response to | Re: Support for N synchronous standby servers - take 2 (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>) |
Responses |
Re: Support for N synchronous standby servers - take 2
Re: Support for N synchronous standby servers - take 2 |
List | pgsql-hackers |
On Mon, Apr 4, 2016 at 6:03 PM, Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote: > Hello, thank you for testing. > > At Sat, 2 Apr 2016 14:20:55 +1300, Thomas Munro <thomas.munro@enterprisedb.com> wrote in <CAEepm=2sdDL2hs3XbWb5FORegNHBObLJ-8C2=aaeG-riZTd2Rw@mail.gmail.com> >> >>> Attached latest patch incorporate some review comments so far, and is >> >>> rebased against current HEAD. >> >>> >> >> >> >> Sorry I attached wrong patch. >> >> Attached patch is correct patch. >> >> >> >> [mulit_sync_replication_v21.patch] >> > >> > Here are some TPS numbers from some quick tests I ran on a set of >> > Amazon EC2 m3.large instances ("2 vCPU" virtual machines) configured >> > as primary + 3 standbys, to try out different combinations of >> > synchronous_commit levels and synchronous_standby_names numbers. They >> > were run for a short time only and these are of course systems with >> > limited and perhaps uneven IO and CPU, but they still give some idea >> > of the trends. And reassuringly, the trends are travelling in the >> > expected directions. >> > >> > All default settings except shared_buffers = 1GB, and the GUCs >> > required for replication. >> > >> > pgbench postgres -j2 -c2 -N bench2 -T 600 >> > >> > 1(*) 2(*) 3(*) >> > ==== ==== ==== >> > off = 4056 4096 4092 >> > local = 1323 1299 1312 >> > remote_write = 1130 1046 958 >> > on = 860 744 701 >> > remote_apply = 785 725 604 >> > >> > pgbench postgres -j16 -c16 -N bench2 -T 600 >> > >> > 1(*) 2(*) 3(*) >> > ==== ==== ==== >> > off = 3952 3943 3933 >> > local = 2964 2984 3026 >> > remote_write = 2790 2724 2675 >> > on = 2731 2627 2523 >> > remote_apply = 2627 2501 2432 >> > >> > One thing I noticed is that there are LOG messages telling me when a >> > standby becomes a synchronous standby, but nothing to tell me if a >> > standby stops being a standby (ie because a higher priority one has >> > taken its place in the quorum). Would that be interesting? > > A walsender exits by proc_exit() for any operational > termination so wrapping proc_exit() should work. (Attached file 1) > > For the setting "2(Sby1, Sby2, Sby3)", the master says that all > of the standbys are sync-standbys and no message is emited on > failure of Sby1, which should cause a promotion of Sby3. > >> standby "Sby3" is now the synchronous standby with priority 3 >> standby "Sby2" is now the synchronous standby with priority 2 >> standby "Sby1" is now the synchronous standby with priority 1 > ..<Sby 1 failure> >> standby "Sby3" is now the synchronous standby with priority 3 > > Sby3 becomes sync standby twice:p > > This was a behavior taken over from the single-sync-rep era but > it should be confusing for the new sync-rep selection mechanism. > The second attached diff makes this as the following. > > >> 17:48:21.969 LOG: standby "Sby3" is now a synchronous standby with priority 3 >> 17:48:23.087 LOG: standby "Sby2" is now a synchronous standby with priority 2 >> 17:48:25.617 LOG: standby "Sby1" is now a synchronous standby with priority 1 >> 17:48:31.990 LOG: standby "Sby3" is now a potential synchronous standby with priority 3 >> 17:48:43.905 LOG: standby "Sby3" is now a synchronous standby with priority 3 >> 17:49:10.262 LOG: standby "Sby1" is now a synchronous standby with priority 1 >> 17:49:13.865 LOG: standby "Sby3" is now a potential synchronous standby with priority 3 > > Since this status check is taken place for every reply from > stanbys, the message of downgrading to "potential" may be > diferred or even fail to occur but it should be no problem. > > Applying the both of the above patches, the message would be like > the following. > >> 17:54:08.367 LOG: standby "Sby3" is now a synchronous standby with priority 3 >> 17:54:08.564 LOG: standby "Sby1" is now a synchronous standby with priority 1 >> 17:54:08.565 LOG: standby "Sby2" is now a synchronous standby with priority 2 >> 17:54:18.387 LOG: standby "Sby3" is now a potential synchronous standby with priority 3 >> 17:54:28.887 LOG: synchronous standby "Sby1" with priority 1 exited >> 17:54:31.359 LOG: standby "Sby3" is now a synchronous standby with priority 3 >> 17:54:39.008 LOG: standby "Sby1" is now a synchronous standby with priority 1 >> 17:54:41.382 LOG: standby "Sby3" is now a potential synchronous standby with priority 3 > > Does this make sense? > > By the way, Sawada-san, you have changed the parentheses for the > priority method from '[]' to '()'. And I mistankenly defined > s_s_names as '2[Sby1, Sby2, Sby3]' and got wrong behavior, that > is, only Sby2 is registed as mandatory synchronous standby. > > For this case, the tree members of SyncRepConfig are '2[Sby1,', > 'Sby2', "Sby3]'. This syntax is valid for the current > specification but will surely get different meaning by the future > changes. We should refuse this known-to-be-wrong-in-future syntax > from now. > I have no objection about current version patch. But one optimise idea I came up with is to return false before calculation of lowest LSN from sync standby if MyWalSnd is not listed in sync_standby. For example in SyncRepGetOldestSyncRecPtr(), == sync_standby = SyncRepGetSyncStandbys(); if (list_length(sync_standbys) <SyncRepConfig->num_sync() { (snip) } /* Here if MyWalSnd is not listed in sync_standby, quick exit. */ if (list_member_int(sync_standbys, MyWalSnd->slotno)) return false; foreach(cell, sync_standbys) { (snip) } == > For this case, the tree members of SyncRepConfig are '2[Sby1,', > 'Sby2', "Sby3]'. This syntax is valid for the current > specification but will surely get different meaning by the future > changes. We should refuse this known-to-be-wrong-in-future syntax > from now. I couldn't get your point but why will the above syntax meaning be different from current meaning by future change? I thought that another method uses another kind of parentheses. Regards, -- Masahiko Sawada
pgsql-hackers by date: