Re: Support for N synchronous standby servers - take 2 - Mailing list pgsql-hackers

From Fujii Masao
Subject Re: Support for N synchronous standby servers - take 2
Date
Msg-id CAHGQGwG+DM=LCctG6q_Uxkgk17CbLKrHBwtPfUN3+Hu9QbvNuQ@mail.gmail.com
In response to Re: Support for N synchronous standby servers - take 2  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Support for N synchronous standby servers - take 2  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
List pgsql-hackers
On Mon, Apr 4, 2016 at 10:00 PM, Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> On Mon, Apr 4, 2016 at 6:03 PM, Kyotaro HORIGUCHI
> <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
>> Hello, thank you for testing.
>>
>> At Sat, 2 Apr 2016 14:20:55 +1300, Thomas Munro <thomas.munro@enterprisedb.com> wrote in
>> <CAEepm=2sdDL2hs3XbWb5FORegNHBObLJ-8C2=aaeG-riZTd2Rw@mail.gmail.com>
>>> >>> Attached latest patch incorporate some review comments so far, and is
>>> >>> rebased against current HEAD.
>>> >>>
>>> >>
>>> >> Sorry I attached wrong patch.
>>> >> Attached patch is correct patch.
>>> >>
>>> >> [mulit_sync_replication_v21.patch]
>>> >
>>> > Here are some TPS numbers from some quick tests I ran on a set of
>>> > Amazon EC2 m3.large instances ("2 vCPU" virtual machines) configured
>>> > as primary + 3 standbys, to try out different combinations of
>>> > synchronous_commit levels and synchronous_standby_names numbers.  They
>>> > were run for a short time only and these are of course systems with
>>> > limited and perhaps uneven IO and CPU, but they still give some idea
>>> > of the trends.  And reassuringly, the trends are travelling in the
>>> > expected directions.
>>> >
>>> > All default settings except shared_buffers = 1GB, and the GUCs
>>> > required for replication.
>>> >
>>> > pgbench postgres -j2 -c2 -N bench2 -T 600
>>> >
>>> >                1(*) 2(*) 3(*)
>>> >                ==== ==== ====
>>> > off          = 4056 4096 4092
>>> > local        = 1323 1299 1312
>>> > remote_write = 1130 1046  958
>>> > on           =  860  744  701
>>> > remote_apply =  785  725  604
>>> >
>>> > pgbench postgres -j16 -c16 -N bench2 -T 600
>>> >
>>> >                1(*) 2(*) 3(*)
>>> >                ==== ==== ====
>>> > off          = 3952 3943 3933
>>> > local        = 2964 2984 3026
>>> > remote_write = 2790 2724 2675
>>> > on           = 2731 2627 2523
>>> > remote_apply = 2627 2501 2432
>>> >
>>> > One thing I noticed is that there are LOG messages telling me when a
>>> > standby becomes a synchronous standby, but nothing to tell me when a
>>> > standby stops being a synchronous standby (i.e. because a higher
>>> > priority one has taken its place in the quorum).  Would that be
>>> > interesting?
>>
>> A walsender exits via proc_exit() on any operational termination, so
>> wrapping proc_exit() should work. (Attached file 1)
>>
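A minimal sketch of the kind of proc_exit() wrapping I imagine here
(the callback name and message text are only illustrative, not taken
from your attached file; locking of the WalSnd fields is elided):

==
/* Hypothetical callback: log when an exiting walsender was sync. */
static void
WalSndSyncRepExitNotify(int code, Datum arg)
{
    if (MyWalSnd != NULL && MyWalSnd->sync_standby_priority > 0)
        ereport(LOG,
                (errmsg("synchronous standby \"%s\" with priority %d exited",
                        application_name,
                        MyWalSnd->sync_standby_priority)));
}

/* ... registered once at walsender startup ... */
on_proc_exit(WalSndSyncRepExitNotify, 0);
==
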
>> For the setting "2(Sby1, Sby2, Sby3)", the master says that all of
>> the standbys are sync standbys, and no message is emitted on the
>> failure of Sby1, which should cause a promotion of Sby3.
>>
>>>  standby "Sby3" is now the synchronous standby with priority 3
>>>  standby "Sby2" is now the synchronous standby with priority 2
>>>  standby "Sby1" is now the synchronous standby with priority 1
>> ..<Sby 1 failure>
>>>  standby "Sby3" is now the synchronous standby with priority 3
>>
>> Sby3 becomes sync standby twice:p
>>
>> This behavior was taken over from the single-sync-rep era, but it can
>> be confusing with the new sync-rep selection mechanism. The second
>> attached diff changes this to the following.
>>
>>
>>> 17:48:21.969 LOG:  standby "Sby3" is now a synchronous standby with priority 3
>>> 17:48:23.087 LOG:  standby "Sby2" is now a synchronous standby with priority 2
>>> 17:48:25.617 LOG:  standby "Sby1" is now a synchronous standby with priority 1
>>> 17:48:31.990 LOG:  standby "Sby3" is now a potential synchronous standby with priority 3
>>> 17:48:43.905 LOG:  standby "Sby3" is now a synchronous standby with priority 3
>>> 17:49:10.262 LOG:  standby "Sby1" is now a synchronous standby with priority 1
>>> 17:49:13.865 LOG:  standby "Sby3" is now a potential synchronous standby with priority 3
>>
>> Since this status check takes place for every reply from the
>> standbys, the message about downgrading to "potential" may be delayed
>> or even fail to appear, but that should be no problem.
>>
>> With both of the above patches applied, the messages look like the
>> following.
>>
>>> 17:54:08.367 LOG:  standby "Sby3" is now a synchronous standby with priority 3
>>> 17:54:08.564 LOG:  standby "Sby1" is now a synchronous standby with priority 1
>>> 17:54:08.565 LOG:  standby "Sby2" is now a synchronous standby with priority 2
>>> 17:54:18.387 LOG:  standby "Sby3" is now a potential synchronous standby with priority 3
>>> 17:54:28.887 LOG:  synchronous standby "Sby1" with priority 1 exited
>>> 17:54:31.359 LOG:  standby "Sby3" is now a synchronous standby with priority 3
>>> 17:54:39.008 LOG:  standby "Sby1" is now a synchronous standby with priority 1
>>> 17:54:41.382 LOG:  standby "Sby3" is now a potential synchronous standby with priority 3
>>
>> Does this make sense?
>>
>> By the way, Sawada-san, you have changed the parentheses for the
>> priority method from '[]' to '()', and I mistakenly defined
>> s_s_names as '2[Sby1, Sby2, Sby3]' and got the wrong behavior, that
>> is, only Sby2 is registered as a mandatory synchronous standby.
>>
>> In this case, the three members of SyncRepConfig are '2[Sby1,',
>> 'Sby2' and 'Sby3]'. This syntax is valid under the current
>> specification but will surely take on a different meaning with
>> future changes. We should reject this known-to-be-wrong-in-the-future
>> syntax now.
>>
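For illustration, the rejection could be as simple as refusing any
standby name that still contains '[' or ']' (the helper below is
hypothetical; the real check belongs in the s_s_names parser or its
GUC check function and should report the error through that
machinery):

==
/* Hypothetical helper: reject leftovers of the old bracket syntax. */
static void
check_standby_name(const char *name)
{
    if (strchr(name, '[') != NULL || strchr(name, ']') != NULL)
        ereport(ERROR,
                (errcode(ERRCODE_SYNTAX_ERROR),
                 errmsg("invalid standby name \"%s\" in synchronous_standby_names",
                        name)));
}
==
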
>
> I have no objection to the current version of the patch.
> But one optimization idea I came up with is to return false, before
> calculating the lowest LSN from the sync standbys, if MyWalSnd is not
> listed in sync_standbys.
> For example, in SyncRepGetOldestSyncRecPtr():
>
> ==
> sync_standbys = SyncRepGetSyncStandbys();
>
> if (list_length(sync_standbys) < SyncRepConfig->num_sync)
> {
>   (snip)
> }
>
> /* Here, if MyWalSnd is not listed in sync_standbys, quick exit. */
> if (!list_member_int(sync_standbys, MyWalSnd->slotno))
>     return false;

list_member_int() performs a loop internally, so I'm not sure how much
adding an extra list_member_int() call here would optimize this
processing. Another idea is to make SyncRepGetSyncStandbys() check
whether I'm a sync standby or not. That way we can exit earlier when
I'm not a sync standby, without adding an extra loop. Does this make
sense?
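
To illustrate that idea, a rough sketch (the bool output parameter and
the variable names are only illustrative, not a final signature):

==
List       *sync_standbys;
bool        am_sync;

/*
 * Have SyncRepGetSyncStandbys() report whether this walsender is one
 * of the sync standbys while it builds the list, so the caller gets
 * that answer without a second scan of the list.
 */
sync_standbys = SyncRepGetSyncStandbys(&am_sync);

/* Quick exit if I'm not a sync standby. */
if (!am_sync)
{
    list_free(sync_standbys);
    return false;
}

(snip)  /* compute the oldest write/flush/apply LSNs as before */
==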

Regards,

-- 
Fujii Masao


