Re: Behaviour of take over the synchronous replication - Mailing list pgsql-hackers

From Sawada Masahiko
Subject Re: Behaviour of take over the synchronous replication
Date
Msg-id CAD21AoBr+NBo2g++wd5x8Q1HLOMSHXOQB2c4ZttzZmm5T2Xj1A@mail.gmail.com
In response to Re: Behaviour of take over the synchronous replication  (Amit Kapila <amit.kapila16@gmail.com>)
List pgsql-hackers
On Wed, Aug 28, 2013 at 10:59 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Tue, Aug 27, 2013 at 4:51 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>> On Sun, Aug 25, 2013 at 3:21 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
>>> On Sat, Aug 24, 2013 at 2:46 PM, Sawada Masahiko <sawada.mshk@gmail.com> wrote:
>>>> On Sat, Aug 24, 2013 at 3:14 AM, Josh Berkus <josh@agliodbs.com> wrote:
>>>>> On 08/23/2013 12:42 AM, Sawada Masahiko wrote:
>>>>>> In case (a), the priorities are clear, so I think that re-taking over
>>>>>> is the correct behaviour.
>>>>>> OTOH, in case (b), even though AAA and BBB have the same priority, the AAA
>>>>>> server steals SYNC replication.
>>>>>> I think it is better that the BBB server continues as the SYNC standby,
>>>>>> and AAA should become a potential server.
>>>>>
>>>>> So, you're saying that:
>>>>>
>>>>> 1) synchronous_standby_names = '*'
>>>>>
>>>>> 2) replica 'BBB' is the current sync standby
>>>>>
>>>>> 3) replica 'AAA' comes online
>>>>>
>>>>> 4) replica 'AAA' grabs sync status
>>>>>
>>>>> ?
>>>> I'm sorry for the confusion.
>>>> What I mean is:
>>>>
>>>> 1) synchronous_standby_names = '*'
>>>>
>>>> 2) replica 'AAA' is the current sync standby
>>>>
>>>> 3) replica 'BBB' is the current async standby (potential sync standby)
>>>>
>>>> 4) replica 'AAA' fails; after that, replica 'BBB' is the current sync standby.
>>>>
>>>> 5) replica 'AAA' comes online
>>>>
>>>> 6) replica 'AAA' grabs sync status
>>>>
>>>>>
>>>>
>>>>
>>>>> If that's the case, I'm not really sure that's undesirable behavior.
>>>>> One could argue fairly persuasively that if you care about the
>>>>> precedence order of sync replicas, you shouldn't use '*'. And the rule
>>>>> of "if using *, the lowest-sorted replica name has sync" is actually a
>>>>> predictable, easy-to-understand rule.
>>>>>
>>>>> So if you want to make this a feature request, you'll need to come up
>>>>> with an argument as to why the current behavior is bad. Otherwise,
>>>>> you're just asking us to document it better (which is a good idea).
>>>> It does not depend on the name of the standby server. That is, the standby
>>>> server that connected to the master first during the initial replication
>>>> setup has top priority, even if the two servers have the same priority.
>>>
>>> What is happening here is that in the case of '*', as the priorities of both
>>> are the same, the system will choose whichever comes first in the list of
>>> registered standbys (the list is maintained in the structure WalSndCtl).
>>> Each standby is registered with WalSndCtl when a new WALSender is
>>> started, in function InitWalSenderSlot().
>>> As 'AAA' was registered first, it becomes the preferred sync standby
>>> even if the priorities of both are the same.
>>> When 'AAA' goes down, it marks its Slot entry as free (by setting
>>> pid=0 in function WalSndKill);
>>> when 'AAA' comes back again, it gets that free Slot entry and
>>> again becomes the preferred sync standby.
>>>
>>> Now if we want to fix this as you are suggesting, which I don't think is
>>> necessary, we might need to change WalSndKill and some other places so
>>> that whenever any standby goes down, the slots of the already
>>> registered standbys are changed.
>>>> The user must remember which standby server connected to the master
>>>> server first.
>>>> I think this behavior confuses users,
>>>> so I think we need to either modify this behaviour or make the priorities
>>>> of the servers differ when '*' is used (updating the manual would also be good).
>>>
>>> Here user has done the settings (setting synchronous_standby_names =
>>> '*'), after which he will not have any control which standby will
>>> become sync standby, so ideally he should not complain.
>>>
>>> It might be the case that for some users the current behavior is good enough,
>>> which means that with '*', whichever standby became the sync standby
>>> first will always be the sync standby while it is alive.
>
>> I think it is not necessary to change WalSndKill.
>> For example, we could add a value (e.g., sync_standby) that records
>> which wal sender is the active SYNC rep.
>> If sync_standby is already set and that standby is active, the server
>> does not look for another active standby.
>> Only if sync_standby is not set or it is inactive does the server look
>> for which server should be the active SYNC rep.
>> That way, we also avoid searching for the active SYNC rep every time
>> SyncRepReleaseWaiters() is called.
>    For '*' case, it will be okay, but when the user has given proper
> names, in that case even if there is any active Sync
>    Rep, it has to be changed based on priority.
>
>    I think here how to provide a fix, so that behavior gets changed to
> what you describe is a second priority work, first
>    is to show the value of use-case. Do you really know where people
> actually setup using '*' as configuration and if
>    yes, are they annoyed with current behavior?
>    I have thought about it, but couldn't imagine a scenario where people
> will be using '*' in their production
>    configurations; maybe it will be useful in test labs.

Thank you for your feedback.
I have implemented a patch that changes how the sync standby is chosen
among the walsenders, based on what I suggested above.
I added a sync_standby value to WalSndCtl. This value records which
walsender is the currently active sync rep.
The patch also handles the case where the user has given proper names.
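
To make the difference concrete, here is a rough Python model of the two behaviours discussed in this thread. It is only a sketch, not PostgreSQL code and not the attached patch: the names Slot, Master, sticky and scenario are hypothetical, and the rule "keep the remembered standby unless a standby with strictly higher priority is connected" is my assumption about how the proposal would treat explicit names.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Slot:
    name: Optional[str] = None  # None models a free slot (pid = 0)
    priority: int = 0           # lower number = higher priority


class Master:
    """Toy model of the walsender slot array (hypothetical, for illustration)."""

    def __init__(self, nslots: int, sticky: bool) -> None:
        self.slots = [Slot() for _ in range(nslots)]
        self.sticky = sticky                       # True models the proposal
        self.sync_standby: Optional[int] = None    # remembered sync slot index

    def connect(self, name: str, priority: int = 1) -> None:
        # A new walsender takes the first free slot.
        for slot in self.slots:
            if slot.name is None:
                slot.name, slot.priority = name, priority
                return
        raise RuntimeError("no free slot")

    def disconnect(self, name: str) -> None:
        # Free the slot; forget it if it was the remembered sync standby.
        for i, slot in enumerate(self.slots):
            if slot.name == name:
                slot.name, slot.priority = None, 0
                if self.sync_standby == i:
                    self.sync_standby = None

    def current_sync(self) -> Optional[str]:
        active = [(i, s) for i, s in enumerate(self.slots) if s.name is not None]
        if not active:
            self.sync_standby = None
            return None
        # Best priority wins; ties go to the lowest slot index.
        best_i, best = min(active, key=lambda p: (p[1].priority, p[0]))
        if self.sticky and self.sync_standby is not None:
            held = self.slots[self.sync_standby]
            # Assumption: keep the held standby unless 'best' is strictly better.
            if held.name is not None and held.priority <= best.priority:
                return held.name
        self.sync_standby = best_i
        return best.name


def scenario(sticky: bool, prio_aaa: int = 1, prio_bbb: int = 1) -> List[Optional[str]]:
    """AAA and BBB connect, AAA fails, AAA comes back; report sync standby each time."""
    m = Master(nslots=2, sticky=sticky)
    m.connect("AAA", prio_aaa)
    m.connect("BBB", prio_bbb)
    seen = [m.current_sync()]
    m.disconnect("AAA")
    seen.append(m.current_sync())
    m.connect("AAA", prio_aaa)
    seen.append(m.current_sync())
    return seen
```

With equal priorities (the '*' case), scenario(sticky=False) gives AAA, BBB, AAA, which is the re-take-over I described, while scenario(sticky=True) gives AAA, BBB, BBB. With different priorities, e.g. scenario(sticky=True, prio_aaa=1, prio_bbb=2), AAA still takes over again when it comes back.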

Regards,

-------
Sawada Masahiko

Attachment
