Re: Fix of doc for synchronous_standby_names. - Mailing list pgsql-hackers

From Kyotaro HORIGUCHI
Subject Re: Fix of doc for synchronous_standby_names.
Date
Msg-id 20160426.104721.156695825.horiguchi.kyotaro@lab.ntt.co.jp
In response to Re: Fix of doc for synchronous_standby_names.  (Amit Langote <Langote_Amit_f8@lab.ntt.co.jp>)
List pgsql-hackers
Hi,


At Fri, 22 Apr 2016 17:27:07 +0900, Amit Langote <Langote_Amit_f8@lab.ntt.co.jp> wrote in
<5719E05B.4030701@lab.ntt.co.jp>
> 
> Horiguchi-san,
> 
> On 2016/04/22 14:21, Kyotaro HORIGUCHI wrote:
> > I came to think that both of you are misunderstanding how
> > synchronous standbys are chosen, so I'd like to clarify the
> > behavior.
> 
> I certainly had a different (and/or wrong) idea in mind about how this
> works.  Thanks a lot for clarifying.  I'm still a little confused, so
> please see below.

Sure thing.

> > Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote:
> >>>> But this particular sentence seems to be talking
> >>>> about what's the case for any given slot.
> >>>
> >>> Right, that's my reading also.
> > 
> > In SyncRepInitConfig, every walsender sets sync_standby_priority
> > by itself. The priority value is the index of its
> > application_name in s_s_names list (1 based).
> > 
> > When a walsender receives a feedback from walreceiver, it may
> > release backends waiting for certain LSN to be secured.
> > 
> > First, SyncRepGetSyncStandbys collects active standbys. Then it
> > loops from the highest priority value to pick up all of the
> > active standbys for each priority value until all of the seats
> > are occupied. Then SyncRepOldestSyncRepPtr calculates the oldest
> > LSNs only among the standbys SyncRepGetSyncStandbys
> > returned. Finally, it releases backends using the LSNs.
> > 
> > In short, every 'slot' in s_s_names can correspond to two or
> > more *synchronous* standbys.
> > 
> > The resulting behavior is as follows.
> > 
> >> I don't quite understand what 'sync slot' means. I assume it
> >> means a name in a replication set description, that is, 'nameN'
> >> in the following setting of s_s_names.
> >>
> >> '2(name1, name2, name3)'
> >>
> >> There may be two or more duplicates even in the
> >> single-sync-age. But only one synchronous standby was allowed so
> >> any 'sync slot' may have at least one matching synchronous
> >> standby in the single-sync-age. This is what I see in the
> >> sentence. Is this wrong?
> >>
> >> Now we can have multiple synchronous standbys, so, for example,
> >> if there are three standbys named 'name1', two of them are chosen
> >> as synchronous. This is a new behavior in the multi-sync-age and
> >> syncrep.c has been changed to do so.
> >>
> >> As a supplement, consider the following case.
> >>
> >> '5(name1, name2, name3)'
> >>
> >> and the following standbys
> >>
> >> (name1, name1, name2, name2, name3, name3)
> > 
> > This was a bad example,
> > 
> > (name1, name1, name2, name2, name2, name2, name3)
> > 
> > For this case, the following are chosen as synchronous standbys.
> > 
> > (name1, name1, name2, name2, name2)
> > 
> > Three of the four name2s are chosen, but which of them is an
> > implementation matter.
> > 
> >> # However, 5 for three names causes a warning..
> 
> OK, I see.  I tried to understand it (by carrying it out) using the
> following example.
> 
> synchronous_standby_names: '3 (aa, aa, bb, bb, cc)'
> 
> Individual names get assigned the following priorities based on their
> position in the list:
> 
> aa: 1, bb: 3, cc: 5

Yes. Precisely speaking, it is (aa:1, aa:2, bb:3, bb:4, cc:5), but
positions 2 and 4 will never have a matching walsender.
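
To make the numbering concrete, here is a minimal Python sketch of the
first-match rule described above. It is only an illustration, not the
code in syncrep.c: the function name standby_priority is hypothetical,
the case-insensitive comparison and the 0 return value for unlisted
names are assumptions, and special entries such as '*' are ignored.

```python
def standby_priority(names, application_name):
    """Return the 1-based position of the first entry in `names` that
    matches `application_name`, or 0 if nothing matches."""
    for position, name in enumerate(names, start=1):
        # Assumption: names are compared case-insensitively.
        if name.lower() == application_name.lower():
            # The first match wins, so a later duplicate entry
            # (position 2 or 4 below) is never reached.
            return position
    return 0  # assumption: 0 stands for "not listed"


names = ["aa", "aa", "bb", "bb", "cc"]
priorities = {app: standby_priority(names, app) for app in ("aa", "bb", "cc")}
print(priorities)
```

Because the scan stops at the first match, every standby named 'aa'
gets 1 and every standby named 'bb' gets 3, which is why 2 and 4 never
show up.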

> Then standbys connect one-by-one, say the following (host and
> application_name):
> 
> host1: aa
> host2: aa
> host3: bb
> host4: bb
> host5: cc
> 
> syncrep.c will assign following priorities (also shown is their sync status)
> 
> host1: aa: 1 (sync)
> host2: aa: 1 (sync)
> host3: bb: 3 (sync)
> host4: bb: 3 (potential)
> host5: cc: 5 (potential)

Which of host3 and host4 becomes sync is indeterminate from the user's
point of view; the one with the smaller index in the walsender slots
will be chosen.
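
The selection described in this thread can be sketched roughly as
follows. This is a simplified Python model, not the actual
SyncRepGetSyncStandbys: the function name choose_sync_standbys and the
(host, priority) tuples are hypothetical, and the input list order
stands in for walsender slot order.

```python
def choose_sync_standbys(walsenders, num_sync):
    """Pick up to `num_sync` synchronous standbys.

    `walsenders` is a list of (host, priority) pairs in walsender slot
    order; priority 0 means the standby is not listed in s_s_names.
    """
    candidates = [(host, prio) for host, prio in walsenders if prio > 0]
    chosen = []
    # Visit priority values from highest (smallest number) to lowest.
    for priority in sorted({prio for _, prio in candidates}):
        # Within one priority value, take standbys in slot order.
        for host, prio in candidates:
            if prio != priority:
                continue
            if len(chosen) == num_sync:
                return chosen  # all seats are occupied
            chosen.append(host)
    return chosen


# '3 (aa, aa, bb, bb, cc)' with the five hosts from the example.
hosts = [("host1", 1), ("host2", 1), ("host3", 3), ("host4", 3), ("host5", 5)]
sync_hosts = choose_sync_standbys(hosts, 3)
print(sync_hosts)

# '5 (name1, name2, name3)' with two name1, four name2 and one name3.
many = [("a1", 1), ("a2", 1), ("b1", 2), ("b2", 2), ("b3", 2), ("b4", 2), ("c1", 3)]
sync_many = choose_sync_standbys(many, 5)
print(sync_many)
```

In this model host3 wins over host4 only because it comes first in the
list, and in the second case both name1 standbys plus three of the four
name2 standbys are taken.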

> As also evident in pg_stat_replication:
> 
> SELECT application_name, sync_priority, sync_state FROM
> pg_stat_replication ORDER BY 2;
>  application_name | sync_priority | sync_state
> ------------------+---------------+------------
>  aa               |             1 | sync
>  aa               |             1 | sync
>  bb               |             3 | sync
>  bb               |             3 | potential
>  cc               |             5 | potential
> (5 rows)

I haven't confirmed that, but it should be so.

> Hm, I see (though I wondered if one of the 'aa's should have been assigned
> priority 2 and likewise for 'bb').

Honestly, it is either trust in users or a harmless bug that I have
overlooked :p But duplication within the list is not what the
documentation is talking about. We could prohibit such duplication if
we wanted to, but perhaps we won't.

> Following is the documentation paragraph under scrutiny:
> 
> """
> The name of a standby server for this purpose is the
> <varname>application_name</> setting of the standby, as set in the
> <varname>primary_conninfo</> of the standby's WAL receiver.  There is
> no mechanism to enforce uniqueness. In case of duplicates one of the
> matching standbys will be considered as higher priority, though
> exactly which one is indeterminate.
> """
> 
> Consider the name 'aa', it turns out that there are in fact 2 connected
> standbys with that name - so duplicate.  Although, both got assigned
> priority 1 (IOW, neither of them is considered higher priority than the
> other as the sentence above seems to suggest).  2/3 sync spots are now
> occupied.

Correct.

> Then let's look at 'bb'.  Here (maybe) it makes sense.  Two standbys named
> 'bb' show up, but only 1/3 sync spot is now left.  So, one of them is
> counted as sync whereas the other becomes a potential sync.  So, what is
> indeterminate is which (standby) becomes which (sync/potential).

Correct.

> However,
> it isn't which one gets *higher priority* that is indeterminate as the
> concerned sentence says (both got the same priority viz. 3).

I was also uneasy when I first read it, but the phrase says
'considered', not 'assigned'. So it can be read as 'one of the
matching standbys is chosen as synchronous *as if* it had a higher
priority than the other matching ones'. That wording may have been
caused by the preceding phrase, as I write below.

hosts: aa:1, aa:1, bb:1.5, bb:2, cc:3

If it cannot be read that way without racking one's brain, it should
be rewritten.

> I'm kind of confused.  Should the sentence say something else altogether?

That question may be beyond my ability :p

> By the way, I see the following in comment header of SyncRepGetSyncStandbys()
> 
>  *
>  * If there are multiple standbys with the same priority,
>  * the first one found is selected preferentially.
>  * The caller must hold SyncRepLock.
>  *
> 
> Maybe, the following is possible rewording/rewrite:

Good catch. That comment is actually wrong and also needs rewriting.

> """
> The name of individual standby servers for this purpose is the
> <varname>application_name</> setting of the standby, as set in the
> <varname>primary_conninfo</> of the standby's WAL receiver. There is
> no mechanism to enforce uniqueness.  Multiple standbys with a given
> application_name get assigned the same priority.  However, if only
> fewer synchronous spots are left, which ones among those standbys
> with the same priority become synchronous is indeterminate.
> """
> 
> So per Horiguchi-san's original complaint, multiplicity of *something*
> needed to be reflected after all.
> 
> Thoughts?

It seems to say what I intended, but 'spot' is an undefined term
and 'standbys with a given application_name' seems a bit
confusing.

- Multiple standbys with a given application_name get assigned
- the same priority.  However, if only fewer synchronous spots
- are left, which ones among those standbys with the same
- priority become synchronous is indeterminate.

+ All of the matching standbys have the same priority. However,
+ if not all of the standbys of that priority are needed to reach
+ the required number of synchronous standbys, which ones among
+ those standbys become synchronous is indeterminate.

Thoughts?


By the way, consider the following phrase just above the one in focus:

- Other standby servers appearing later in this list represent
- potential synchronous standbys. If any of the current
- synchronous standbys disconnects for whatever reason, it will
- be replaced immediately with the next-highest-priority standby.

This suggests that all connected standbys have unique
priorities, so the following phrase was forced to use the words
'have higher priority'.

+ Other standby servers appearing in this list represent
+ potential synchronous standbys. If any of the current
+ synchronous standbys disconnects for whatever reason, it will
+ be replaced immediately with a potential synchronous standby
+ with the highest priority.

It might not be needed, though.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




