Re: Race condition in SyncRepGetSyncStandbysPriority - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Race condition in SyncRepGetSyncStandbysPriority
Date
Msg-id CA+fd4k7_wcpZSwK0Axq5ag1Hhb7cMgp2b_CbdZegRTSzZtQAFA@mail.gmail.com
Whole thread Raw
In response to Re: Race condition in SyncRepGetSyncStandbysPriority  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Race condition in SyncRepGetSyncStandbysPriority  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
List pgsql-hackers
On Thu, 16 Apr 2020 at 16:22, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Wed, 15 Apr 2020 11:31:49 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > > At Tue, 14 Apr 2020 16:32:40 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> > > +           stby->is_sync_standby = true;   /* might change below */
> >
> > > I'm uneasy with that.  In quorum mode all running standbys are marked
> > > as "sync" and that's bogus.
> >
> > I don't follow that?  The existing coding of SyncRepGetSyncStandbysQuorum
> > returns all the candidates in its list, so this is isomorphic to that.
>
> The existing code actully does that. On the other hand
> SyncRepGetSyncStandbysPriority returns standbys that *are known to be*
> synchronous, but *Quorum returns standbys that *can be* synchronous.
> What the two functions return are different from each other.  So it
> should be is_sync_standby for -Priority and is_sync_candidate for
> -Quorum.
>
> > Possibly a different name for the flag would be more suited?
> >
> > > On the other hand sync_standbys is already sorted in priority order so I think we can get rid of the member by
setting*am_sync as the follows.
 
> >
> > > SyncRepGetSyncRecPtr:
> > >   if (sync_standbys[i].is_me)
> > >   {
> > >       *am_sync = (i < SyncRepConfig->num_sync);
> > >       break;
> > >   }
> >
> > I disagree with this, it will change the behavior in the quorum case.
>
> Oops, you're right.  I find the whole thing there (and me) is a bit
> confusing. syncrep_method affects how some values (specifically
> am_sync and sync_standbys) are translated at several calling depths.
> And the *am_sync informs nothing in quorum mode.
>
> > In any case, a change like this will cause callers to know way more than
> > they ought to about the ordering of the array.  In my mind, the fact that
> > SyncRepGetSyncStandbysPriority is sorting the array is an internal
> > implementation detail; I do not want it to be part of the API.
>
> Anyway the am_sync and is_sync_standby is utterly useless in quorum
> mode.  That discussion is pretty persuasive if not, but actually the
> upper layers (SyncRepReleaseWaiters and SyncRepGetSyncRecPtr) referes
> to syncrep_method to differentiate the interpretation of the am_sync
> flag and sync_standbys list.  So anyway the difference is actually a
> part of API.
>
> After thinking some more, I concluded that some of the variables are
> wrongly named or considered, and redundant. The fucntion of am_sync is
> covered by got_recptr in SyncRepReleaseWaiters, so it's enough that
> SyncRepGetSyncRecPtr just reports to the caller whether the caller may
> release some of the waiter processes. This simplifies the related
> functions and make it (to me) clearer.
>
> Please find the attached.
>
>
> > (Apropos to that, I realized from working on this patch that there's
> > another, completely undocumented assumption in the existing code, that
> > the integer list will be sorted by walsender index for equal priorities.
> > I don't like that either, and not just because it's undocumented.)
>
> That seems accidentally. Sorting by priority is the disigned behavior
> and documented, in contrast, entries of the same priority are ordered
> in index order by accident and not documented, that means it can be
> changed anytime.  I think we don't define everyting in such detail.
>

This is just a notice; I'm reading your latest patch but it seems to
include unrelated changes:

$ git diff --stat
 src/backend/replication/syncrep.c           | 475

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-----------------------------------------------------------------------------------------------------------------------------------------------
 src/backend/replication/walsender.c         |  40 ++++++++++++++-----
 src/bin/pg_dump/compress_io.c               |  12 ++++++
 src/bin/pg_dump/pg_backup_directory.c       |  48 ++++++++++++++++++-----
 src/include/replication/syncrep.h           |  20 +++++++++-
 src/include/replication/walsender_private.h |  16 ++++----
 6 files changed, 274 insertions(+), 337 deletions(-)


Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Masahiko Sawada
Date:
Subject: Re: Vacuum o/p with (full 1, parallel 0) option throwing an error
Next
From: Rajkumar Raghuwanshi
Date:
Subject: ERROR: could not open file "pg_tblspc/ issue with replication setup.