Re: Race condition in SyncRepGetSyncStandbysPriority - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Race condition in SyncRepGetSyncStandbysPriority
Date
Msg-id CA+fd4k5Oy2OuiFHD-uY_j5rS9RAoSei5G62E0goLEfK2K+1dAw@mail.gmail.com
Whole thread Raw
In response to Re: Race condition in SyncRepGetSyncStandbysPriority  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Race condition in SyncRepGetSyncStandbysPriority
List pgsql-hackers
On Tue, 14 Apr 2020 at 18:35, Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote:
>
> At Tue, 14 Apr 2020 13:06:14 +0900, Masahiko Sawada <masahiko.sawada@2ndquadrant.com> wrote in
> > On Tue, 14 Apr 2020 at 10:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >
> > > Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > > > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> > > >> What I think we should do about this is, essentially, to get rid of
> > > >> SyncRepGetSyncStandbys.  Instead, let's have each walsender advertise
> > > >> whether *it* believes that it is a sync standby, based on its last
> > > >> evaluation of the relevant GUCs.  This would be a bool that it'd
> > > >> compute and set alongside sync_standby_priority.  (Hm, maybe we'd not
> > >
> > > > Mmm..  SyncRepGetStandbyPriority returns the "priority" that a
> > > > walsender thinks it is at, among synchronous_standby_names.  Then to
> > > > decide "I am a sync standby" we need to know how many walsenders with
> > > > higher priority are alive now.  SyncRepGetSyncStandbyPriority does the
> > > > judgment now and suffers from the inconsistency of priority values.
> > >
> > > Yeah.  After looking a bit closer, I think that the current definition
> > > of sync_standby_priority (that is, as the result of local evaluation
> > > of SyncRepGetStandbyPriority()) is OK.  The problem is what we're doing
> > > with it.  I suggest that what we should do in SyncRepGetSyncRecPtr()
> > > is make one sweep across the WalSnd array, collecting PID,
> > > sync_standby_priority, *and* the WAL pointers from each valid entry.
> > > Then examine that data and decide which WAL value we need, without assuming
> > > that the sync_standby_priority values are necessarily totally consistent.
> > > But in any case we must examine each entry just once while holding its
> > > mutex, not go back to it later expecting it to still be the same.
>
> SyncRepGetSyncStandbysPriority() is runing holding SyncRepLock so
> sync_standby_priority of any walsender can be changed while the
> function is scanning welsenders. The issue is we already have
> inconsistent walsender information before we enter the function.  Thus
> how many times we scan on the array doesn't make any difference.
>
> I think we need to do one of the followings.
>
>  A) prevent SyncRepGetSyncStandbysPriority from being entered while
>     walsender priority is inconsistent.
>
>  B) make SyncRepGetSyncStandbysPriority be tolerant of priority
>     inconsistency.
>
>  C) protect walsender priority array from beinig inconsistent.
>
> The (B) is the band aids. To achieve A we need to central controller
> of priority config handling.  C is:
>
> > Can we have a similar approach of sync_standby_defined for
> > sync_standby_priority? That is, checkpionter is responsible for
> > changing sync_standby_priority of all walsenders when SIGHUP. That
> > way, all walsenders can see a consistent view of
> > sync_standby_priority. And when a walsender starts, it sets
> > sync_standby_priority by itself. The logic to decide who's a sync
> > standby doesn't change. SyncRepGetSyncRecPtr() gets all walsenders
> > having higher priority along with their WAL positions.
>
> Yeah, it works if we do , but the problem of that way is that to
> determin priority of walsenders, we need to know what walsenders are
> running. That is, when new walsender comes the process needs to aware
> of the arrival (or leaving) right away and reassign the priority of
> every wal senders again.

I think we don't need to reassign the priority when new walsender
comes or leaves. IIUC The priority is calculated based on only
synchronous_standby_names. Coming or leaving a walsender doesn't
affect other's priorities.

Regards,

-- 
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Fujii Masao
Date:
Subject: Re: documenting the backup manifest file format
Next
From: Dilip Kumar
Date:
Subject: Re: PATCH: logical_work_mem and logical streaming of largein-progress transactions