Re: Race condition in SyncRepGetSyncStandbysPriority - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Race condition in SyncRepGetSyncStandbysPriority
Date
Msg-id CA+fd4k4f+QrGOpEMR3GA6yHakCr6ssJB_EbRrdC-=miwEZf3mQ@mail.gmail.com
Whole thread Raw
In response to Re: Race condition in SyncRepGetSyncStandbysPriority  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Race condition in SyncRepGetSyncStandbysPriority
List pgsql-hackers
On Tue, 14 Apr 2020 at 10:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> >> What I think we should do about this is, essentially, to get rid of
> >> SyncRepGetSyncStandbys.  Instead, let's have each walsender advertise
> >> whether *it* believes that it is a sync standby, based on its last
> >> evaluation of the relevant GUCs.  This would be a bool that it'd
> >> compute and set alongside sync_standby_priority.  (Hm, maybe we'd not
>
> > Mmm..  SyncRepGetStandbyPriority returns the "priority" that a
> > walsender thinks it is at, among synchronous_standby_names.  Then to
> > decide "I am a sync standby" we need to know how many walsenders with
> > higher priority are alive now.  SyncRepGetSyncStandbyPriority does the
> > judgment now and suffers from the inconsistency of priority values.
>
> Yeah.  After looking a bit closer, I think that the current definition
> of sync_standby_priority (that is, as the result of local evaluation
> of SyncRepGetStandbyPriority()) is OK.  The problem is what we're doing
> with it.  I suggest that what we should do in SyncRepGetSyncRecPtr()
> is make one sweep across the WalSnd array, collecting PID,
> sync_standby_priority, *and* the WAL pointers from each valid entry.
> Then examine that data and decide which WAL value we need, without assuming
> that the sync_standby_priority values are necessarily totally consistent.
> But in any case we must examine each entry just once while holding its
> mutex, not go back to it later expecting it to still be the same.

Can we have a similar approach of sync_standby_defined for
sync_standby_priority? That is, checkpionter is responsible for
changing sync_standby_priority of all walsenders when SIGHUP. That
way, all walsenders can see a consistent view of
sync_standby_priority. And when a walsender starts, it sets
sync_standby_priority by itself. The logic to decide who's a sync
standby doesn't change. SyncRepGetSyncRecPtr() gets all walsenders
having higher priority along with their WAL positions.

>
> Another thing that I'm finding interesting is that I now see this is
> not at all new code.  It doesn't look like SyncRepGetSyncStandbysPriority
> has changed much since 2016.  So how come we didn't detect this problem
> long ago?  I searched the buildfarm logs for assertion failures in
> syncrep.c, looking back one year, and here's what I found:
>
>   sysname   |    branch     |      snapshot       |     stage     |
                       l
 
>
------------+---------------+---------------------+---------------+-------------------------------------------------------------------------------------------------------------------------------------------------------
>  nightjar   | REL_10_STABLE | 2019-08-13 23:04:41 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File:
"/pgbuild/root/REL_10_STABLE/pgsql.build/../pgsql/src/backend/replication/syncrep.c",Line: 940)
 
>  hoverfly   | REL9_6_STABLE | 2019-11-07 17:19:12 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File:
"syncrep.c",Line: 723)
 
>  hoverfly   | HEAD          | 2019-11-22 12:15:08 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
>  francolin  | HEAD          | 2020-01-16 23:10:06 | recoveryCheck | TRAP: FailedAssertion("false", File:
"/home/andres/build/buildfarm-francolin/HEAD/pgsql.build/../pgsql/src/backend/replication/syncrep.c",Line: 951)
 
>  hoverfly   | REL_11_STABLE | 2020-02-29 01:34:55 | recoveryCheck | TRAP: FailedAssertion("!(0)", File: "syncrep.c",
Line:946)
 
>  hoverfly   | REL9_6_STABLE | 2020-03-26 13:51:15 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File:
"syncrep.c",Line: 723)
 
>  hoverfly   | REL9_6_STABLE | 2020-04-07 21:52:07 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File:
"syncrep.c",Line: 723)
 
>  curculio   | HEAD          | 2020-04-11 18:30:21 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
>  sidewinder | HEAD          | 2020-04-11 18:45:39 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
>  curculio   | HEAD          | 2020-04-11 20:30:26 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
>  sidewinder | HEAD          | 2020-04-11 21:45:48 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
>  sidewinder | HEAD          | 2020-04-13 10:45:35 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
>  conchuela  | HEAD          | 2020-04-13 16:00:18 | recoveryCheck | TRAP: FailedAssertion("false", File:
"/home/pgbf/buildroot/HEAD/pgsql.build/../pgsql/src/backend/replication/syncrep.c",Line: 951)
 
>  sidewinder | HEAD          | 2020-04-13 18:45:34 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c",
Line:951)
 
> (14 rows)
>
> The line numbers vary in the back branches, but all of these crashes are
> at that same Assert.  So (a) yes, this does happen in the back branches,
> but (b) some fairly recent change has made it a whole lot more probable.
> Neither syncrep.c nor 007_sync_rep.pl have changed much in some time,
> so whatever the change was was indirect.  Curious.  Is it just timing?

Interesting. It's happening on certain animals, not all. Especially
tests with HEAD on sidewinder and curculio, which are NetBSD 7 and
OpenBSD 5.9 respectively, started to fail at a high rate since a
couple of days ago.


Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



pgsql-hackers by date:

Previous
From: Rajkumar Raghuwanshi
Date:
Subject: variation of row_number with parallel
Next
From: Pavel Stehule
Date:
Subject: Re: variation of row_number with parallel