Re: Race condition in SyncRepGetSyncStandbysPriority - Mailing list pgsql-hackers
From | Masahiko Sawada
Subject | Re: Race condition in SyncRepGetSyncStandbysPriority
Date |
Msg-id | CA+fd4k4f+QrGOpEMR3GA6yHakCr6ssJB_EbRrdC-=miwEZf3mQ@mail.gmail.com
In response to | Re: Race condition in SyncRepGetSyncStandbysPriority (Tom Lane <tgl@sss.pgh.pa.us>)
Responses | Re: Race condition in SyncRepGetSyncStandbysPriority
List | pgsql-hackers
On Tue, 14 Apr 2020 at 10:34, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Kyotaro Horiguchi <horikyota.ntt@gmail.com> writes:
> > At Sat, 11 Apr 2020 18:30:30 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote in
> >> What I think we should do about this is, essentially, to get rid of
> >> SyncRepGetSyncStandbys.  Instead, let's have each walsender advertise
> >> whether *it* believes that it is a sync standby, based on its last
> >> evaluation of the relevant GUCs.  This would be a bool that it'd
> >> compute and set alongside sync_standby_priority.  (Hm, maybe we'd not
>
> > Mmm.. SyncRepGetStandbyPriority returns the "priority" that a
> > walsender thinks it is at, among synchronous_standby_names.  Then to
> > decide "I am a sync standby" we need to know how many walsenders with
> > higher priority are alive now.  SyncRepGetSyncStandbyPriority does the
> > judgment now and suffers from the inconsistency of priority values.
>
> Yeah.  After looking a bit closer, I think that the current definition
> of sync_standby_priority (that is, as the result of local evaluation
> of SyncRepGetStandbyPriority()) is OK.  The problem is what we're doing
> with it.  I suggest that what we should do in SyncRepGetSyncRecPtr()
> is make one sweep across the WalSnd array, collecting PID,
> sync_standby_priority, *and* the WAL pointers from each valid entry.
> Then examine that data and decide which WAL value we need, without assuming
> that the sync_standby_priority values are necessarily totally consistent.
> But in any case we must examine each entry just once while holding its
> mutex, not go back to it later expecting it to still be the same.

Can we take an approach similar to the one used for sync_standby_defined,
but for sync_standby_priority? That is, the checkpointer would be
responsible for updating sync_standby_priority of all walsenders when it
receives SIGHUP. That way, all walsenders see a consistent view of
sync_standby_priority. And when a walsender starts, it sets
sync_standby_priority by itself. The logic to decide who's a sync standby
doesn't change: SyncRepGetSyncRecPtr() gets the walsenders with the
highest priorities along with their WAL positions.
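For what it's worth, the "examine each entry just once while holding its
mutex" part could look roughly like the standalone sketch below. This is
only a model, not actual PostgreSQL code: SlotModel, Candidate, and
collect_candidates() are made-up stand-ins for the WalSnd slots, their
per-entry spinlocks, and the collection step in SyncRepGetSyncRecPtr();
a pthread mutex stands in for the spinlock and a plain uint64_t for an
XLogRecPtr.

/*
 * Standalone model of the "one sweep" approach: copy pid, priority, and
 * flush position out of each slot exactly once while holding that slot's
 * lock, then reason only about the copied snapshot.
 */
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>

#define NUM_SLOTS 8

typedef struct
{
    pthread_mutex_t mutex;                  /* stand-in for the per-slot spinlock */
    int             pid;                    /* 0 = slot unused */
    int             sync_standby_priority;  /* 0 = not a sync candidate */
    uint64_t        flush;                  /* stand-in for an XLogRecPtr */
} SlotModel;

typedef struct
{
    int      pid;
    int      priority;
    uint64_t flush;
} Candidate;

static SlotModel slots[NUM_SLOTS];

/*
 * Visit every slot once, copying its values under the lock.  Because no
 * slot is re-read later, a concurrent priority update cannot make the
 * returned snapshot internally inconsistent.
 */
static int
collect_candidates(Candidate *out)
{
    int n = 0;

    for (int i = 0; i < NUM_SLOTS; i++)
    {
        pthread_mutex_lock(&slots[i].mutex);
        if (slots[i].pid != 0 && slots[i].sync_standby_priority > 0)
        {
            out[n].pid = slots[i].pid;
            out[n].priority = slots[i].sync_standby_priority;
            out[n].flush = slots[i].flush;
            n++;
        }
        pthread_mutex_unlock(&slots[i].mutex);
    }
    return n;
}

int
main(void)
{
    Candidate   cand[NUM_SLOTS];
    uint64_t    synced = 0;

    for (int i = 0; i < NUM_SLOTS; i++)
        pthread_mutex_init(&slots[i].mutex, NULL);

    /* two fake standbys; duplicate priorities are tolerated, not asserted away */
    slots[0].pid = 101; slots[0].sync_standby_priority = 1; slots[0].flush = 5000;
    slots[1].pid = 102; slots[1].sync_standby_priority = 1; slots[1].flush = 4800;

    int n = collect_candidates(cand);

    /* pick the smallest flush position among the snapshotted candidates */
    for (int i = 0; i < n; i++)
        if (i == 0 || cand[i].flush < synced)
            synced = cand[i].flush;

    printf("%d candidate(s), safe flush position %llu\n",
           n, (unsigned long long) synced);
    return 0;
}

The real implementation would of course still need the copied priority
values to decide which candidates actually count as synchronous under
synchronous_standby_names; the point of the sketch is only that the
decision is made from a single snapshot taken in one pass, rather than
by revisiting slots and expecting them to be unchanged.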
>
> Another thing that I'm finding interesting is that I now see this is
> not at all new code.  It doesn't look like SyncRepGetSyncStandbysPriority
> has changed much since 2016.  So how come we didn't detect this problem
> long ago?
>
> I searched the buildfarm logs for assertion failures in
> syncrep.c, looking back one year, and here's what I found:
>
>   sysname   |    branch     |      snapshot       |     stage     | l
> ------------+---------------+---------------------+---------------+------------------------------------------------------------------------
>  nightjar   | REL_10_STABLE | 2019-08-13 23:04:41 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "/pgbuild/root/REL_10_STABLE/pgsql.build/../pgsql/src/backend/replication/syncrep.c", Line: 940)
>  hoverfly   | REL9_6_STABLE | 2019-11-07 17:19:12 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "syncrep.c", Line: 723)
>  hoverfly   | HEAD          | 2019-11-22 12:15:08 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
>  francolin  | HEAD          | 2020-01-16 23:10:06 | recoveryCheck | TRAP: FailedAssertion("false", File: "/home/andres/build/buildfarm-francolin/HEAD/pgsql.build/../pgsql/src/backend/replication/syncrep.c", Line: 951)
>  hoverfly   | REL_11_STABLE | 2020-02-29 01:34:55 | recoveryCheck | TRAP: FailedAssertion("!(0)", File: "syncrep.c", Line: 946)
>  hoverfly   | REL9_6_STABLE | 2020-03-26 13:51:15 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "syncrep.c", Line: 723)
>  hoverfly   | REL9_6_STABLE | 2020-04-07 21:52:07 | recoveryCheck | TRAP: FailedAssertion("!(((bool) 0))", File: "syncrep.c", Line: 723)
>  curculio   | HEAD          | 2020-04-11 18:30:21 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
>  sidewinder | HEAD          | 2020-04-11 18:45:39 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
>  curculio   | HEAD          | 2020-04-11 20:30:26 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
>  sidewinder | HEAD          | 2020-04-11 21:45:48 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
>  sidewinder | HEAD          | 2020-04-13 10:45:35 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
>  conchuela  | HEAD          | 2020-04-13 16:00:18 | recoveryCheck | TRAP: FailedAssertion("false", File: "/home/pgbf/buildroot/HEAD/pgsql.build/../pgsql/src/backend/replication/syncrep.c", Line: 951)
>  sidewinder | HEAD          | 2020-04-13 18:45:34 | recoveryCheck | TRAP: FailedAssertion("false", File: "syncrep.c", Line: 951)
> (14 rows)
>
> The line numbers vary in the back branches, but all of these crashes are
> at that same Assert.  So (a) yes, this does happen in the back branches,
> but (b) some fairly recent change has made it a whole lot more probable.
> Neither syncrep.c nor 007_sync_rep.pl have changed much in some time,
> so whatever the change was was indirect.  Curious.  Is it just timing?

Interesting. It's happening on certain animals, not all. In particular,
tests with HEAD on sidewinder and curculio, which run NetBSD 7 and
OpenBSD 5.9 respectively, have started failing at a high rate in the
last couple of days.

Regards,

--
Masahiko Sawada            http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services