Re: SyncRepGetSyncStandbysPriority() vs. SIGHUP - Mailing list pgsql-hackers
From | Kyotaro Horiguchi |
---|---|
Subject | Re: SyncRepGetSyncStandbysPriority() vs. SIGHUP |
Date | |
Msg-id | 20200207.125251.146972241588695685.horikyota.ntt@gmail.com Whole thread Raw |
In response to | SyncRepGetSyncStandbysPriority() vs. SIGHUP (Noah Misch <noah@leadboat.com>) |
List | pgsql-hackers |
At Wed, 5 Feb 2020 23:45:52 -0800, Noah Misch <noah@leadboat.com> wrote in > Buildfarm runs have triggered the assertion at the end of > SyncRepGetSyncStandbysPriority(): .. > On my development system, this delay injection reproduces the failure: > > --- a/src/backend/replication/syncrep.c > +++ b/src/backend/replication/syncrep.c > @@ -399,6 +399,8 @@ SyncRepInitConfig(void) > { > int priority; > > + pg_usleep(100 * 1000); Though I couldn't see that, but actually that can happen. That happens if: all potentially-sync standbys have lower priority than the number of syncrep list members. the number of sync standbys is short. the number of the potentially-sync standbys is enough. If all of the above are true, the while (priority <-= lowest_priority) doesn't loops and goes into the assertion. > SyncRepInitConfig() is the function responsible for updating, after SIGHUP, > the sync_standby_priority values that SyncRepGetSyncStandbysPriority() > consults. The assertion holds if each walsender's sync_standby_priority (in > shared memory) accounts for the latest synchronous_standby_names GUC value. > That ceases to hold for brief moments after a SIGHUP that changes the > synchronous_standby_names GUC value. Agreed. > I think the way to fix this is to nominate one process to update all > sync_standby_priority values after SIGHUP. That process should acquire > SyncRepLock once per ProcessConfigFile(), not once per walsender. If > walsender startup occurs at roughly the same time as a SIGHUP, the new > walsender should avoid computing sync_standby_priority based on a GUC value > different from the one used for the older walsenders. If we update the priority of all walsenders at once, the other walsenders may calculate required LSN using the old configuration with the new priority. I'm not sure the probability of happening this, but that causes similar situation. The priority calcuation happens for every standby repliy. So if there're some standbys with wrong priority, They will catch up at receiving the next standby reply. Thus just asuuming walsenders with out-of-range priority as non-sync standbys can "fix" it, I believe. pg_stat_get_wal_senders() reveals such inconsistent state but I don't think it's not worth addressing. > Would anyone like to fix this? I could add it to my queue, but it would wait > a year or more. The attached does that. Isn't it enough? # The more significant problem is I haven't suceeded to replay the # problem.. regards. -- Kyotaro Horiguchi NTT Open Source Software Center From ac080100b692700688054630d77190d63027798c Mon Sep 17 00:00:00 2001 From: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Date: Fri, 7 Feb 2020 12:40:56 +0900 Subject: [PATCH] Fix handling of tentative state after chainging synchronous_standby_names When reloading config after updating synchronous_standby_names, walsenders don't recalculate their sync-priority exactly at the same time. SyncRepGetSyncStandbysPriority may fail under such inconsistent situation. Fix that by assuming a walsender as non-sync if it has a priority number that is impossible against the synchronous_standby_names. --- src/backend/replication/syncrep.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/src/backend/replication/syncrep.c b/src/backend/replication/syncrep.c index c284103b54..2689421bcd 100644 --- a/src/backend/replication/syncrep.c +++ b/src/backend/replication/syncrep.c @@ -842,6 +842,16 @@ SyncRepGetSyncStandbysPriority(bool *am_sync) if (this_priority == 0) continue; + /* + * Walsenders don't reload configuration at the same time. A walsender + * may have out-of-range priority before updating by the reload. Treat + * them the same way as non-syncronous standbys. Priority is going to + * be fixed soon and we can work with consistent priorities at the next + * standby-reply message. + */ + if (this_priority > lowest_priority) + continue; + /* Must have a valid flush position */ if (XLogRecPtrIsInvalid(flush)) continue; -- 2.18.2
pgsql-hackers by date: