Re:Re: [BUG] standby node can not provide service even it replays all log files - Mailing list pgsql-hackers

From Thunder
Subject Re:Re: [BUG] standby node can not provide service even it replays all log files
Date
Msg-id 583e4a3f.8416.16df37d5907.Coremail.thunder1@126.com
In response to Re: [BUG] standby node can not provide service even it replays all log files  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [BUG] standby node can not provide service even it replays all log files
Re:Re:Re: [BUG] standby node can not provide service even it replays all log files
List pgsql-hackers
Here is an updated patch.
1. The standby enters the STANDBY_SNAPSHOT_PENDING state when it replays the first XLOG_RUNNING_XACTS record and the subtransaction IDs have overflowed.
2. When the master node logs XLOG_RUNNING_XACTS, can we assume that all transaction IDs < oldestRunningXid are finished?
3. If we can assume this, then when we replay XLOG_RUNNING_XACTS and change standbyState to STANDBY_SNAPSHOT_PENDING, can we record oldestRunningXid in a shared variable, such as procArray->oldest_running_xid?
4. On the standby node, when GetSnapshotData is called and procArray->oldest_running_xid is valid, can we set xmin to procArray->oldest_running_xid?
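
For illustration only, the idea in points 3 and 4 can be sketched as a small standalone C fragment. This is not the attached patch and not PostgreSQL source code: MockStandbyState, MockProcArray, mock_replay_running_xacts and mock_snapshot_xmin are hypothetical names made up for this sketch, oldest_running_xid is the field proposed above, and the sketch assumes the answer to question 2 is "yes".

```c
#include <stdbool.h>
#include <stdint.h>

/* Standalone stand-ins for PostgreSQL types; assumptions for this sketch. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

typedef enum
{
    MOCK_STANDBY_INITIALIZED,
    MOCK_STANDBY_SNAPSHOT_PENDING,
    MOCK_STANDBY_SNAPSHOT_READY
} MockStandbyState;

/* Hypothetical shared state holding the proposed field. */
typedef struct
{
    TransactionId oldest_running_xid;   /* Invalid when not recorded */
} MockProcArray;

/*
 * Point 3: on replay of a running-xacts record, remember oldestRunningXid
 * when the subtransaction IDs overflowed and the state becomes PENDING.
 */
static MockStandbyState
mock_replay_running_xacts(MockProcArray *arr,
                          TransactionId oldestRunningXid,
                          bool subxid_overflow)
{
    if (subxid_overflow)
    {
        arr->oldest_running_xid = oldestRunningXid;
        return MOCK_STANDBY_SNAPSHOT_PENDING;
    }

    /* No overflow: the snapshot is complete, nothing extra to remember. */
    arr->oldest_running_xid = InvalidTransactionId;
    return MOCK_STANDBY_SNAPSHOT_READY;
}

/*
 * Point 4: when building a snapshot, use the recorded value as xmin if it
 * is valid; otherwise keep the xmin computed in the usual way.
 */
static TransactionId
mock_snapshot_xmin(const MockProcArray *arr, TransactionId computed_xmin)
{
    if (arr->oldest_running_xid != InvalidTransactionId)
        return arr->oldest_running_xid;
    return computed_xmin;
}
```

In this sketch, a record replayed with overflowed subtransaction IDs and oldestRunningXid = 100 leaves the state PENDING, and a later snapshot gets xmin = 100 instead of its computed value. Whether such a snapshot is actually safe depends on the answer to question 2.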

I would appreciate any suggestions on this issue.



At 2019-10-22 01:27:58, "Robert Haas" <robertmhaas@gmail.com> wrote:
>On Mon, Oct 21, 2019 at 4:13 AM Thunder <thunder1@126.com> wrote:
>> Can we fix this issue like the following patch?
>>
>> $git diff src/backend/access/transam/xlog.c
>> diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
>> index 49ae97d4459..0fbdf6fd64a 100644
>> --- a/src/backend/access/transam/xlog.c
>> +++ b/src/backend/access/transam/xlog.c
>> @@ -8365,7 +8365,7 @@ CheckRecoveryConsistency(void)
>> * run? If so, we can tell postmaster that the database is consistent now,
>> * enabling connections.
>> */
>> - if (standbyState == STANDBY_SNAPSHOT_READY &&
>> + if ((standbyState == STANDBY_SNAPSHOT_READY || standbyState == STANDBY_SNAPSHOT_PENDING) &&
>> !LocalHotStandbyActive &&
>> reachedConsistency &&
>> IsUnderPostmaster)
>
>I think that the issue you've encountered is design behavior. In
>other words, it's intended to work that way.
>
>The comments for the code you propose to change say that we can allow
>connections once we've got a valid snapshot. So presumably the effect
>of your change would be to allow connections even though we don't have
>a valid snapshot.
>
>That seems bad.
>
>--
>Robert Haas
>EnterpriseDB: http://www.enterprisedb.com
>The Enterprise PostgreSQL Company


 

Attachment
