Thread: Condition to become the standby mode.
Hi All,

When the slave server starts, it performs the following steps in
StartupXLOG():
1. Read the latest checkpoint record LSN from the pg_control file.
2. Try to fetch the checkpoint record from the pg_xlog directory first.
   (The server tries to read back as far as the prior checkpoint record
   recorded in pg_control.)
3. If the record is not in pg_xlog, the slave server requests it from
   the primary server.

Step #3 works only when StandbyMode is true, and for StandbyMode to be
set to true, the database cluster state must be "DB_SHUTDOWNED" (among
other conditions). That is, the slave can fetch the checkpoint record
from the master only after a clean shutdown.

But there are cases where the slave could catch up with the primary by
fetching WAL records from it even though the slave did not shut down
cleanly, for example when the two servers are connected over a
relatively slow link. Even though replication could be re-established
without taking a full backup of the primary server, we currently ignore
that possibility. I think that is a waste.

So I think it would be better for the slave server to request WAL
records from the primary even when the cluster state in pg_control is
"DB_IN_PRODUCTION" (or is that not enough?).

If this problem is solved, we could fail back by removing all the WAL
records in pg_xlog before starting the server as a slave.
(If we also use the "synchronous_transfer" feature I'm proposing, I
think we can fail back without taking a full backup.)

Am I missing something? Please give me feedback.

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 96aceb9..c3ccd0e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6213,7 +6213,8 @@ StartupXLOG(void)
 		(ControlFile->minRecoveryPoint != InvalidXLogRecPtr ||
 		 ControlFile->backupEndRequired ||
 		 ControlFile->backupEndPoint != InvalidXLogRecPtr ||
-		 ControlFile->state == DB_SHUTDOWNED))
+		 ControlFile->state == DB_SHUTDOWNED ||
+		 ControlFile->state == DB_IN_PRODUCTION))
 	{
 		InArchiveRecovery = true;
 		if (StandbyModeRequested)

synchronous_transfer discussion:
http://www.postgresql.org/message-id/CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=gGyu1KmT+s2xcQ-bw@mail.gmail.com

Regards,

-------
Sawada Masahiko
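For reference, the full test that this diff modifies, including the
opening condition the hunk does not show, looks roughly as follows.
This is a paraphrase of the 9.3-era StartupXLOG() logic, not a verbatim
quote; the comments explain why each disjunct counts as a safe starting
point:

/*
 * Paraphrase of the decision in StartupXLOG() (9.3-era xlog.c).
 * Standby mode is entered only when recovery was requested via
 * recovery.conf AND pg_control proves a safe starting point: a
 * minimum recovery point, an in-progress base backup, or a clean
 * shutdown.  A cluster still marked DB_IN_PRODUCTION (i.e. one
 * that crashed) falls through to local crash recovery instead,
 * so it never asks the primary for the missing WAL.
 */
if (ArchiveRecoveryRequested &&
    (ControlFile->minRecoveryPoint != InvalidXLogRecPtr ||
     ControlFile->backupEndRequired ||
     ControlFile->backupEndPoint != InvalidXLogRecPtr ||
     ControlFile->state == DB_SHUTDOWNED))
{
    InArchiveRecovery = true;
    if (StandbyModeRequested)
        StandbyMode = true;
}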
Hi,

On 2013-07-26 13:19:34 +0900, Sawada Masahiko wrote:
> When the slave server starts, it performs the following steps in
> StartupXLOG():
> 1. Read the latest checkpoint record LSN from the pg_control file.
> 2. Try to fetch the checkpoint record from the pg_xlog directory first.
>    (The server tries to read back as far as the prior checkpoint record
>    recorded in pg_control.)
> 3. If the record is not in pg_xlog, the slave server requests it from
>    the primary server.

> Step #3 works only when StandbyMode is true, and for StandbyMode to be
> set to true, the database cluster state must be "DB_SHUTDOWNED" (among
> other conditions). That is, the slave can fetch the checkpoint record
> from the master only after a clean shutdown.

It will also fetch the xlog from remote when we've found a backup label
(the read_backup_label() branch in StartupXLOG() will set StandbyMode
to true). That will be the case if we're recovering from something like
a base backup.

The reason we don't fetch from remote if there's no backup label *and*
the last checkpoint wasn't a shutdown checkpoint is that that's doing
*local* crash recovery. Usually that happens because we needed to
restart after a crash (or an immediate restart, which is basically the
same thing).

There's one other valid way to get into this situation: the *entire*
database directory has been copied via an atomic snapshot. But in that
case you *need* to snapshot pg_xlog as well. Maybe there's another
valid scenario?

> If this problem is solved, we could fail back by removing all the WAL
> records in pg_xlog before starting the server as a slave.
> (If we also use the "synchronous_transfer" feature I'm proposing, I
> think we can fail back without taking a full backup.)

I still have *massive* doubts about the concept. But anyway, if you
want to do so, you should generate a backup label that specifies the
startup location.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
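For readers following along: the backup label Andres refers to is the
plain-text backup_label file in the data directory. A 9.2/9.3-era one
looks roughly like this (the locations, file name, and timestamp below
are illustrative only):

START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: streamed
BACKUP FROM: standby
START TIME: 2013-07-26 15:08:44 JST
LABEL: pg_basebackup base backup

When StartupXLOG() finds this file, read_backup_label() supplies the
checkpoint location to start from, and standby mode can be entered
without passing the DB_SHUTDOWNED test discussed above.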
On Fri, Jul 26, 2013 at 3:08 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2013-07-26 13:19:34 +0900, Sawada Masahiko wrote:
>> When the slave server starts, it performs the following steps in
>> StartupXLOG():
>> 1. Read the latest checkpoint record LSN from the pg_control file.
>> 2. Try to fetch the checkpoint record from the pg_xlog directory first.
>>    (The server tries to read back as far as the prior checkpoint record
>>    recorded in pg_control.)
>> 3. If the record is not in pg_xlog, the slave server requests it from
>>    the primary server.
>
>> Step #3 works only when StandbyMode is true, and for StandbyMode to be
>> set to true, the database cluster state must be "DB_SHUTDOWNED" (among
>> other conditions). That is, the slave can fetch the checkpoint record
>> from the master only after a clean shutdown.
>
> It will also fetch the xlog from remote when we've found a backup label
> (the read_backup_label() branch in StartupXLOG() will set StandbyMode
> to true). That will be the case if we're recovering from something like
> a base backup.
>
> The reason we don't fetch from remote if there's no backup label *and*
> the last checkpoint wasn't a shutdown checkpoint is that that's doing
> *local* crash recovery.

Yes, in that case local crash recovery runs first. After it ends (i.e.,
once we reach the end of the WAL available locally), we can enable
standby mode and fetch the WAL from the remote server.

>> If this problem is solved, we could fail back by removing all the WAL
>> records in pg_xlog before starting the server as a slave.
>> (If we also use the "synchronous_transfer" feature I'm proposing, I
>> think we can fail back without taking a full backup.)
>
> I still have *massive* doubts about the concept. But anyway, if you
> want to do so, you should generate a backup label that specifies the
> startup location.

Generating a backup label doesn't seem to be enough, because there is
no backup-end WAL record and we cannot know the consistency point.

Regards,

--
Fujii Masao
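To spell out why the missing backup-end record matters: with an
ordinary base backup, recovery only declares consistency once it
replays the XLOG_BACKUP_END record written by pg_stop_backup(). A
condensed paraphrase of the xlog_redo() handling (9.3-era sources, not
verbatim; lsn is the end of the record being replayed):

else if (info == XLOG_BACKUP_END)
{
    XLogRecPtr  startpoint;

    memcpy(&startpoint, XLogRecGetData(record), sizeof(startpoint));

    if (ControlFile->backupStartPoint == startpoint)
    {
        /*
         * End of base backup reached: everything the backup could have
         * copied half-written has now been replayed, so the minimum
         * recovery point can be advanced and the backup-in-progress
         * markers cleared.
         */
        if (ControlFile->minRecoveryPoint < lsn)
            ControlFile->minRecoveryPoint = lsn;
        ControlFile->backupStartPoint = InvalidXLogRecPtr;
        ControlFile->backupEndRequired = false;
        UpdateControlFile();
    }
}

A data directory that was never bracketed by pg_start_backup() and
pg_stop_backup() has no such record in its WAL stream, so recovery has
no way to tell when it has replayed enough to be consistent.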
On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
> >> If this problem is solved, we could fail back by removing all the
> >> WAL records in pg_xlog before starting the server as a slave.
> >> (If we also use the "synchronous_transfer" feature I'm proposing,
> >> I think we can fail back without taking a full backup.)
> >
> > I still have *massive* doubts about the concept. But anyway, if you
> > want to do so, you should generate a backup label that specifies the
> > startup location.
>
> Generating a backup label doesn't seem to be enough, because there is
> no backup-end WAL record and we cannot know the consistency point.

Since 9.2 we allow generation of base backups from standbys; the
infrastructure built for that should be sufficient to pass the LSN at
which consistency is achieved.

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
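Concretely, that infrastructure records the consistency point without
any backup-end WAL record. Roughly paraphrased from the backup-label
handling in StartupXLOG() (9.2/9.3-era sources, not verbatim):

if (haveBackupLabel)
{
    ControlFile->backupStartPoint = checkPoint.redo;
    ControlFile->backupEndRequired = backupEndRequired;

    if (backupFromStandby)
    {
        if (dbstate_at_startup != DB_IN_ARCHIVE_RECOVERY)
            ereport(FATAL,
                    (errmsg("backup_label contains data inconsistent with control file")));

        /*
         * A backup taken from a standby has no XLOG_BACKUP_END record.
         * Instead, the minRecoveryPoint that the standby had written
         * into the copied pg_control serves as the consistency point.
         */
        ControlFile->backupEndPoint = ControlFile->minRecoveryPoint;
    }
}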
On Fri, Jul 26, 2013 at 11:55 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
>> >> If this problem is solved, we could fail back by removing all the
>> >> WAL records in pg_xlog before starting the server as a slave.
>> >> (If we also use the "synchronous_transfer" feature I'm proposing,
>> >> I think we can fail back without taking a full backup.)
>> >
>> > I still have *massive* doubts about the concept. But anyway, if you
>> > want to do so, you should generate a backup label that specifies the
>> > startup location.
>>
>> Generating a backup label doesn't seem to be enough, because there is
>> no backup-end WAL record and we cannot know the consistency point.
>
> Since 9.2 we allow generation of base backups from standbys; the
> infrastructure built for that should be sufficient to pass the LSN at
> which consistency is achieved.

Yeah, right. I'd forgotten about that.

Regards,

--
Fujii Masao
On Sat, Jul 27, 2013 at 12:51 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Jul 26, 2013 at 11:55 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
>>> >> If this problem is solved, we could fail back by removing all the
>>> >> WAL records in pg_xlog before starting the server as a slave.
>>> >> (If we also use the "synchronous_transfer" feature I'm proposing,
>>> >> I think we can fail back without taking a full backup.)
>>> >
>>> > I still have *massive* doubts about the concept. But anyway, if you
>>> > want to do so, you should generate a backup label that specifies
>>> > the startup location.
>>>
>>> Generating a backup label doesn't seem to be enough, because there is
>>> no backup-end WAL record and we cannot know the consistency point.
>>
>> Since 9.2 we allow generation of base backups from standbys; the
>> infrastructure built for that should be sufficient to pass the LSN at
>> which consistency is achieved.
>
> Yeah, right. I'd forgotten about that.

On second thought, that infrastructure doesn't seem to be enough for
this issue. When we start a new standby from a backup taken from
another standby, we use pg_control's minimum recovery point as the
consistency point. If we applied the same technique to the case Sawada
raised, we would hit the problem that the minimum recovery point in
pg_control is zero, because the pg_control file came from the master.
The master never updates the minimum recovery point in pg_control; only
a standby does. Therefore, we would need to find another way to solve
this issue.

Regards,

--
Fujii Masao
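Fujii's point is easy to check with pg_controldata (output below is
illustrative; the exact LSN will vary). On a master the field never
moves off zero, while on a standby it advances as WAL is replayed:

On the master:
    $ pg_controldata $PGDATA | grep 'Minimum recovery'
    Minimum recovery ending location:     0/0

On a standby:
    $ pg_controldata $PGDATA | grep 'Minimum recovery'
    Minimum recovery ending location:     0/90000D8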