Thread: Condition to become the standby mode.

Condition to become the standby mode.

From: Sawada Masahiko
Date: 2013-07-26 13:19:34 +0900
Hi All,

When the slave server starts, it performs the following steps in
StartupXLOG() (a simplified sketch follows the list):
1. Read the latest checkpoint record's LSN from the pg_control file.
2. Try to fetch the checkpoint record from the pg_xlog directory first.
   (The server tries to read back to the prior checkpoint record taken
   from the pg_control file.)
3. If the record is not in pg_xlog, the slave server requests the
   checkpoint record from the primary server.
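
To make the flow concrete, here is a minimal, self-contained sketch of
that source selection (the names and values are invented for
illustration; this is not the actual xlog.c code):

#include <stdbool.h>
#include <stdio.h>

typedef enum { XLOG_FROM_PG_XLOG, XLOG_FROM_STREAM, XLOG_FROM_FAIL } WalSource;

static bool standby_mode = false;     /* true only under the conditions discussed below */
static bool found_in_pg_xlog = false; /* pretend the checkpoint record is not in pg_xlog */

static WalSource
choose_wal_source(void)
{
    if (found_in_pg_xlog)
        return XLOG_FROM_PG_XLOG; /* step 2: local read succeeded */
    if (standby_mode)
        return XLOG_FROM_STREAM;  /* step 3: request it from the primary */
    return XLOG_FROM_FAIL;        /* plain crash recovery gives up here */
}

int
main(void)
{
    /* With standby_mode off and no local record, recovery cannot proceed. */
    printf("source = %d\n", (int) choose_wal_source());
    return 0;
}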

Step #3 works only when StandbyMode is true, and for StandbyMode to
become true, the database cluster state in pg_control must be
"DB_SHUTDOWNED" (that is one of the conditions).
That is, the slave server can try to fetch the checkpoint record from
the master server only if the slave server previously shut down cleanly.

But there are cases where the slave server could catch up with the
primary server by fetching WAL records from it even though the slave
server did not shut down cleanly.

For example, when the two servers are connected over a relatively slow
link, it may be possible to re-establish replication without taking a
full backup of the primary server, but today we ignore that
possibility. I think that is a waste.

So I think it would be better for the slave server to try to request
WAL records from the primary server even if the database cluster state
in pg_control is "DB_IN_PRODUCTION". (Is that not enough?)

If this problem is solved, it becomes possible to fail back by
removing all the WAL records in pg_xlog before the server starts as
the slave server.
(And if we also use "synchronous_transfer", which I'm proposing, I
think we can reliably fail back without taking a full backup.)

Am I missing something? Please give me feedback.

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 96aceb9..c3ccd0e 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -6213,7 +6213,8 @@ StartupXLOG(void)
 		(ControlFile->minRecoveryPoint != InvalidXLogRecPtr ||
 		 ControlFile->backupEndRequired ||
 		 ControlFile->backupEndPoint != InvalidXLogRecPtr ||
-		 ControlFile->state == DB_SHUTDOWNED))
+		 ControlFile->state == DB_SHUTDOWNED ||
+		 ControlFile->state == DB_IN_PRODUCTION))
 	{
 		InArchiveRecovery = true;
 		if (StandbyModeRequested)

synchronous_transfer discussion:
http://www.postgresql.org/message-id/CAF8Q-Gy7xa60HwXc0MKajjkWFEbFDWTG=gGyu1KmT+s2xcQ-bw@mail.gmail.com

--
Regards,

-------
Sawada Masahiko



Re: Condition to become the standby mode.

From: Andres Freund
Date: Fri, Jul 26, 2013 3:08 PM
Hi,

On 2013-07-26 13:19:34 +0900, Sawada Masahiko wrote:
> When the slave server starts, it performs the following steps in StartupXLOG():
> 1. Read the latest checkpoint record's LSN from the pg_control file.
> 2. Try to fetch the checkpoint record from the pg_xlog directory first.
>    (The server tries to read back to the prior checkpoint record taken
>    from the pg_control file.)
> 3. If the record is not in pg_xlog, the slave server requests the
>    checkpoint record from the primary server.

> Step #3 works only when StandbyMode is true, and for StandbyMode to
> become true, the database cluster state in pg_control must be
> "DB_SHUTDOWNED" (that is one of the conditions).
> That is, the slave server can try to fetch the checkpoint record from
> the master server only if the slave server previously shut down cleanly.

It will also fetch the xlog from remote when we've found a backup label
(the read_backup_label() branch in StartupXLOG() will set StandbyMode to
true). That will be the case if we're recovering from something like a
base backup.
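
A compilable paraphrase of that branch (the stub types and values are
invented; see StartupXLOG() in xlog.c for the real logic):

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long long XLogRecPtr; /* stand-in for the real type */

static bool StandbyModeRequested = true; /* standby_mode = on in recovery.conf */
static bool StandbyMode = false;
static bool InArchiveRecovery = false;

/* Stub: succeeds when a backup_label file exists, as after a base backup. */
static bool
read_backup_label(XLogRecPtr *checkPointLoc)
{
    *checkPointLoc = 0x2000060ULL; /* hypothetical checkpoint location */
    return true;
}

int
main(void)
{
    XLogRecPtr checkPointLoc;

    if (read_backup_label(&checkPointLoc))
    {
        /* Restoring from a base backup: the label says where replay must
         * start, so entering standby mode is safe regardless of the
         * cluster state recorded in pg_control. */
        InArchiveRecovery = true;
        if (StandbyModeRequested)
            StandbyMode = true;
    }
    printf("checkpoint at %llX, standby mode %d\n",
           (unsigned long long) checkPointLoc, (int) StandbyMode);
    return 0;
}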

The reason we don't fetch from remote if there's no backup label *and*
the last checkpoint wasn't a shutdown checkpoint is that that's doing
*local* crash recovery. Usually it will happen because we needed to
restart after a crash (or an immediate restart, which is basically the
same).
There's one valid case where you can get into the situation otherwise as
well, which is that the *entire* database directory has been copied via
an atomic snapshot. But in that case you *need* to snapshot pg_xlog as
well.

Maybe there's another valid scenario?

> If this problem is solved, it becomes possible to fail back by
> removing all the WAL records in pg_xlog before the server starts as
> the slave server.
> (And if we also use "synchronous_transfer", which I'm proposing, I
> think we can reliably fail back without taking a full backup.)

I still have *massive* doubts about the concept. But anyway, if you want
to do so, you should generate a backup label that specifies the startup
location.
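
For illustration only, a hand-written backup label would have roughly
this shape (every value below, LSNs included, is made up):

START WAL LOCATION: 0/9000028 (file 000000010000000000000009)
CHECKPOINT LOCATION: 0/9000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2013-07-26 15:08:00 JST
LABEL: failback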

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Condition to become the standby mode.

From: Fujii Masao
Date: 2013-07-26 23:47:59 +0900
On Fri, Jul 26, 2013 at 3:08 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> Hi,
>
> On 2013-07-26 13:19:34 +0900, Sawada Masahiko wrote:
>> When the slave server starts, it performs the following steps in StartupXLOG():
>> 1. Read the latest checkpoint record's LSN from the pg_control file.
>> 2. Try to fetch the checkpoint record from the pg_xlog directory first.
>>    (The server tries to read back to the prior checkpoint record taken
>>    from the pg_control file.)
>> 3. If the record is not in pg_xlog, the slave server requests the
>>    checkpoint record from the primary server.
>
>> Step #3 works only when StandbyMode is true, and for StandbyMode to
>> become true, the database cluster state in pg_control must be
>> "DB_SHUTDOWNED" (that is one of the conditions).
>> That is, the slave server can try to fetch the checkpoint record from
>> the master server only if the slave server previously shut down cleanly.
>
> It will also fetch the xlog from remote when we've found a backup label
> (the read_backup_label() branch in StartupXLOG() will set StandbyMode to
> true). That will be the case if we're recovering from something like a
> base backup.
>
> The reason we don't fetch from remote if there's no backup label *and*
> the last checkpoint wasn't a shutdown checkpoint is that that's doing
> *local* crash recovery.

Yes, in that case, local crash recovery occurs first. After it ends
(i.e., we reach the end of the WAL available locally), we can enable
standby mode and fetch the WAL from the remote server.

>> If this problem is solved, it becomes possible to fail back by
>> removing all the WAL records in pg_xlog before the server starts as
>> the slave server.
>> (And if we also use "synchronous_transfer", which I'm proposing, I
>> think we can reliably fail back without taking a full backup.)
>
> I still have *massive* doubts about the concept. But anyway, if you want
> to do so, you should generate a backup label that specifies the startup
> location.

Generating a backup label doesn't seem to be enough, because there is
no backup-end WAL record and so we cannot know the consistency point.
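
A toy illustration of why: with no backup-end record, the minimum
recovery point stays invalid, so replay has no known point to compare
against (stand-in names, not the real CheckRecoveryConsistency() code):

#include <stdbool.h>
#include <stdio.h>

typedef unsigned long long XLogRecPtr; /* stand-in for the real type */
#define InvalidXLogRecPtr 0ULL

int
main(void)
{
    XLogRecPtr minRecoveryPoint = InvalidXLogRecPtr; /* unknown consistency point */
    XLogRecPtr lastReplayedEndRecPtr = 0x9000108ULL; /* hypothetical replay progress */

    /* Consistency can only be declared once replay passes a *known* point. */
    bool reachedConsistency = minRecoveryPoint != InvalidXLogRecPtr &&
                              minRecoveryPoint <= lastReplayedEndRecPtr;

    printf("reached consistency? %s\n", reachedConsistency ? "yes" : "no");
    return 0;
}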

Regards,

--
Fujii Masao



Re: Condition to become the standby mode.

From: Andres Freund
Date: Fri, Jul 26, 2013 11:55 PM
On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
> >> If this problem is solved, it becomes possible to fail back by
> >> removing all the WAL records in pg_xlog before the server starts as
> >> the slave server.
> >> (And if we also use "synchronous_transfer", which I'm proposing, I
> >> think we can reliably fail back without taking a full backup.)
> >
> > I still have *massive* doubts about the concept. But anyway, if you want
> > to do so, you should generate a backup label that specifies the startup
> > location.
>
> Generating a backup label doesn't seem to be enough, because there is
> no backup-end WAL record and so we cannot know the consistency point.

Since 9.2 we have allowed base backups to be generated from standbys;
the infrastructure built for that should be sufficient to pass the LSN
at which consistency is achieved.
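
Roughly, and with invented names: a backup taken from a server that is
in recovery cannot write a backup-end record, so its consistency LSN is
derived from the standby's minimum recovery point instead. A sketch:

#include <stdio.h>

typedef unsigned long long XLogRecPtr;

/* Invented stand-in: a standby advances this as it replays and flushes WAL. */
static XLogRecPtr standby_minRecoveryPoint = 0x9000108ULL;

/* Sketch: no backup-end record can be written on a standby, so the
 * consistency LSN for the backup comes from minRecoveryPoint. */
static XLogRecPtr
stop_point_for_standby_backup(void)
{
    return standby_minRecoveryPoint;
}

int
main(void)
{
    printf("consistency LSN: %llX\n",
           (unsigned long long) stop_point_for_standby_backup());
    return 0;
}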

Greetings,

Andres Freund

--
Andres Freund                     http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services



Re: Condition to become the standby mode.

From: Fujii Masao
Date: Sat, Jul 27, 2013 12:51 AM
On Fri, Jul 26, 2013 at 11:55 PM, Andres Freund <andres@2ndquadrant.com> wrote:
> On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
>> >> If this problem is solved, it becomes possible to fail back by
>> >> removing all the WAL records in pg_xlog before the server starts as
>> >> the slave server.
>> >> (And if we also use "synchronous_transfer", which I'm proposing, I
>> >> think we can reliably fail back without taking a full backup.)
>> >
>> > I still have *massive* doubts about the concept. But anyway, if you want
>> > to do so, you should generate a backup label that specifies the startup
>> > location.
>>
>> Generating a backup label doesn't seem to be enough, because there is
>> no backup-end WAL record and so we cannot know the consistency point.
>
> Since 9.2 we have allowed base backups to be generated from standbys;
> the infrastructure built for that should be sufficient to pass the LSN
> at which consistency is achieved.

Yeah, right. I'd forgotten about that.

Regards,

-- 
Fujii Masao



Re: Condition to become the standby mode.

From: Fujii Masao
Date:
On Sat, Jul 27, 2013 at 12:51 AM, Fujii Masao <masao.fujii@gmail.com> wrote:
> On Fri, Jul 26, 2013 at 11:55 PM, Andres Freund <andres@2ndquadrant.com> wrote:
>> On 2013-07-26 23:47:59 +0900, Fujii Masao wrote:
>>> >> If this problem is solved, it becomes possible to fail back by
>>> >> removing all the WAL records in pg_xlog before the server starts as
>>> >> the slave server.
>>> >> (And if we also use "synchronous_transfer", which I'm proposing, I
>>> >> think we can reliably fail back without taking a full backup.)
>>> >
>>> > I still have *massive* doubts about the concept. But anyway, if you want
>>> > to do so, you should generate a backup label that specifies the startup
>>> > location.
>>>
>>> Generating a backup label doesn't seem to be enough, because there is
>>> no backup-end WAL record and so we cannot know the consistency point.
>>
>> Since 9.2 we have allowed base backups to be generated from standbys;
>> the infrastructure built for that should be sufficient to pass the LSN
>> at which consistency is achieved.
>
> Yeah, right. I'd forgotten about that.

On second thought, using such an infrastructure seems insufficient for
this issue. When we start a new standby from a backup taken from another
standby, we use pg_control's minimum recovery point as the consistency point.

When we use this technique for the issue that Sawada raised, we run into
the problem that pg_control's minimum recovery point is zero, because it
is the pg_control of the master. The master doesn't update the minimum
recovery point in pg_control; only the standby updates it.
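
To illustrate with made-up numbers (the variables below are stand-ins,
not the real pg_control layout):

#include <stdio.h>

typedef unsigned long long XLogRecPtr;
#define InvalidXLogRecPtr 0ULL

/* Hypothetical pg_control excerpts: the master never advances its
 * minimum recovery point; only a standby does, during WAL replay. */
int
main(void)
{
    XLogRecPtr master_minRecoveryPoint = InvalidXLogRecPtr; /* always 0/0 */
    XLogRecPtr standby_minRecoveryPoint = 0x9000108ULL;     /* made up */

    printf("master  minRecoveryPoint: %llX\n",
           (unsigned long long) master_minRecoveryPoint);
    printf("standby minRecoveryPoint: %llX\n",
           (unsigned long long) standby_minRecoveryPoint);
    return 0;
}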

Therefore, we would need to find another way to address the issue.

Regards,

-- 
Fujii Masao