Thread: Online base backup from the hot-standby

Online base backup from the hot-standby

From
Jun Ishiduka
Date:
( Quotation from
  http://archives.postgresql.org/pgsql-hackers/2011-05/msg01396.php )

>  STEP1: Make startup process to acquire backup-end-position from
>         not only backup-end record but also backup-history-file .
>           * startup process allows to acquire backup-end-position
>             from backup-history-file .


I have created a patch that implements the above.

Please check it.


--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------

Attachment

Re: Online base backup from the hot-standby

From
Heikki Linnakangas
Date:
On 14.06.2011 09:03, Jun Ishiduka wrote:
> ( Quotation from
>    http://archives.postgresql.org/pgsql-hackers/2011-05/msg01396.php )
> 
>>   STEP1: Make startup process to acquire backup-end-position from
>>          not only backup-end record but also backup-history-file .
>>            * startup process allows to acquire backup-end-position
>>              from backup-history-file .
> 
> 
> I have created a patch that implements the above.

I still think that's headed in the wrong direction.
(http://archives.postgresql.org/pgsql-hackers/2011-05/msg01405.php)

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> I still think that's headed in the wrong direction.
> (http://archives.postgresql.org/pgsql-hackers/2011-05/msg01405.php)

Please look at these mails, and explain why you think this is headed in
the wrong direction.
(http://archives.postgresql.org/pgsql-hackers/2011-06/msg00209.php)
(http://archives.postgresql.org/pgsql-hackers/2011-05/msg01566.php)


--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Steve Singer
Date:
On 11-06-14 02:52 AM, Jun Ishiduka wrote:
>> I still think that's headed in the wrong direction.
>> (http://archives.postgresql.org/pgsql-hackers/2011-05/msg01405.php)
> Please look at these mails, and explain why you think this is headed in
> the wrong direction.
> (http://archives.postgresql.org/pgsql-hackers/2011-06/msg00209.php)
> (http://archives.postgresql.org/pgsql-hackers/2011-05/msg01566.php)
>
>

Jun, I've been reviewing these threads as a start to reviewing your
patch (I haven't yet looked at the patch).

I *think* the concern is that

1) Today you can do a backup by just calling pg_start_backup('x'); copy
the data directory and
pg_stop_backup(); You do not need to use pg_basebackup to create a
backup. The solution you are proposing would require pg_basebackup to be
used to build backups from standby servers.

2) If I run pg_basebackup but do not specify '-x' then no pg_xlog
segments are included in the output. The relevant pg_xlog segments are
in the archive from the master. I can see situations where you are
already copying the archive to the remote site that the new standby will
be created in so you don't want to have to copy the pg_xlog segments
twice over your network.

What Heikki is proposing will work both when you aren't using
pg_basebackup (as long as the output of pg_stop_backup() is somehow
captured in a way that it can be read) and will also work with
pg_basebackup when '-x' isn't specified.

Steve





Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> 1) Today you can do a backup by just calling pg_start_backup('x'); copy
> the data directory and
> pg_stop_backup(); You do not need to use pg_basebackup to create a
> backup. The solution you are proposing would require pg_basebackup to be
> used to build backups from standby servers.

YES.


> 2) If I run pg_basebackup but do not specify '-x' then no pg_xlog
> segments are included in the output. The relevant pg_xlog segments are
> in the archive from the master. I can see situations where you are
> already copying the archive to the remote site that the new standby will
> be created in so you don't want to have to copy the pg_xlog segments
> twice over your network.

No, I don't think this matters, because the behavior is the same as in 9.1.
Please see the "Before:" part of the answer below.


> What Heikki is proposing will work both when you aren't using
> pg_basebackup (as long as the output of pg_stop_backup() is somehow
> captured in a way that it can be read) and will also work with
> pg_basebackup when '-x' isn't specified.

Having received this mail, I realize that I misunderstood what
Heikki is proposing.

My understanding:
  Before:
    * I thought Heikki was proposing to "execute SQL (pg_start_backup('x');
      copy the data directory; pg_stop_backup();) from the standby server
      to the primary server".
      -> I disliked this.
  Now:
    * Heikki's proposal should work both without pg_basebackup and
      without specifying -x.
      So he thinks we should:
        * Use the output of pg_stop_backup().
        * Not use the backup history file.
 

Right?


--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Steve Singer
Date:
On 11-06-23 02:41 AM, Jun Ishiduka wrote:
> Having received this mail, I realize that I misunderstood what
> Heikki is proposing.
>
> My understanding:
>   Before:
>     * I thought Heikki was proposing to "execute SQL (pg_start_backup('x');
>       copy the data directory; pg_stop_backup();) from the standby server
>       to the primary server".
>       -> I disliked this.
>   Now:
>     * Heikki's proposal should work both without pg_basebackup and
>       without specifying -x.
>       So he thinks we should:
>         * Use the output of pg_stop_backup().
>         * Not use the backup history file.
>
> Right?
>

What I think he is proposing would not require using pg_stop_backup()
but you could also modify pg_stop_backup() to work as well.

What do you think of this idea?

Do you see how the patch can be reworked to accomplish this?






Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> What I think he is proposing would not require using pg_stop_backup()
> but you could also modify pg_stop_backup() to work as well.
> 
> What do you think of this idea?
> 
> Do you see how the patch can be reworked to accomplish this?

Logic that does not use pg_stop_backup() would be difficult,
because pg_stop_backup() is used to identify minRecoveryPoint.

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Steve Singer
Date:
On 11-06-24 12:41 AM, Jun Ishiduka wrote:
>
> Logic that does not use pg_stop_backup() would be difficult,
> because pg_stop_backup() is used to identify minRecoveryPoint.
>

Considering everything that has been discussed on this thread so far.

Do you still think your patch is the best way to accomplish base backups
from standby servers?
If not what changes do you think should be made?


Steve




Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> Considering everything that has been discussed on this thread so far.
> 
> Do you still think your patch is the best way to accomplish base backups
> from standby servers?
> If not what changes do you think should be made?

I reconsider the way to not use pg_stop_backup().

Process of online base backup on standby server:
 1. pg_start_backup('x');
 2. copy the data directory
 3. copy *pg_control*

Behavior during restore:
 * read "Minimum recovery ending location" from the copied pg_control.
 * use that value for the same purpose as the end-of-backup location.
   -> When the value is 0/0, this step is skipped.
      That is the case where the backup was taken from the master server.
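
The restore-time rule described above can be sketched as a small,
self-contained C model. This is only an illustration and is not code from
the patch: WalPtr, BackupEndSource, make_wal_ptr, wal_ptr_is_zero and
choose_backup_end_source are hypothetical names, and WalPtr merely imitates
a two-part WAL location such as 0/2000020.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL location printed as "xlogid/xrecoff". */
typedef struct
{
    uint32_t xlogid;   /* part before the slash */
    uint32_t xrecoff;  /* part after the slash */
} WalPtr;

/* Hypothetical: where recovery learns the end-of-backup location from. */
typedef enum
{
    BACKUP_END_FROM_WAL_RECORD,  /* backup from the master: wait for the backup-end record */
    BACKUP_END_FROM_PG_CONTROL   /* backup from a standby: use the copied minRecoveryPoint */
} BackupEndSource;

WalPtr
make_wal_ptr(uint32_t xlogid, uint32_t xrecoff)
{
    WalPtr p = {xlogid, xrecoff};
    return p;
}

bool
wal_ptr_is_zero(WalPtr p)
{
    return p.xlogid == 0 && p.xrecoff == 0;
}

/*
 * Decide how the end of the backup is detected, given the
 * "Minimum recovery ending location" read from the copied pg_control.
 * A value of 0/0 means the backup was taken from the master.
 */
BackupEndSource
choose_backup_end_source(WalPtr copied_min_recovery_point)
{
    if (wal_ptr_is_zero(copied_min_recovery_point))
        return BACKUP_END_FROM_WAL_RECORD;
    return BACKUP_END_FROM_PG_CONTROL;
}
```

With this model, a copied location of 0/0 selects the existing backup-end
record path, and any other location selects the value from pg_control.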
 


--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Steve Singer
Date:
On 11-06-28 01:52 AM, Jun Ishiduka wrote:
>> Considering everything that has been discussed on this thread so far.
>>
>> Do you still think your patch is the best way to accomplish base backups
>> from standby servers?
>> If not what changes do you think should be made?
> I reconsider the way to not use pg_stop_backup().
>
> Process of online base backup on standby server:
>  1. pg_start_backup('x');
>  2. copy the data directory
>  3. copy *pg_control*
>
> Behavior during restore:
>  * read "Minimum recovery ending location" from the copied pg_control.
>  * use that value for the same purpose as the end-of-backup location.
>    -> When the value is 0/0, this step is skipped.
>       That is the case where the backup was taken from the master server.
>

The behaviour you describe above sounds okay to me, if someone else sees
issues with this then they should speak up (ideally before you go off
and write a new patch)

I'm going to consolidate my other comments below so this can act as a
more complete review.

Usability Review
-----------------
We would like to be able to perform base backups from slaves without
having to call pg_start_backup() on the master. We cannot currently do
this. The patch tries to do this. From a usability point of view it
would be nice if this could be done both manually with pg_start_backup()
and with pg_basebackup.

The main issue I have with the first patch you submitted is that it does
not work for cases where you don't want to call pg_basebackup or you
don't want to include the WAL segments in the output of pg_basebackup.
There are a number of these use-cases (examples include the WAL being
already available on an archive server, or wanting to use
filesystem/disk array level snapshots instead of tar). I feel your
above proposal to copy the control file as the last step in the
base backup and then get the minimum recovery ending location from it
solves these issues. It would be nice if things 'worked' when calling
pg_basebackup against the slave (maybe by having perform_base_backup()
resend the control file after it has sent the log?).

Feature test & Performance review
-----------------
Skipped since a new patch is coming

Coding Review
------------------
I didn't look too closely at the code since a new patch is coming that
might change a lot of it. I did like how you added comments to most of
the larger code blocks that you added.


Architecture Review
-----------------------
There were some concerns with your original approach but using the
control file was suggested by Heikki and seems sound to me.


I'm marking this 'waiting for author'. If you don't think you'll be
able to get a reworked patch out during this commitfest, then you should
move it to 'returned with feedback'.

Steve





Re: Online base backup from the hot-standby

From
Fujii Masao
Date:
2011/6/28 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
>
>> Considering everything that has been discussed on this thread so far.
>>
>> Do you still think your patch is the best way to accomplish base backups
>> from standby servers?
>> If not what changes do you think should be made?
>
> I reconsider the way to not use pg_stop_backup().
>
> Process of online base backup on standby server:
>  1. pg_start_backup('x');
>  2. copy the data directory
>  3. copy *pg_control*

Who deletes the backup_label file created by pg_start_backup()?
Isn't pg_stop_backup() required to do that?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Online base backup from the hot-standby

From
Magnus Hagander
Date:
On Jun 30, 2011 5:59 AM, "Fujii Masao" <masao.fujii@gmail.com> wrote:
>
> 2011/6/28 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
> >
> >> Considering everything that has been discussed on this thread so far.
> >>
> >> Do you still think your patch is the best way to accomplish base backups
> >> from standby servers?
> >> If not what changes do you think should be made?
> >
> > I reconsider the way to not use pg_stop_backup().
> >
> > Process of online base backup on standby server:
> >  1. pg_start_backup('x');
> >  2. copy the data directory
> >  3. copy *pg_control*
>
> Who deletes the backup_label file created by pg_start_backup()?
> Isn't pg_stop_backup() required to do that?

You need it to take the system out of backup mode as well, don't you?

/Magnus

Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> > > Process of online base backup on standby server:
> > >  1. pg_start_backup('x');
> > >  2. copy the data directory
> > >  3. copy *pg_control*
> > 
> > Who deletes the backup_label file created by pg_start_backup()?
> > Isn't pg_stop_backup() required to do that?
>
> You need it to take the system out of backup mode as well, don't you?

Certainly so.

Add to the above process: 4. pg_stop_backup();

But I have not yet considered cases such as promotion while in backup mode.
I need to think about that a little more.

For this commitfest, the goal of the patch is to make recovery possible
using the "Minimum recovery ending location" in the control file.

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> For this commitfest, the goal of the patch is to make recovery possible
> using the "Minimum recovery ending location" in the control file.

Done.

Regards.

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------

Attachment

Re: Online base backup from the hot-standby

From
Fujii Masao
Date:
2011/7/1 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
>
>> For this commitfest, the goal of the patch is to make recovery possible
>> using the "Minimum recovery ending location" in the control file.
>
> Done.

When the standby restarts after it crashes during recovery, it always
checks whether recovery has reached the backup end location by
using minRecoveryPoint even though the standby doesn't start from
the backup. This looks odd.

-        XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
+        (XLogRecPtrIsInvalid(ControlFile->backupStartPoint) ||
+         reachedControlMinRecoveryPoint == true))

Is the flag 'reachedControlMinRecoveryPoint' really required? When recovery
reaches minRecoveryPoint, ControlFile->backupStartPoint is reset to zero. So
we can check whether recovery has reached minRecoveryPoint or not by only
doing XLogRecPtrIsInvalid(ControlFile->backupStartPoint). No?

Shouldn't we check whether recovery has reached minRecoveryPoint before calling
CheckRecoveryConsistency() after reading a new WAL record? Otherwise,
even if recovery has reached minRecoveryPoint, the standby cannot consider
itself to be in a consistent state until it reads a new WAL record.

+                        if (XLByteLT(ControlFile->minRecoveryPoint, EndRecPtr))
+                            ControlFile->minRecoveryPoint = EndRecPtr;

Why does ControlFile->minRecoveryPoint need to be set to EndRecPtr?

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> When the standby restarts after it crashes during recovery, it always
> checks whether recovery has reached the backup end location by
> using minRecoveryPoint even though the standby doesn't start from
> the backup. This looks odd.

Certainly.

But in this case, the state from before recovery started is lost.
Therefore, postgres cannot tell whether the backup was taken from the
standby server or from the master.

What should we do?
Should we use pg_control?
 Ex.
  * Add 'Where to get backup' to pg_control. (default 'none')
  * When recovery starts, check whether it is 'none'.
     * When minRecoveryPoint equals 0/0, change it to 'master'.
     * When minRecoveryPoint does not equal 0/0, change it to 'slave'.
  * When recovery reaches its end, change it back to 'none'.
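
The proposed field can be sketched as a tiny state machine in C. This is a
minimal illustration under assumptions, not the patch itself: BackupSource,
WalPtr, backup_source_on_recovery_start and backup_source_on_recovery_end
are hypothetical names. The value is only decided while the stored field is
'none', so a restart after a crash during recovery keeps the earlier answer
even though minRecoveryPoint is no longer 0/0 by then.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a WAL location printed as "xlogid/xrecoff". */
typedef struct
{
    uint32_t xlogid;
    uint32_t xrecoff;
} WalPtr;

/* Hypothetical values for the proposed 'Where to get backup' field. */
typedef enum
{
    BACKUP_SOURCE_NONE,
    BACKUP_SOURCE_MASTER,
    BACKUP_SOURCE_SLAVE
} BackupSource;

WalPtr
make_wal_ptr(uint32_t xlogid, uint32_t xrecoff)
{
    WalPtr p = {xlogid, xrecoff};
    return p;
}

bool
wal_ptr_is_zero(WalPtr p)
{
    return p.xlogid == 0 && p.xrecoff == 0;
}

/* Value to store in pg_control when recovery starts. */
BackupSource
backup_source_on_recovery_start(BackupSource stored, WalPtr min_recovery_point)
{
    /* Already decided by an earlier, interrupted recovery: keep it. */
    if (stored != BACKUP_SOURCE_NONE)
        return stored;
    if (wal_ptr_is_zero(min_recovery_point))
        return BACKUP_SOURCE_MASTER;
    return BACKUP_SOURCE_SLAVE;
}

/* Value to store in pg_control when recovery reaches its end. */
BackupSource
backup_source_on_recovery_end(void)
{
    return BACKUP_SOURCE_NONE;
}
```

The third case in the checks below is the one that cannot be told apart
from a standby backup when only minRecoveryPoint is consulted.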
 


> -        XLogRecPtrIsInvalid(ControlFile->backupStartPoint))
> +        (XLogRecPtrIsInvalid(ControlFile->backupStartPoint) ||
> +         reachedControlMinRecoveryPoint == true))

> Is the flag 'reachedControlMinRecoveryPoint' really required? When recovery
> reaches minRecoveryPoint, ControlFile->backupStartPoint is reset to zero. So
> we can check whether recovery has reached minRecoveryPoint or not by only
> doing XLogRecPtrIsInvalid(ControlFile->backupStartPoint). No?

Yes.
'reachedControlMinRecoveryPoint' is unnecessary.


> Shouldn't we check whether recovery has reached minRecoveryPoint before calling
> CheckRecoveryConsistency() after reading a new WAL record? Otherwise,
> even if recovery has reached minRecoveryPoint, the standby cannot consider
> itself to be in a consistent state until it reads a new WAL record.

This is the same sequence as in the backup-end-location case.
It should not be changed.


> +                        if (XLByteLT(ControlFile->minRecoveryPoint, EndRecPtr))
> +                            ControlFile->minRecoveryPoint = EndRecPtr;

> Why does ControlFile->minRecoveryPoint need to be set to EndRecPtr?

Agreed.
I will delete it.

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Fujii Masao
Date:
2011/7/4 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
>
>> When the standby restarts after it crashes during recovery, it always
>> checks whether recovery has reached the backup end location by
>> using minRecoveryPoint even though the standby doesn't start from
>> the backup. This looks odd.
>
> Certainly.
>
> But in this case, the state from before recovery started is lost.
> Therefore, postgres cannot tell whether the backup was taken from the
> standby server or from the master.
>
> What should we do?
> Should we use pg_control?
>  Ex.
>   * Add 'Where to get backup' to pg_control. (default 'none')
>   * When recovery starts, check whether it is 'none'.
>      * When minRecoveryPoint equals 0/0, change it to 'master'.
>      * When minRecoveryPoint does not equal 0/0, change it to 'slave'.
>   * When recovery reaches its end, change it back to 'none'.

What about using backupStartPoint to check whether this recovery
started from the backup or not?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> What about using backupStartPoint to check whether this recovery
> started from the backup or not?

No. Postgres can check whether this recovery started from a backup,
but it cannot check whether the backup was taken from a standby server
or from the master.

Once recovery has started, backupStartPoint stays recorded in pg_control
until recovery reaches the backup end location; it carries no information
about which server the backup came from.


--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------




Re: Online base backup from the hot-standby

From
Fujii Masao
Date:
2011/7/5 Jun Ishiduka <ishizuka.jun@po.ntts.co.jp>:
>
>> What about using backupStartPoint to check whether this recovery
>> started from the backup or not?
>
> No. Postgres can check whether this recovery started from a backup,
> but it cannot check whether the backup was taken from a standby server
> or from the master.

Oh, right. We cannot distinguish the following two cases just by using
minRecoveryPoint and backupStartPoint.
   * The standby starts from the backup taken from the standby
   * The standby starts after it crashes during recovering from the
     backup taken from the master
 

As you proposed, adding new field which stores the backup end location
taken from minRecoveryPoint, into pg_control sounds good idea.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> As you proposed, adding new field which stores the backup end location
> taken from minRecoveryPoint, into pg_control sounds good idea.

Update patch.

Regards.

--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------

Attachment

Re: Online base backup from the hot-standby

From
Steve Singer
Date:
On 11-07-07 09:22 PM, Jun Ishiduka wrote:
>> As you proposed, adding new field which stores the backup end location
>> taken from minRecoveryPoint, into pg_control sounds good idea.
> Update patch.

Here is a review of the updated patch

This version of the patch adds a field into pg_controldata that tries to
store the source of the base backup while in recovery mode.
I think your ultimate goal with this patch is to be able to take a
backup of a running hot-standby slave and recover it as another
instance. This patch seems to provide the ability to have the second
slave stop recovery at minRecoveryPoint from the control file.


My understanding of the procedure you want to get to to take base
backups off a slave is

1. execute pg_start_backup('x') on the slave (*)
2. take a backup of the data dir
3. call pg_stop_backup() on the slave
4. Copy the control file on the slave

This patch only addresses the recovery portions.

* - I think your goal is to be able to run pg_start_backup() on the
slave, the patch so far doesn't do this. If your goal was require this
to be run on the master, then correct me.


Code Review
-------------------
A few comments on the code

> *** postgresql/src/include/catalog/pg_control.h   2011-06-30 10:04:48.000000000 +0900
> --- postgresql_with_patch/src/include/catalog/pg_control.h   2011-07-07 18:23:56.000000000 +0900
> ***************
> *** 142,147 ****
> --- 142,157 ----
>       XLogRecPtr    backupStartPoint;
>
>       /*
> +      * Postgres keeps where to take a backup server.
> +      *
> +      * backupserver is "none" , "master" or "slave", its default is "none".
> +      * When postgres starts and it is "none", it is updated to either "master"
> +      * or "slave" with minRecoveryPoint of the backup server.
> +      * When postgres reaches backup end location, it is updated to "none".
> +      */
> +     int            backupserver;
> +
> +     /*
>        * Parameter settings that determine if the WAL can be used for archival
>        * or hot standby.
>        */

I don't think the above comment is very clear on what backupserver is.
Perhaps

/**
 * backupserver is used while postgresql is in recovery mode to
 * store the location of where the backup comes from.
 * When Postgres starts recovery operations
 *  it is set to "none". During recovery it is updated to either "master", or "slave"
 * When recovery operations finish it is updated back to "none"
 **/

Also shouldn't backupServer be the enum type of 'BackupServer' not int?
Other enums in the structure such as DBState are defined this way.

Testing Review
----------------------

Since I can't yet call pg_start_backup or pg_stop_backup() on the slave
I am calling them on the master.
(I also did some testing where I didn't put the system into backup
mode). I admit that I am not sure what to look for as an indication that
the system isn't recovering to the correct point. In much of my testing
I was just verifying that the slave started and my data 'looked' okay.


I seem to get this warning in my logs when I start up the instance based
on the slave backup.
LOG:  00000: database system was interrupted while in recovery at log time 2011-07-08 18:40:20 EDT
HINT:  If this has occurred more than once some data might be corrupted and you might need to choose an earlier recovery target

I'm wondering if this warning is a bit misleading to users because it is
an expected message when starting up an instance based on a slave backup
(because the slave was already in recovery mode). If I shutdown this
instance and start it up again I keep getting the warning. My
understanding of your patch is that there shouldn't be any risk of
corruption in that case (assuming your patch has no bugs). Can/should we
be suppressing this message when we detect that we are recovering from a
slave backup?


The direction of the patch has changed a bit during this commit fest. I
think it would be good to provide an update on the rest of the changes
you plan for this to be a complete useable feature. That would make it
easier to comment on something you
missed versus something your planning on dealing with in the next stage.

Steve


Re: Online base backup from the hot-standby

From
Jun Ishiduka
Date:
> This version of the patch adds a field into pg_controldata that tries to
> store the source of the base backup while in recovery mode.
> I think your ultimate goal with this patch is to be able to take a
> backup of a running hot-standby slave and recover it as another
> instance. This patch seems to provide the ability to have the second
> slave stop recovery at minRecoveryPoint from the control file.
>
>
> My understanding of the procedure you want to get to to take base
> backups off a slave is
>
> 1. execute pg_start_backup('x') on the slave (*)
> 2. take a backup of the data dir
> 3. call pg_stop_backup() on the slave
> 4. Copy the control file on the slave
>
> This patch only addresses the recovery portions.

Yes.


> I don't think the above comment is very clear on what backupserver is.
> Perhaps
>
> /**
> * backupserver is used while postgresql is in recovery mode to
> * store the location of where the backup comes from.
> * When Postgres starts recovery operations
> * it is set to "none". During recovery it is updated to either "master",
> or "slave"
> * When recovery operations finish it is updated back to "none"
> **/

Done.


> Also shouldn't backupServer be the enum type of 'BackupServer' not int?
> Other enums in the structure such as DBState are defined this way.

For now, this is handled the same way as wal_level, not DBState. No?


> Since I can't yet call pg_start_backup or pg_stop_backup() on the slave
> I am calling them on the master.
> (I also did some testing where I didn't put the system into backup
> mode). I admit that I am not sure what to look for as an indication that
> the system isn't recovering to the correct point. In much of my testing
> I was just verifying that the slave started and my data 'looked' okay.

I updated the patch so that pg_start_backup()/pg_stop_backup() can be
executed on the standby server.
One pass through the above steps (1. to 4.) now works with it.
However, there are conditions.
 * The master must have full_page_writes = on.
 * On the slave, do not execute stop/promote operations before pg_stop_backup() is executed.
 * The location returned by pg_start_backup() may be later than the one returned by pg_stop_backup().


> I seem to get this warning in my logs when I start up the instance based
> on the slave backup.
> LOG: 00000: database system was interrupted while in recovery at log
> time 2011-07-08 18:40:20 EDT
> HINT: If this has occurred more than once some data might be corrupted
> and you might need to choose an earlier recovery target
>
> I'm wondering if this warning is a bit misleading to users because it is
> an expected message when starting up an instance based on a slave backup
> (because the slave was already in recovery mode). If I shutdown this
> instance and start it up again I keep getting the warning. My
> understanding of your patch is that there shouldn't be any risk of
> corruption in that case (assuming your patch has no bugs). Can/should we
> be suppressing this message when we detect that we are recovering from a
> slave backup?

This is not handled yet.
I am not sure how this message should be treated.

It always appears when the backup is taken from a slave.
What do you think about adding context to it, such as "unless the backup was taken from a slave"?


> The direction of the patch has changed a bit during this commit fest. I
> think it would be good to provide an update on the rest of the changes
> you plan for this to be a complete useable feature. That would make it
> easier to comment on something you
> missed versus something your planning on dealing with in the next stage.

I see.

In the next stage I will provide a patch that can execute
pg_start_backup()/pg_stop_backup() on the standby and that addresses the
above comments and conditions.
Please review it then.

I am changing the status of this patch to "Returned with feedback".

Regards.


--------------------------------------------
Jun Ishizuka
NTT Software Corporation
TEL:045-317-7018
E-Mail: ishizuka.jun@po.ntts.co.jp
--------------------------------------------

Attachment