Thread: on-line backup questions

on-line backup questions

From
Tom Davies
Date:
I've been working on backup restore scripts for a PostgreSQL DB hosted
on an Amazon EC2 instance. EC2 instances don't have any persistent
storage, so I archive the WAL files to Amazon S3. I do periodic
backups, and restore and roll forward when an instance is restarted.

I've followed the excellent documentation at http://www.postgresql.org/docs/8.0/interactive/backup-online.html
  , and everything is working well, but I have a few further questions.

1. Purely out of curiousity, what's the nnnnnnnn.history file which is
requested from the archive when you restore? I realise that the
restore process looking for files which don't exist is not a problem,
I'm just being inquisitive.

2. When I do a backup, Postgres requests archiving of n.m.backup and
the WAL file n immediately, so I only copy WAL files with numbers > n
from pg_xlog. I've seen suggestions that I should be copying WAL files
with numbers >= n -- is there any reason to do so?

3. What's the best thing to do when I deliberately shut down
PostgreSQL (i.e. pg_ctl stop)? When I start again I will be restoring
from the most recent backup and rolling forward over the archived WAL
files. I believe that shutdown leaves me with unarchived WAL files in
pg_xlog. Can I simply copy these to S3 as if they were WAL files which
were copied from pg_xlog at the time of the last backup, and then put
them back into pg_xlog as part of my restore process?

4. I'm using PostgreSQL 8.0 -- are there any significant improvements
in on-line backups in later versions?

Thanks,
   Tom

Re: on-line backup questions

From
Tom Lane
Date:
Tom Davies <tgdavies@gmail.com> writes:
> 1. Purely out of curiousity, what's the nnnnnnnn.history file which is
> requested from the archive when you restore?

Read the doc section about timelines --- if a history file exists,
it's needed to allow proper tracing of the "timeline" through multiple
recovery attempts.

> 3. What's the best thing to do when I deliberately shut down
> PostgreSQL (i.e. pg_ctl stop)? When I start again I will be restoring
> from the most recent backup and rolling forward over the archived WAL
> files. I believe that shutdown leaves me with unarchived WAL files in
> pg_xlog.

Yeah, you should archive the latest WAL file, but in 8.0 you'd have to
do that manually.  (IIRC there isn't even a forced-xlog-switch function
in that version to help you.)

> 4. I'm using PostgreSQL 8.0 -- are there any significant improvements
> in on-line backups in later versions?

Get thyself onto 8.2 ASAP, or maybe go to 8.3 shortly after the
holidays.  8.0 is basically our stone age for PITR support; while
the concepts haven't changed since then, we've filed off a whole lot
of rough edges in operational details.  In a situation where you're
depending on archive recovery as much as this, you *need* those fixes.

            regards, tom lane

Re: on-line backup questions

From
Tom Davies
Date:
Thanks for the prompt response!

On 22/12/2007, at 1:33 PM, Tom Lane wrote:

> Tom Davies <tgdavies@gmail.com> writes:
>>
>> 3. What's the best thing to do when I deliberately shut down
>> PostgreSQL (i.e. pg_ctl stop)? When I start again I will be restoring
>> from the most recent backup and rolling forward over the archived WAL
>> files. I believe that shutdown leaves me with unarchived WAL files in
>> pg_xlog.
>
> Yeah, you should archive the latest WAL file, but in 8.0 you'd have to
> do that manually.  (IIRC there isn't even a forced-xlog-switch
> function
> in that version to help you.)

So in 8.2, when I do a backup I *don't* need to manually copy any WAL
files?, and when I shut down I should:

1 call pg_switch_xlog()
2 wait for the WAL to be archived *if* pg_switch_xlog return a
location after the end of the previously archived WAL
3 actually call pg_ctl stop

Do I need to do 2, above, or can postgres wait until the previous WAL
archive is complete?

Thanks,
   Tom