Re: Seeking datacenter PITR backup suggestions - Mailing list pgsql-general

From Decibel!
Subject Re: Seeking datacenter PITR backup suggestions
Date
Msg-id BE545B2A-20E3-4A4E-A00C-CF10A0B2C2F3@decibel.org
Whole thread Raw
In response to Seeking datacenter PITR backup suggestions  ("Joey K." <pguser@gmail.com>)
Responses Re: Seeking datacenter PITR backup suggestions  (Steve Crawford <scrawford@pinpointresearch.com>)
List pgsql-general
On Aug 17, 2007, at 5:48 PM, Joey K. wrote:
> We have several web applications with Pg 8.2.x running on isolated
> servers (~25). The database size on each machines (du -h pgdata) is
> ~2 GB. We have been using nightly filesystem backup (stop pg, tar
> backup to ftp, start pg) and it worked well.
>
> We would like to move to PITR backups since the database size will
> increase moving forward and our current backup method might
> increase server downtimes.
>
> We have a central ftp backup server (yes, ftp :-) which we would
> like to use for weekly full and daily incremental PITR backups.
>
> After reading the docs, PITR is still fuzzy. Our ideas for backup
> are (do not worry about the syntax),
>
> ** START **
>
> tmpwal = "/localhost/tmp"   # tmp space on server 1 for storing wal
> files before ftp
> Configure $pgdata/postgresql.conf archive_command = "cp %p $tmpwal/%f"

Why not just FTP WAL files directly?

> Day 1:
> % psql pg_start_backup(); tar pgdata.tar --exclude pg_xlog/ pgdata
> % psql pg_stop_backup()
> % ftp put pgdata.tar ftpserver:/server1/day1/pgdata
> % ftp put $tmpwal/* ftpserver:/server1/day1/wal
> % rm -f $tmpwal/* pgdata.tar

The last 2 are a race condition... you could easily lose a WAL file
that way.

Keep in mind that that pgdata.tar is 100% useless unless you also
have the WAL files that were created during the backup. I generally
recommend to folks that they keep two base copies around for that
reason.

> Day 2:
> % ftp put $tmpwal/* ftpserver:/server1/day2/wal
> % rm -f $tmpwal/*
>
> Day 3:
> ...
> ...
>
> Day 7:
> % rm -f $tmpwal/*
> Start over
>
> Recovery on server1 (skeleton commands),
> % rm -f $tmpwal/*
> % mv pgdata pgdata.hosed
> % ftp get ftpbackup:/server1/day1/pgdata.tar  .
> % tar -xvf pgdata.tar
> % ftp get ftpbackup:/server1/day1/wal/*  $tmpwal
> % ftp get ftpbackup:/server1/day2/wal/*  $tmpwal
> .....
> .....
> % cp -r pgdata.hosed/pg_xlog pgdata/
> % echo "cp $tmpwal/%f %p" > pgdata/recovery.conf
> % start pg (recovery begins)
>
> ** END **
>
> Assumptions:
> a. After pg_stop_backup(), Pg immediately recycles log files and
> hence wal logs can be copied to backup. This is a clean start.
> b. New wal files since (a) are incremental backups
>
> We are not sure if WAL log filenames are unique and possibly
> overwrite older wal files during recovery.
>
> I'm seeking suggestions from others with experience performing
> PostgreSQL PITR backups from multiple servers to a central backup
> server.

In general, your handling of WAL files seems fragile and error-prone.
I think it would make far more sense to just FTP them directly, and
not try and get fancy with different directories for different days.
*when* a WAL file was generated is meaningless until you compare it
to a base backup to see if that WAL file is required for the base
backup, useful (but not required) to the base backup, or useless for
the base backup.
--
Decibel!, aka Jim Nasby                        decibel@decibel.org
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)



pgsql-general by date:

Previous
From: Decibel!
Date:
Subject: Re: Searching for Duplicates and Hosed the System
Next
From: Decibel!
Date:
Subject: Re: Seeking datacenter PITR backup procedures [RESENDING]