Re: : PostgreSQL Online Backup - Mailing list pgsql-general

From Venkat Balaji
Subject Re: : PostgreSQL Online Backup
Msg-id CAFrxt0gx7q5JNVK79X611g6rdPoy0c9SFDy4xO3CMFmHyMsp6g@mail.gmail.com
In response to Re: : PostgreSQL Online Backup  ("Albe Laurenz" <laurenz.albe@wien.gv.at>)
Responses Re: : PostgreSQL Online Backup  (Alan Hodgson <ahodgson@simkin.ca>)
List pgsql-general
Another problem in recovery (probably because of "rsync") -

As mentioned earlier, we take an incremental backup of production every day using "rsync".

However, a few files somehow fail to sync in between, and Postgres keeps asking for back-dated archive files (more than a week old).

I restored the October 2nd backup, and PG is asking for a September 26th archive file, with the last known up time reported as 26th Sep, 2011.

2011-10-03 07:17:12 CDT [12705]: [1-1] LOG:  database system was interrupted; last known up at 2011-09-26 09:01:36 CDT
2011-10-03 07:17:12 CDT [12705]: [2-1] LOG:  starting archive recovery
cp: cannot stat `/usr/local/pgsql9.0.1/obtdata/data/pg_xlog/000000010000053900000076': No such file or directory
2011-10-03 07:17:12 CDT [12705]: [3-1] LOG:  could not open file "pg_xlog/000000010000053900000076" (log file 1337, segment 118): No such file or directory
2011-10-03 07:17:12 CDT [12705]: [4-1] LOG:  invalid checkpoint record
2011-10-03 07:17:12 CDT [12705]: [5-1] PANIC:  could not locate required checkpoint record
2011-10-03 07:17:12 CDT [12705]: [6-1] HINT:  If you are not restoring from a backup, try removing the file "/usr/local/pgsql9.0.1/obtdata/data/backup_label".
2011-10-03 07:17:12 CDT [12702]: [1-1] LOG:  startup process (PID 12705) was terminated by signal 6: Aborted
2011-10-03 07:17:12 CDT [12702]: [2-1] LOG:  aborting startup due to startup process failure
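
For reference, the file name in the log above can be decoded by hand. The sketch below assumes the usual 24-hex-digit WAL segment naming (8 digits each for timeline, log file, and segment); the function name is made up for illustration.

```python
def parse_wal_segment_name(name):
    """Split a 24-hex-digit WAL segment file name into its three fields.

    Assumes the layout: 8 hex digits timeline, 8 hex digits log file,
    8 hex digits segment. The function name is hypothetical.
    """
    if len(name) != 24:
        raise ValueError("expected 24 hex digits, got %r" % name)
    timeline = int(name[0:8], 16)
    log_id = int(name[8:16], 16)
    segment = int(name[16:24], 16)
    return timeline, log_id, segment


# The file the startup process could not find in the log above.
print(parse_wal_segment_name("000000010000053900000076"))
# prints (1, 1337, 118)
```

This matches the "log file 1337, segment 118" reported in the LOG line, on timeline 1.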


I always find that pg_clog files and some base files have not been synced.

Below is what we are doing -

pg_start_backup()
rsync the data directory
pg_stop_backup()
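
Roughly, the three steps above could be written out as the commands below. This is only a sketch and not our actual script: the label, the destination, and the rsync flags ("-a", "--delete") are assumptions, and the helper name is made up. It only builds and prints the commands; it does not run them.

```python
import shlex


def backup_commands(data_dir, dest, label="daily"):
    """Return the three backup steps as argv lists, in the order they run.

    Hypothetical helper: the label, destination and rsync flags are
    assumptions. The label is interpolated as-is, so it must not
    contain quotes.
    """
    start = ["psql", "-At", "-c", "SELECT pg_start_backup('%s')" % label]
    # Trailing slash on the source makes rsync copy the directory contents.
    copy = ["rsync", "-a", "--delete", data_dir.rstrip("/") + "/", dest]
    stop = ["psql", "-At", "-c", "SELECT pg_stop_backup()"]
    return [start, copy, stop]


# The destination below is a placeholder, not a real host.
for argv in backup_commands("/usr/local/pgsql9.0.1/obtdata/data",
                            "backuphost:/backups/data"):
    print(" ".join(shlex.quote(arg) for arg in argv))
```

A real script would presumably also need to make sure pg_stop_backup() is called even when the copy step fails.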

The first "rsync" run is fine, but subsequent runs produce inconsistencies.

We do the same every day to back up the data directory incrementally.

What I observed is that PG records the transaction ID whenever a backup starts and stops, along with the backup label. When PG records the start time and transaction ID for the next day's backup, I think some of the transaction IDs and pg_clog files generated between the last stop time and the next start time are missed.

Has anyone observed this behavior? Please help!

This is critical for us. I am inclined to recommend not using "rsync" (and using cp or scp instead) for production backups.

Thanks
VB

On Tue, Sep 27, 2011 at 2:36 PM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
Venkat Balaji wrote:
> Our problem is -
>
> We had mistakenly executed "rsync" on the running PostgreSQL data
> directory (production) and we did not run "pg_start_backup()".
>
> Will this harm production? Can this lead to corruption?

I assume that you used rsync to copy *from* the data directory.

This cannot lead to data corruption.
Only performance might suffer temporarily due to the additional I/O.

The backup made with rsync will be unusable without pg_start_backup().

Yours,
Laurenz Albe
