Re: Use of rsync for data directory copying - Mailing list pgsql-hackers

From Bruce Momjian
Subject Re: Use of rsync for data directory copying
Date
Msg-id 20120715025722.GA3215@momjian.us
In response to Re: Use of rsync for data directory copying  (Stephen Frost <sfrost@snowman.net>)
Responses Re: Use of rsync for data directory copying  (Stephen Frost <sfrost@snowman.net>)
Re: Use of rsync for data directory copying  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
List pgsql-hackers
On Sat, Jul 14, 2012 at 09:17:22PM -0400, Stephen Frost wrote:
> Bruce,
> 
> * Bruce Momjian (bruce@momjian.us) wrote:
> > If two writes happen in the middle of a file in the same second, it
> > seems one might be missed.  Yes, I suppose the WAL does fix that during
> > replay, though if both servers were shut down cleanly, WAL would not be
> > replayed.
> > 
> > If you are using it for a hot backup, WAL would clean that up.
> 
> Right...  If it's hot backup, then WAL will fix it; if it's done after a
> clean shut-down, nothing should be writing to those files (much less
> multiple writes in the same second), so checksum shouldn't be
> necessary...
> 
> If you're doing rsync w/o doing pg_start_backup/pg_stop_backup, that's
> not likely to work even *with* --checksum..
> 
> So, can you explain which case you're specifically worried about?

OK.  The basic problem is that I previously was not clear about how
reliant our use of rsync (without --checksum) was on the presence of WAL
replay.

Here is an example from our documentation that doesn't have WAL replay:

	http://www.postgresql.org/docs/9.2/static/backup-file.html

	Another option is to use rsync to perform a file system backup.
	This is done by first running rsync while the database server is
	running, then shutting down the database server just long enough
	to do a second rsync.  The second rsync will be much quicker than
	the first, because it has relatively little data to transfer, and
	the end result will be consistent because the server was down.
	This method allows a file system backup to be performed with
	minimal downtime.
 

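Now, if a write happens in both the first and second half of a second,
and only the first write is seen by the first rsync, I don't think the
second rsync will see the second write:  the file size is unchanged and
the modification time still falls in the same second, so without
--checksum the file looks unmodified.  Hence the backup will be
inconsistent.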
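
The scenario above can be sketched as a small simulation.  This is only
an illustration, not rsync itself:  it assumes that rsync's default
quick check compares just file size and modification time, and that
mtime has one-second granularity.  The helper names
(quick_check_differs, checksum_differs, demo) are hypothetical, the
timestamp is arbitrary, and SHA-256 merely stands in for whatever
checksum rsync computes with --checksum.

```python
import hashlib
import os
import shutil
import tempfile


def quick_check_differs(src, dst):
    """Approximate rsync's default quick check: size and mtime only.

    Assumption: timestamps are truncated to whole seconds, as on a
    file system with one-second mtime granularity.
    """
    s, d = os.stat(src), os.stat(dst)
    return s.st_size != d.st_size or int(s.st_mtime) != int(d.st_mtime)


def checksum_differs(src, dst):
    """Compare file contents, roughly what --checksum forces."""
    def digest(path):
        with open(path, "rb") as f:
            return hashlib.sha256(f.read()).hexdigest()
    return digest(src) != digest(dst)


def demo():
    """Return (quick_check_result, checksum_result) for the scenario."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        dst = os.path.join(tmp, "dst")
        t = 1342321042  # arbitrary whole-second timestamp

        # First write, in the first half of the second.
        with open(src, "wb") as f:
            f.write(b"A" * 8192)
        os.utime(src, (t, t))

        # First rsync pass: copy the contents and preserve mtime.
        shutil.copyfile(src, dst)
        os.utime(dst, (t, t))

        # Second write, same second, mid-file, size unchanged.
        with open(src, "r+b") as f:
            f.seek(4096)
            f.write(b"B" * 16)
        os.utime(src, (t, t))  # mtime still reads as the same second

        return quick_check_differs(src, dst), checksum_differs(src, dst)
```

Calling demo() returns (False, True):  the size-and-mtime comparison
reports no difference even though the contents differ, so a second pass
relying on it would skip the file.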

-- 
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +

