Thread: The OS Command for pg_hotbackup -- Use lvmsnapshot instead of tar cvzf

The OS Command for pg_hotbackup -- Use lvmsnapshot instead of tar cvzf

From
"Arnold, Sandra"
Date:
I am currently using the pg_hotbackup Perl script to backup my production PostgreSQL database.  However, the next production database is going to be close to a terrabyte in size.  We feel that a tar cvzf command is not the correct OS backup solution to use.  Instead we would like to use lvmsnapshot.  Is anyone using the pg_hotbackup script but instead of the tar cvzf OS command are they using lvmsnapshot?
 
Or, is anyone using lvmsnapshot and doing point-in-time recoveries?
 
Thanks,

Sandra Arnold
Sr. Database Administrator
DOE/OSTI
Oak Ridge, TN
Arnold, Sandra wrote:
> I am currently using the pg_hotbackup Perl script to backup my
> production PostgreSQL database.  However, the next production database
> is going to be close to a terrabyte in size.  We feel that a tar cvzf
> command is not the correct OS backup solution to use.  Instead we
> would like to use lvmsnapshot.  Is anyone using the pg_hotbackup
> script but instead of the tar cvzf OS command are they using lvmsnapshot?
I believe that if you take an LVM snapshot and the change size exceeds
the snapshot size (a configurable value) then the snapshot gets
released, so you might want to be careful with this solution (I don't
know how much data you accumulate, or how fast you accumulate it, or if
that rate changes, etc.).  You'll also need to ensure that you snapshot
any and all tablespaces (if you use pg_start_backup() first the
snapshots need not be simultaneous, but they still must be done).

You might consider using rsync instead, it should be a lot faster than
tar, and doesn't have the same downside as using snapshots.  You'll also
find that it is fairly widely used in the PostgreSQL community.

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/877-258-8987
http://www.otg-nc.com
Expert PostgreSQL, Python, PHP, and other open source training.


On Tue, Mar 2, 2010 at 8:43 AM, Arnold, Sandra <ArnoldS@osti.gov> wrote:
> I am currently using the pg_hotbackup Perl script to backup my production
> PostgreSQL database.  However, the next production database is going to be
> close to a terrabyte in size.  We feel that a tar cvzf command is not the
> correct OS backup solution to use.  Instead we would like to use
> lvmsnapshot.  Is anyone using the pg_hotbackup script but instead of the tar
> cvzf OS command are they using lvmsnapshot?
>
> Or, is anyone using lvmsnapshot and doing point-in-time recoveries?

I do not run my databases on LVM because, at least in the past, LVM
did not properly honor fsync / write barrier commands.  I don't know
if this has since  been fixed.   If your transaction rate on LVM is
unnaturally higher than it should / could be, then suspect it is not
fsyncing, and putting your data on it is putting it at risk.

FYI, here's the conversations on the subject of LVM and write barriers
from last year about this time:

http://www.redhat.com/archives/linux-lvm/2009-March/msg00025.html
http://archives.postgresql.org/pgsql-general/2009-03/msg00204.php

Note that it's also considered bad form to post the same question to
multiple lists.

On Tue, Mar 2, 2010 at 12:20 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
> On Tue, Mar 2, 2010 at 8:43 AM, Arnold, Sandra <ArnoldS@osti.gov> wrote:
>> I am currently using the pg_hotbackup Perl script to backup my production
>> PostgreSQL database.  However, the next production database is going to be
>> close to a terrabyte in size.  We feel that a tar cvzf command is not the
>> correct OS backup solution to use.  Instead we would like to use
>> lvmsnapshot.  Is anyone using the pg_hotbackup script but instead of the tar
>> cvzf OS command are they using lvmsnapshot?
>>
>> Or, is anyone using lvmsnapshot and doing point-in-time recoveries?
>
> I do not run my databases on LVM because, at least in the past, LVM
> did not properly honor fsync / write barrier commands.  I don't know
> if this has since  been fixed.   If your transaction rate on LVM is
> unnaturally higher than it should / could be, then suspect it is not
> fsyncing, and putting your data on it is putting it at risk.

From what I've read further on, it seems it's ok as long as it's on a
storage system like a battery backed RAID controller.  Still, someone
on the list did some testing and LVM resulted in a reduced throughput
when on large fast RAID arrays, so do some testing before you set off
into production with LVM.