Thread: Backup Method

Backup Method

From
"howardnews@selestial.com"
Date:
Hi everyone,

I am reaching the point with my database backups where the backups are
taking too long, and starting to interfere with running of the system
during the day. So I am looking for a bit of sage advice as to how to
proceed.

For a typical server, I have a single database cluster with multiple
database that I backup nightly using sequential pg_dumps to NFS.

So what I am looking for is a backup regime with less impact that will
run overnight but faster or less impact than the pg_dump.

Would a file system backup be the better option in PITR format, or is
pg_dumpall more efficient than pg_dump?

Look forward to your advice.

Howard.


Re: Backup Method

From
Guillaume Lelarge
Date:

Le 3 juil. 2015 12:03 PM, "howardnews@selestial.com" <howardnews@selestial.com> a écrit :
>
> Hi everyone,
>
> I am reaching the point with my database backups where the backups are taking too long, and starting to interfere with running of the system during the day. So I am looking for a bit of sage advice as to how to proceed.
>
> For a typical server, I have a single database cluster with multiple database that I backup nightly using sequential pg_dumps to NFS.
>
> So what I am looking for is a backup regime with less impact that will run overnight but faster or less impact than the pg_dump.
>
> Would a file system backup be the better option in PITR format, or is pg_dumpall more efficient than pg_dump?
>

pg_dumpall is not more efficient than pg_dump. As far as I can tell, you're looking for PITR backups.

Re: Backup Method

From
"howardnews@selestial.com"
Date:

On 03/07/2015 11:15, Guillaume Lelarge wrote:
>
> pg_dumpall is not more efficient than pg_dump. As far as I can tell,
> you're looking for PITR backups.
>

Thanks Guillame.

In that case is there any recommendation for how often to make base
backups in relation to the size of the cluster and the size of the WAL?


Re: Backup Method

From
Guillaume Lelarge
Date:

Le 3 juil. 2015 12:31 PM, "howardnews@selestial.com" <howardnews@selestial.com> a écrit :
>
>
>
> On 03/07/2015 11:15, Guillaume Lelarge wrote:
>>
>>
>> pg_dumpall is not more efficient than pg_dump. As far as I can tell, you're looking for PITR backups.
>>
>
> Thanks Guillame.
>
> In that case is there any recommendation for how often to make base backups in relation to the size of the cluster and the size of the WAL?
>

Nope, not really. That depends on a lot of things. Our customers usually do one per day.

--
Guillaume

Re: Backup Method

From
"howardnews@selestial.com"
Date:

On 03/07/2015 11:39, Guillaume Lelarge wrote:
>
> > In that case is there any recommendation for how often to make base
> backups in relation to the size of the cluster and the size of the WAL?
> >
>
> Nope, not really. That depends on a lot of things. Our customers
> usually do one per day.
>
> --
> Guillaume
>
Excuse my ignorance... Is the base backup, in general, faster than pg_dump?



Re: Backup Method

From
Jan Lentfer
Date:
Am 2015-07-03 13:00, schrieb howardnews@selestial.com:
> On 03/07/2015 11:39, Guillaume Lelarge wrote:
>>
>> > In that case is there any recommendation for how often to make
>> base backups in relation to the size of the cluster and the size of
>> the WAL?
>> >
>>
>> Nope, not really. That depends on a lot of things. Our customers
>> usually do one per day.
>>
>>
> Excuse my ignorance... Is the base backup, in general, faster than
> pg_dump?

It is a different approach. With the base backup you are actually
backing up files from the filesystem ($PGDATA directory), whereas with
pg_dump your saving the SQL commands to reload and rebuild the database.
"Usually" a file based backup will be faster, both on backup and
restore, but it is - as mentioned - a different approach and it might
also not serve all your purposes.

That is why I do weekly base backups (plus WAL Archiving) and use
pg_dump in a parallel way to do "logical" backups every night.

Regards,

Jan


Re: Backup Method

From
Bill Moran
Date:
On Fri, 03 Jul 2015 13:16:02 +0200
Jan Lentfer <Jan.Lentfer@web.de> wrote:

> Am 2015-07-03 13:00, schrieb howardnews@selestial.com:
> > On 03/07/2015 11:39, Guillaume Lelarge wrote:
> >>
> >> > In that case is there any recommendation for how often to make
> >> base backups in relation to the size of the cluster and the size of
> >> the WAL?
> >> >
> >>
> >> Nope, not really. That depends on a lot of things. Our customers
> >> usually do one per day.
> >>
> >>
> > Excuse my ignorance... Is the base backup, in general, faster than
> > pg_dump?
>
> It is a different approach. With the base backup you are actually
> backing up files from the filesystem ($PGDATA directory), whereas with
> pg_dump your saving the SQL commands to reload and rebuild the database.
> "Usually" a file based backup will be faster, both on backup and
> restore, but it is - as mentioned - a different approach and it might
> also not serve all your purposes.

One of the things that makes a lot of difference is the amount of
redundant data in the database. For example, indexes are completely
redundant. They sure do speed things up, but they're storing the same
data 2x for each index you have. When you do a base backup, you have
to copy all that redundancy, but when you do a pg_dump, all that
redundant data is reduced to a single CREATE INDEX command. The
result being that if your database has a lot of indexes, the pg_dump
might actually be faster.

But the only way to know is to try it out on your particular system.

--
Bill Moran


Re: Backup Method

From
"howardnews@selestial.com"
Date:

On 03/07/2015 12:23, Bill Moran wrote:
> On Fri, 03 Jul 2015 13:16:02 +0200
> Jan Lentfer <Jan.Lentfer@web.de> wrote:
>
>> Am 2015-07-03 13:00, schrieb howardnews@selestial.com:
>>> On 03/07/2015 11:39, Guillaume Lelarge wrote:
>>>>> In that case is there any recommendation for how often to make
>>>> base backups in relation to the size of the cluster and the size of
>>>> the WAL?
>>>> Nope, not really. That depends on a lot of things. Our customers
>>>> usually do one per day.
>>>>
>>>>
>>> Excuse my ignorance... Is the base backup, in general, faster than
>>> pg_dump?
>> It is a different approach. With the base backup you are actually
>> backing up files from the filesystem ($PGDATA directory), whereas with
>> pg_dump your saving the SQL commands to reload and rebuild the database.
>> "Usually" a file based backup will be faster, both on backup and
>> restore, but it is - as mentioned - a different approach and it might
>> also not serve all your purposes.
> One of the things that makes a lot of difference is the amount of
> redundant data in the database. For example, indexes are completely
> redundant. They sure do speed things up, but they're storing the same
> data 2x for each index you have. When you do a base backup, you have
> to copy all that redundancy, but when you do a pg_dump, all that
> redundant data is reduced to a single CREATE INDEX command. The
> result being that if your database has a lot of indexes, the pg_dump
> might actually be faster.
>
> But the only way to know is to try it out on your particular system.
>

Thanks everyone.

I am trying to move away from pg_dump as it is proving too slow. The
size of the database clusters are approaching 1TB (with multiple
individual compressed pg_dumps of around 100GB each, but the pace of
change is relatively glacial compared to the size so I am hoping that
WAL backups will prove to be much more efficient. As you all point out
it looks like I will need to test the various methods to find the best
solution for me.

A supplementary question would be: would rsync be a viable alternative
to pg_basebackup when performing the file system copy. I have seen a few
posts on this subject which suggest rsync is more prone to mistakes but
is potentially the faster option. Love to hear all your thoughts on this
subject before I risk trying rsync.


Re: Backup Method

From
Andreas Joseph Krogh
Date:
På fredag 03. juli 2015 kl. 11:59:49, skrev howardnews@selestial.com <howardnews@selestial.com>:
Hi everyone,

I am reaching the point with my database backups where the backups are
taking too long, and starting to interfere with running of the system
during the day. So I am looking for a bit of sage advice as to how to
proceed.

For a typical server, I have a single database cluster with multiple
database that I backup nightly using sequential pg_dumps to NFS.

So what I am looking for is a backup regime with less impact that will
run overnight but faster or less impact than the pg_dump.

Would a file system backup be the better option in PITR format, or is
pg_dumpall more efficient than pg_dump?

Look forward to your advice.
 
Use Barman http://www.pgbarman.org/
 
We're continuously backing up a 1.5TB cluster without any problems. Full PITR with a window of 1 week.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 
Attachment

Re: Backup Method

From
Francisco Reyes
Date:
On 07/03/2015 08:08 AM, howardnews@selestial.com wrote:
> I am trying to move away from pg_dump as it is proving too slow.


Have you looked into barman?
http://www.pgbarman.org

Also, another potential approach is to setup replication and to do the
backups from the slave.


Re: Backup Method

From
David Steele
Date:
On 7/3/15 8:08 AM, howardnews@selestial.com wrote:
> Thanks everyone.
>
> I am trying to move away from pg_dump as it is proving too slow. The
> size of the database clusters are approaching 1TB (with multiple
> individual compressed pg_dumps of around 100GB each, but the pace of
> change is relatively glacial compared to the size so I am hoping that
> WAL backups will prove to be much more efficient. As you all point out
> it looks like I will need to test the various methods to find the best
> solution for me.

PgBackRest might be a good fit for you:
https://github.com/pgmasters/backrest

It supports compressed full, differential, and incremental backups and
can do local backups to an NFS mount without a lot of configuration.

> A supplementary question would be: would rsync be a viable alternative
> to pg_basebackup when performing the file system copy. I have seen a
> few posts on this subject which suggest rsync is more prone to
> mistakes but  is potentially the faster option. Love to hear all your
> thoughts on this subject before I risk trying rsync.

Rsync is potentially dangerous for an incremental (or resumed) backup.
PgBackRest is not based on rsync and is safe for incrementals.  It also
offers destination compression which rsync cannot do, though of course
you could backup to a compressed file system.

More importantly you still need to deal with the WAL and PgBackRest does
that for you.  If you have a large database with a small rate of change
then a weekly/biweekly full backup with daily incremental/differential
should be very efficient.

--
- David Steele
david@pgmasters.net



Attachment