Thread: Backup Method
Hi everyone, I am reaching the point with my database backups where the backups are taking too long, and starting to interfere with running of the system during the day. So I am looking for a bit of sage advice as to how to proceed. For a typical server, I have a single database cluster with multiple database that I backup nightly using sequential pg_dumps to NFS. So what I am looking for is a backup regime with less impact that will run overnight but faster or less impact than the pg_dump. Would a file system backup be the better option in PITR format, or is pg_dumpall more efficient than pg_dump? Look forward to your advice. Howard.
Le 3 juil. 2015 12:03 PM, "howardnews@selestial.com" <howardnews@selestial.com> a écrit :
>
> Hi everyone,
>
> I am reaching the point with my database backups where the backups are taking too long, and starting to interfere with running of the system during the day. So I am looking for a bit of sage advice as to how to proceed.
>
> For a typical server, I have a single database cluster with multiple database that I backup nightly using sequential pg_dumps to NFS.
>
> So what I am looking for is a backup regime with less impact that will run overnight but faster or less impact than the pg_dump.
>
> Would a file system backup be the better option in PITR format, or is pg_dumpall more efficient than pg_dump?
>
pg_dumpall is not more efficient than pg_dump. As far as I can tell, you're looking for PITR backups.
On 03/07/2015 11:15, Guillaume Lelarge wrote: > > pg_dumpall is not more efficient than pg_dump. As far as I can tell, > you're looking for PITR backups. > Thanks Guillame. In that case is there any recommendation for how often to make base backups in relation to the size of the cluster and the size of the WAL?
Le 3 juil. 2015 12:31 PM, "howardnews@selestial.com" <howardnews@selestial.com> a écrit :
>
>
>
> On 03/07/2015 11:15, Guillaume Lelarge wrote:
>>
>>
>> pg_dumpall is not more efficient than pg_dump. As far as I can tell, you're looking for PITR backups.
>>
>
> Thanks Guillame.
>
> In that case is there any recommendation for how often to make base backups in relation to the size of the cluster and the size of the WAL?
>
Nope, not really. That depends on a lot of things. Our customers usually do one per day.
--
Guillaume
On 03/07/2015 11:39, Guillaume Lelarge wrote: > > > In that case is there any recommendation for how often to make base > backups in relation to the size of the cluster and the size of the WAL? > > > > Nope, not really. That depends on a lot of things. Our customers > usually do one per day. > > -- > Guillaume > Excuse my ignorance... Is the base backup, in general, faster than pg_dump?
Am 2015-07-03 13:00, schrieb howardnews@selestial.com: > On 03/07/2015 11:39, Guillaume Lelarge wrote: >> >> > In that case is there any recommendation for how often to make >> base backups in relation to the size of the cluster and the size of >> the WAL? >> > >> >> Nope, not really. That depends on a lot of things. Our customers >> usually do one per day. >> >> > Excuse my ignorance... Is the base backup, in general, faster than > pg_dump? It is a different approach. With the base backup you are actually backing up files from the filesystem ($PGDATA directory), whereas with pg_dump your saving the SQL commands to reload and rebuild the database. "Usually" a file based backup will be faster, both on backup and restore, but it is - as mentioned - a different approach and it might also not serve all your purposes. That is why I do weekly base backups (plus WAL Archiving) and use pg_dump in a parallel way to do "logical" backups every night. Regards, Jan
On Fri, 03 Jul 2015 13:16:02 +0200 Jan Lentfer <Jan.Lentfer@web.de> wrote: > Am 2015-07-03 13:00, schrieb howardnews@selestial.com: > > On 03/07/2015 11:39, Guillaume Lelarge wrote: > >> > >> > In that case is there any recommendation for how often to make > >> base backups in relation to the size of the cluster and the size of > >> the WAL? > >> > > >> > >> Nope, not really. That depends on a lot of things. Our customers > >> usually do one per day. > >> > >> > > Excuse my ignorance... Is the base backup, in general, faster than > > pg_dump? > > It is a different approach. With the base backup you are actually > backing up files from the filesystem ($PGDATA directory), whereas with > pg_dump your saving the SQL commands to reload and rebuild the database. > "Usually" a file based backup will be faster, both on backup and > restore, but it is - as mentioned - a different approach and it might > also not serve all your purposes. One of the things that makes a lot of difference is the amount of redundant data in the database. For example, indexes are completely redundant. They sure do speed things up, but they're storing the same data 2x for each index you have. When you do a base backup, you have to copy all that redundancy, but when you do a pg_dump, all that redundant data is reduced to a single CREATE INDEX command. The result being that if your database has a lot of indexes, the pg_dump might actually be faster. But the only way to know is to try it out on your particular system. -- Bill Moran
On 03/07/2015 12:23, Bill Moran wrote: > On Fri, 03 Jul 2015 13:16:02 +0200 > Jan Lentfer <Jan.Lentfer@web.de> wrote: > >> Am 2015-07-03 13:00, schrieb howardnews@selestial.com: >>> On 03/07/2015 11:39, Guillaume Lelarge wrote: >>>>> In that case is there any recommendation for how often to make >>>> base backups in relation to the size of the cluster and the size of >>>> the WAL? >>>> Nope, not really. That depends on a lot of things. Our customers >>>> usually do one per day. >>>> >>>> >>> Excuse my ignorance... Is the base backup, in general, faster than >>> pg_dump? >> It is a different approach. With the base backup you are actually >> backing up files from the filesystem ($PGDATA directory), whereas with >> pg_dump your saving the SQL commands to reload and rebuild the database. >> "Usually" a file based backup will be faster, both on backup and >> restore, but it is - as mentioned - a different approach and it might >> also not serve all your purposes. > One of the things that makes a lot of difference is the amount of > redundant data in the database. For example, indexes are completely > redundant. They sure do speed things up, but they're storing the same > data 2x for each index you have. When you do a base backup, you have > to copy all that redundancy, but when you do a pg_dump, all that > redundant data is reduced to a single CREATE INDEX command. The > result being that if your database has a lot of indexes, the pg_dump > might actually be faster. > > But the only way to know is to try it out on your particular system. > Thanks everyone. I am trying to move away from pg_dump as it is proving too slow. The size of the database clusters are approaching 1TB (with multiple individual compressed pg_dumps of around 100GB each, but the pace of change is relatively glacial compared to the size so I am hoping that WAL backups will prove to be much more efficient. As you all point out it looks like I will need to test the various methods to find the best solution for me. A supplementary question would be: would rsync be a viable alternative to pg_basebackup when performing the file system copy. I have seen a few posts on this subject which suggest rsync is more prone to mistakes but is potentially the faster option. Love to hear all your thoughts on this subject before I risk trying rsync.
Hi everyone,
I am reaching the point with my database backups where the backups are
taking too long, and starting to interfere with running of the system
during the day. So I am looking for a bit of sage advice as to how to
proceed.
For a typical server, I have a single database cluster with multiple
database that I backup nightly using sequential pg_dumps to NFS.
So what I am looking for is a backup regime with less impact that will
run overnight but faster or less impact than the pg_dump.
Would a file system backup be the better option in PITR format, or is
pg_dumpall more efficient than pg_dump?
Look forward to your advice.
Attachment
On 07/03/2015 08:08 AM, howardnews@selestial.com wrote: > I am trying to move away from pg_dump as it is proving too slow. Have you looked into barman? http://www.pgbarman.org Also, another potential approach is to setup replication and to do the backups from the slave.
On 7/3/15 8:08 AM, howardnews@selestial.com wrote: > Thanks everyone. > > I am trying to move away from pg_dump as it is proving too slow. The > size of the database clusters are approaching 1TB (with multiple > individual compressed pg_dumps of around 100GB each, but the pace of > change is relatively glacial compared to the size so I am hoping that > WAL backups will prove to be much more efficient. As you all point out > it looks like I will need to test the various methods to find the best > solution for me. PgBackRest might be a good fit for you: https://github.com/pgmasters/backrest It supports compressed full, differential, and incremental backups and can do local backups to an NFS mount without a lot of configuration. > A supplementary question would be: would rsync be a viable alternative > to pg_basebackup when performing the file system copy. I have seen a > few posts on this subject which suggest rsync is more prone to > mistakes but is potentially the faster option. Love to hear all your > thoughts on this subject before I risk trying rsync. Rsync is potentially dangerous for an incremental (or resumed) backup. PgBackRest is not based on rsync and is safe for incrementals. It also offers destination compression which rsync cannot do, though of course you could backup to a compressed file system. More importantly you still need to deal with the WAL and PgBackRest does that for you. If you have a large database with a small rate of change then a weekly/biweekly full backup with daily incremental/differential should be very efficient. -- - David Steele david@pgmasters.net