Thread: [PERFORM] Backup taking long time !!!
Hi Expert,
I have a database of around 1350 GB, created in PostgreSQL 9.1 on a Linux platform.
I am using pg_dump to take the backup, which takes around 12 hours to complete.
Could you please suggest how I can make the backup faster so that it completes in fewer hours?
Thanks in advance.
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.
------------------------------------------------------------------
Mobile: +91-9953975849 | Ext 1078 |dinesh.chandra@cyient.com
Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India.
DISCLAIMER:
This email message is for the sole use of the intended recipient(s) and may contain confidential and privileged information. Any unauthorized review, use, disclosure or distribution is prohibited. If you are not the intended recipient, please contact the sender by reply email and destroy all copies of the original message. Check all attachments for viruses before opening them. All views or opinions presented in this e-mail are those of the author and may not reflect the opinion of Cyient or those of our affiliates.
Exactly, the parallel option is there in version 9.3, but I can't upgrade to a new version due to some concerns.
Could you suggest how I may fix it in 9.1?
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.
------------------------------------------------------------------
Mobile: +91-9953975849 | Ext 1078 |dinesh.chandra@cyient.com
Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India.
From: Madusudanan.B.N [mailto:b.n.madusudanan@gmail.com]
Sent: 20 January, 2017 5:04 PM
To: Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com>
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] Backup taking long time !!!
If you can upgrade to a newer version, there is parallel pg_dump.
Documentation - https://www.postgresql.org/docs/current/static/backup-dump.html
This can give a significant speed-up depending on your machine's I/O capabilities.
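(For illustration only: on 9.3 and later, a parallel dump requires the directory output format. The database name, job count, and paths below are placeholders.)

    # -Fd selects the directory output format; -j runs that many parallel
    # dump jobs, so the table data is written by several workers at once.
    pg_dump -Fd -j 8 -f /backup/mydb.dump mydb

    # A directory-format dump can also be restored in parallel:
    pg_restore -j 8 -d mydb /backup/mydb.dump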
--
Regards,
Madusudanan.B.N
Hi

2017-01-20 12:43 GMT+01:00 Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com>:
> Exactly, the parallel option is there in version 9.3, but I can't upgrade
> to a new version due to some concerns.
> Could you suggest how I may fix it in 9.1?

1. Don't use it - you can use a physical full backup with exported transaction segments.

or

2. Buy faster IO.

Regards

Pavel Stehule
Dear Pavel,
Thanks for the quick response.
May I know how I can use a physical full backup with exported transaction segments?
Regards,
Dinesh Chandra
|Database administrator (Oracle/PostgreSQL)| Cyient Ltd. Noida.
------------------------------------------------------------------
Mobile: +91-9953975849 | Ext 1078 |dinesh.chandra@cyient.com
Plot No. 7, NSEZ, Phase-II ,Noida-Dadri Road, Noida - 201 305,India.
* Pavel Stehule (pavel.stehule@gmail.com) wrote:
> 2017-01-20 12:53 GMT+01:00 Dinesh Chandra 12108 <Dinesh.Chandra@cyient.com>:
> > Thanks for quick response.
> >
> > May I know how can I use physical full backup with export transaction
> > segments.
>
> https://www.postgresql.org/docs/9.1/static/continuous-archiving.html
>
> This process can be automatized by some applications like barman
> http://www.pgbarman.org/

Last I checked, barman is still single-threaded.

If the database is large enough that you need multi-process backup, I'd
suggest looking at pgbackrest- http://www.pgbackrest.org.

pgbackrest has parallel backup, incremental/differential/full backup
support, supports compression, CRC checking, and a whole ton of other
good stuff.

Thanks!

Stephen
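(A minimal sketch of the continuous-archiving setup described on that 9.1 documentation page; the WAL archive path and backup destination are placeholders, and the restore side needs a matching restore_command.)

    # postgresql.conf (9.1-era settings):
    #   wal_level = archive
    #   archive_mode = on
    #   archive_command = 'test ! -f /walarchive/%f && cp %p /walarchive/%f'

    # Take the base backup while the server keeps running; the WAL archived
    # between start and stop is what makes the copy consistent.
    psql -c "SELECT pg_start_backup('nightly', true);"
    rsync -a --exclude pg_xlog/ "$PGDATA"/ /backup/base/
    psql -c "SELECT pg_stop_backup();"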
Vladimir,

* Vladimir Borodin (root@simply.name) wrote:
> > 20 янв. 2017 г., в 15:22, Stephen Frost <sfrost@snowman.net> написал(а):
> >> This process can be automatized by some applications like barman
> >> http://www.pgbarman.org/
> >
> > Last I checked, barman is still single-threaded.
> >
> > If the database is large enough that you need multi-process backup, I'd
> > suggest looking at pgbackrest- http://www.pgbackrest.org.
> >
> > pgbackrest has parallel backup, incremental/differential/full backup
> > support, supports compression, CRC checking, and a whole ton of other
> > good stuff.
>
> Increments in pgbackrest are done on file level which is not really
> efficient. We have done parallelism, compression and page-level
> increments (9.3+) in barman fork [1], but unfortunately guys from
> 2ndquadrant-it don’t hurry to work on it.

We're looking at page-level incremental backup in pgbackrest also. For
larger systems, we've not heard too much complaining about it being
file-based though, which is why it hasn't been a priority. Of course,
the OP is on 9.1 too, so.

As for your fork, well, I can't say I really blame the barman folks for
being cautious- that's usually a good thing in your backup software. :)

I'm curious how you're handling compressed page-level incremental
backups though. I looked through barman-incr and it wasn't obvious to
me what was going on wrt how the incrementals are stored, are they
ending up as sparse files, or are you actually copying/overwriting the
prior file in the backup repository? Apologies, python isn't my first
language, but the lack of any comment anywhere in that file doesn't
really help.

> And actually it would be much better to do a good backup and recovery
> manager part of the core postgres.

Sure, but that's not going to happen for 9.1, or even 9.6, and I doubt
PG10 is going to suddenly get parallel base-backup with compression.
I've been discussing ways to improve the situation with Magnus and we
do have some ideas about it, but that's really an independent effort as
we're still going to need a tool for released versions of PG.

Thanks!

Stephen
Vladimir,

* Vladimir Borodin (root@simply.name) wrote:
> > 20 янв. 2017 г., в 16:40, Stephen Frost <sfrost@snowman.net> написал(а):
> >> Increments in pgbackrest are done on file level which is not really
> >> efficient. We have done parallelism, compression and page-level
> >> increments (9.3+) in barman fork [1], but unfortunately guys from
> >> 2ndquadrant-it don’t hurry to work on it.
> >
> > We're looking at page-level incremental backup in pgbackrest also. For
> > larger systems, we've not heard too much complaining about it being
> > file-based though, which is why it hasn't been a priority. Of course,
> > the OP is on 9.1 too, so.
>
> Well, we have forked barman and made everything from the above just
> because we needed ~ 2 PB of disk space for storing backups for our ~ 300
> TB of data. (Our recovery window is 7 days) And on 5 TB database it took
> a lot of time to make/restore a backup.

Right, without incremental or compressed backups, you'd have to have
room for 7 full copies of your database. Have you looked at what your
incrementals would be like with file-level incrementals and compression?

Single-process backup/restore is definitely going to be slow. We've
seen pgbackrest doing as much as 3TB/hr with 32 cores handling
compression. Of course, your i/o, network, et al, need to be able to
handle it.

> > As for your fork, well, I can't say I really blame the barman folks for
> > being cautious- that's usually a good thing in your backup software. :)
>
> The reason seems to be not the caution but the lack of time for working
> on it. But yep, it took us half a year to deploy our fork everywhere.
> And it would take much more time if we didn’t have system for checking
> backups consistency.

How are you testing your backups..? Do you have page-level checksums
enabled on your database?

pgbackrest recently added the ability to check PG page-level checksums
during a backup and report issues. We've also been looking at how to
use pgbackrest to do backup/restore+replay page-level difference
analysis but there's still a number of things which can cause
differences, so it's a bit difficult to do. Of course, doing a
pgbackrest-restore-replay+pg_dump+pg_restore is pretty easy to do and
we do use that in some places to validate backups.

> > I'm curious how you're handling compressed page-level incremental
> > backups though. I looked through barman-incr and it wasn't obvious to
> > me what was going on wrt how the incrementals are stored, are they
> > ending up as sparse files, or are you actually copying/overwriting the
> > prior file in the backup repository?
>
> No, we do store each file in the following way. At the beginning you
> write a map of changed pages. At second you write changed pages
> themselves. The compression is streaming so you don’t need much memory
> for that but the downside of this approach is that you read each
> datafile twice (we believe in page cache here).

Ah, yes, I noticed that you passed over the file twice but wasn't quite
sure what functools.partial() was doing and a quick read of the docs
made me think you were doing seeking there.

All the pages are the same size, so I'm surprised you didn't consider
just having a format along the lines of: magic+offset+page,
magic+offset+page, magic+offset+page, etc... I'd have to defer to David
on this, but I think he was considering having some kind of a bitmap to
indicate which pages changed instead of storing the full offset as,
again, all the pages are the same size.

> > Apologies, python isn't my first
> > language, but the lack of any comment anywhere in that file doesn't
> > really help.
>
> Not a problem. Actually, it would be much easier to understand if it
> was a series of commits rather than one commit that we do amend and
> force-push after each rebase on vanilla barman. We should add comments.

Both would make it easier to understand, though the comments would be
more helpful for me as I don't actually know the barman code all that
well.

Thanks!

Stephen
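(For reference, a parallel pgbackrest run is roughly as follows; the stanza name is a placeholder and option names have shifted between pgbackrest releases, so check the version actually installed.)

    # Assumes a stanza has already been defined in pgbackrest.conf and
    # created with 'pgbackrest stanza-create'. --process-max is what
    # spreads the copy/compression work across several processes.
    pgbackrest --stanza=main --type=full --process-max=8 backup

    # Later runs can be incremental or differential instead of full:
    pgbackrest --stanza=main --type=incr backup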
Vladimir,

* Vladimir Borodin (root@simply.name) wrote:
> > 20 янв. 2017 г., в 18:06, Stephen Frost <sfrost@snowman.net> написал(а):
> >
> > Right, without incremental or compressed backups, you'd have to have
> > room for 7 full copies of your database. Have you looked at what your
> > incrementals would be like with file-level incrementals and compression?
>
> Most of our DBs can’t use partitioning over time-series fields, so we
> have a lot of datafiles in which only a few pages have been modified.
> So file-level increments didn’t really work for us. And we didn’t use
> compression in barman before patching it because single-threaded
> compression sucks.

Interesting. That's certainly the kind of use-case we are thinking
about for pgbackrest's page-level incremental support. Hopefully it
won't be too much longer before we add support for it.

> > How are you testing your backups..? Do you have page-level checksums
> > enabled on your database?
>
> Yep, we use checksums. We restore latest backup with recovery_target =
> 'immediate' and do COPY tablename TO '/dev/null’ with checking exit code
> for each table in each database (in several threads, of course).

Right, unfortunately that only checks the heap pages, it won't help with
corruption happening in an index file or other files which have a
checksum.

> > pgbackrest recently added the ability to
> > check PG page-level checksums during a backup and report issues.
>
> Sounds interesting, should take a look.

It's done with a C library that's optional and not yet included in the
packages on apt/yum.p.o, though we hope it will be soon. The C library
is based, unsurprisingly, on the PG backend code and so should be
pretty fast. All of the checking is done on whole pgbackrest blocks,
in stream, so it doesn't slow down the backup process too much.

Thanks!

Stephen
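(A rough sketch of that kind of verification pass, assuming the restored cluster has already reached recovery_target = 'immediate' and is accepting connections; connection details are placeholders and the loops run serially rather than in several threads.)

    # Read every table fully so that unreadable heap pages (and checksum
    # failures, where enabled) surface as errors; stop on the first failure.
    for db in $(psql -At -c "SELECT datname FROM pg_database WHERE datallowconn"); do
        for tbl in $(psql -At -d "$db" -c \
            "SELECT quote_ident(schemaname) || '.' || quote_ident(tablename) FROM pg_tables"); do
            psql -d "$db" -c "COPY $tbl TO '/dev/null'" >/dev/null || exit 1
        done
    done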
Hi Dinesh,

Best practice in doing full backup is using RSYNC, but before you can
copy the DATADIR, you might you pg_start_backup to tell the server not
to write into the DATADIR, because you are copying that data. After
finished copy all the data in DATADIR, you can ask server to continue
flushing the data from logs, by commanding pg_stop_backup. Remember,
not to copy the XLOG dir.

There are another way more simpler, which is applying command
pg_basebackup, which actually did that way in simpler version.

if you did pg_dump, you wont get the exact copy of your data, and you
will take longer downtime to recover the backup data. By that way,
recovering is only starting up the postgres with that copy.

Good luck!
Julyanto SUTANDANG
Equnix Business Solutions, PT
(An Open Source and Open Mind Company)
www.equnix.co.id
Pusat Niaga ITC Roxy Mas Blok C2/42. Jl. KH Hasyim Ashari 125, Jakarta Pusat
T: +6221 22866662 F: +62216315281 M: +628164858028
Caution: The information enclosed in this email (and any attachments) may be legally privileged and/or confidential and is intended only for the use of the addressee(s). No addressee should forward, print, copy, or otherwise reproduce this message in any manner that would allow it to be viewed by any individual not originally listed as a recipient. If the reader of this message is not the intended recipient, you are hereby notified that any unauthorized disclosure, dissemination, distribution, copying or the taking of any action in reliance on the information herein is strictly prohibited. If you have received this communication in error, please immediately notify the sender and delete this message. Unless it is made by the authorized person, any views expressed in this message are those of the individual sender and may not necessarily reflect the views of PT Equnix Business Solutions.
Greetings,

* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> Best practice in doing full backup is using RSYNC, but before you can copy
> the DATADIR, you might you pg_start_backup to tell the server not to write
> into the DATADIR, because you are copying that data. After finished copy
> all the data in DATADIR, you can ask server to continue flushing the data
> from logs, by commanding pg_stop_backup. Remember, not to copy the XLOG
> dir.

Whoah. That is not, at all, correct, if I'm understanding what you're
suggesting.

PG most certainly *does* continue to write into the data directory even
after pg_start_backup() has been run. You *must* use archive_command or
pg_receivexlog to capture all of the WAL during the backup to have a
consistent backup.

> There are another way more simpler, which is applying command
> pg_basebackup, which actually did that way in simpler version.

pg_basebackup has options to stream the WAL during the backup to capture
it, which is how it handles that.

> if you did pg_dump, you wont get the exact copy of your data, and you will
> take longer downtime to recover the backup data. By that way, recovering is
> only starting up the postgres with that copy.

pg_dump will generally take longer to do a restore, yes. Recovering
from a backup does require that a recovery.conf exists with a
restore_command that PG can use to get the WAL files it needs, or that
all of the WAL from the backup is in pg_xlog/pg_wal.

Please do not claim that PG stops writing to the DATADIR or BASEDIR
after a pg_start_backup(), that is not correct and could lead to invalid
backups.

Thanks!

Stephen
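(As a concrete illustration of the pg_receivexlog route mentioned above: it is available from 9.2 on, it streams WAL over a replication connection, and the host, user, and directory below are placeholders.)

    # Requires a replication-capable connection (max_wal_senders > 0 and a
    # matching pg_hba.conf entry); writes WAL segment files into the target
    # directory as they are generated on the server.
    pg_receivexlog -h db-host -U replication_user -D /walarchive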
Greetings,

* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> CORRECTION:
>
> "you might you pg_start_backup to tell the server not to write into the
> DATADIR"
>
> become
>
> "you might *use* pg_start_backup to tell the server not to write into the
> *BASEDIR*, actually server still writes but only to XLOGDIR"

Just to make sure anyone reading the mailing list archives isn't
confused, running pg_start_backup does *not* make PG stop writing to
BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
data into BASEDIR after pg_start_backup has been called.

The only thing that pg_start_backup does is identify an entry in the WAL
stream, from which point all WAL must be replayed when restoring the
backup. All WAL generated from that point (pg_start_backup point) until
the pg_stop_backup point *must* be replayed when restoring the backup or
the database will not be consistent.

Thanks!

Stephen
Greetings,

* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> Please elaborate more of what you are saying. What i am saying is based on
> the Official Docs, Forum and our own test. This is what we had to do to
> save time, both backing up and restoring.
>
> https://www.postgresql.org/docs/9.6/static/functions-admin.html
>
> When PostgreSQL in the mode of Start Backup, PostgreSQL only writes to the
> XLOG, then you can safely rsync / copy the base data (snapshot) then later
> you can have full copy of snapshot backup data.

You are confusing two things.

After calling pg_start_backup, you can safely copy the contents of the
data directory, that is correct.

However, PostgreSQL *will* continue to write to the data directory.
That, however, is ok, because those changes will *also* be written into
the WAL and, after calling pg_start_backup(), you collect all of the
WAL using archive_command or pg_receivexlog. With all of the WAL
which was created during the backup, PG will be able to recover from the
changes made during the backup to the data directory, but you *must*
have all of that WAL, or the backup will be inconsistent because of
those changes that were made to the data directory after
pg_start_backup() was called.

In other words, if you aren't using pg_receivexlog or archive_command,
your backups are invalid.

> if you wanted to backup in later day, you can use rsync then it will copy
> faster because rsync only copy the difference, rather than copy all the
> data.

This is *also* incorrect. rsync, by itself, is *not* safe to use for
doing that kind of incremental backup, unless you enable checksums. The
reason for this is that rsync has only a 1-second level granularity and
it is possible (unlikely, though it has been demonstrated) to miss
changes made to a file within that 1-second window.

> my latter explanation is: use pg_basebackup, it will do it automatically
> for you.

Yes, if you are unsure about how to perform a safe backup properly,
using pg_basebackup or one of the existing backup tools is, by far, the
best approach. Attempting to roll your own backup system based on rsync
is not something I am comfortable recommending any more because it is
*not* simple to do correctly.

Thanks!

Stephen
Greetings,
* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> Thanks for elaborating this Information, this is new, so whatever it is the
> procedure is *Correct and Workable*.
Backups are extremely important, so I get quite concerned when people
provide incorrect information regarding them.
> > With all of the WAL
> > which was created during the backup, PG will be able to recover from the
> > changes made during the backup to the data directory, but you *must*
> > have all of that WAL, or the backup will be inconsistent because of
>
> That is rather out of question, because all what we discuss here is just
> doing full/snapshot backup.
It's unclear what you mean by 'out of question' or why you believe that
it matters if it's a full backup or not.
Any backup of PG *must* include all of the WAL that was created during
the backup.
> The backup is Full Backup or Snapshot and it will work whenever needed.
> We are not saying about Incremental Backup yet.
> Along with collecting the XLOG File, you can have incremental backup and
> having complete continuous data backup.
> in this case, Stephen is suggesting on using pg_receivexlog or
> archive_command
> (everything here is actually explained well on the docs))
No, that is not correct. You must have the WAL for a full backup as
well. If I understand what you're suggesting, it's that WAL is only for
point-in-time-recovery, but that is *not* correct, WAL is required for
restoring a full backup to a consistent state.
> those changes that were made to the data directory after
> > pg_start_backup() was called.
> >
> > In other words, if you aren't using pg_receivexlog or archive_command,
> > your backups are invalid.
> >
> I doubt that *invalid* here is a valid word
> In term of snapshot backup and as long as the snapshot can be run, that is
> valid, isn't it?
It's absolutely correct, you must have the WAL generated during your
backup or the backup is invalid.
If, what you mean by 'snapshot' is a *full-system atomic snapshot*,
provided by some layer lower than PostgreSQL that is *exactly* as if the
machine was physically turned off all at once, then, and *only* then,
can you be guaranteed that PG will be able to recover, but the reason
for that is because PG will go back to the last checkpoint that
happened, as recorded in pg_control, and replay all of the WAL in the
pg_xlog/pg_wal directory, which must all exist and be complete for all
committed transaction because the WAL was sync'd to disk before the
commit was acknowledged and the WAL is not removed until after a
checkpoint has completed which has sync'd the data in the data directory
out to the filesystem.
That's also known as 'crash recovery' and it works precisely because all
of the WAL is available at the time of the event and we have a known
point to go back to (the checkpoint).
During a backup, multiple checkpoints can occur and WAL will be removed
from the pg_xlog/pg_wal directory during the backup; WAL which is
critical to the consistency of the database and which must be retained
by the user because it must be used to perform WAL replay of the
database when restoring from the backup which was made.
> > if you wanted to backup in later day, you can use rsync then it will copy
> > > faster because rsync only copy the difference, rather than copy all the
> > > data.
> >
> > This is *also* incorrect. rsync, by itself, is *not* safe to use for
> > doing that kind of incremental backup, unless you enable checksums. The
> > reason for this is that rsync has only a 1-second level granularity and
> > it is possible (unlikely, though it has been demonstrated) to miss
> > changes made to a file within that 1-second window.
>
> As long as that is not XLOG file, anyway.. as you are saying that wouldn't
> be a problem since actually we can run the XLOG for recovery. .
No, that's also not correct, unless you keep all WAL since the *first*
full backup.
The 1-second window concern is regarding the validity of a subsequent
incremental backup.
This is what happens, more-or-less:
1- File datadir/A is copied by rsync
2- backup starts, user retains all WAL during backup #1
3- File datadir/A is copied by rsync in the same second as backup
started
4- File datadir/A is *subsequently* modified by PG and the data is
written out to the filesystem, still within the same second as when
the backup started
5- The rsync finishes, the backup finishes, all WAL for backup #1 is
retained, which includes the changes made to datadir/A during the
backup. Everything is fine at this point for backup #1.
6- A new, incremental, backup is started, called backup #2.
7- rsync does *not* copy the file datadir/A because it was not
subsequently changed by the user and the timestamp is the same,
according to rsync's 1-second-level granularity.
8- The WAL for backup #2 is retained, but it does not contain any of the
changes which were made to datadir/A because *those* changes are in
the WAL which was written out during backup #1
9- backup #2 completes, with its WAL retained
10- At this point, backup #2 is an invalid backup.
This is not hypothetical, it's been shown to be possible to have this
happen.
(side-note: this is all from memory, so perhaps there's a detail or two
incorrect, but this is the gist of the issue)
> > > my latter explanation is: use pg_basebackup, it will do it automatically
> > > for you.
> >
> > Yes, if you are unsure about how to perform a safe backup properly,
> > using pg_basebackup or one of the existing backup tools is, by far, the
> > best approach. Attempting to roll your own backup system based on rsync
> > is not something I am comfortable recommending any more because it is
> > *not* simple to do correctly.
>
> OK, that is fine, and actually we are using that.
You must be sure to use one of the methods with pg_basebackup that keeps
all of the WAL created during the full backup. That would be one of:
pg_basebackup -x, pg_basebackup -X stream, or pg_basebackup +
pg_receivexlog.
> the reason why i explain about start_backup and stop_backup is to give a
> gradual understand, and hoping that people will get the mechanism in the
> back understandable.
I'm more than happy to have people explaining about
pg_start/stop_backup, but I do have an issue when the explanation is
incorrect and could cause a user to use a backup method which will
result in an invalid backup.
Thanks!
Stephen
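(A minimal sketch of those pg_basebackup variants; host, user, and target directory are placeholders.)

    # On 9.2 and later, -X stream opens a second connection and streams the
    # WAL generated during the backup into the backup itself, making it
    # self-contained.
    pg_basebackup -h db-host -U replication_user -D /backup/base -X stream -P

    # On 9.1, -X stream is not available; -x fetches the required WAL at the
    # end of the backup instead.
    pg_basebackup -h db-host -U replication_user -D /backup/base -x -P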
Vladimir,

* Vladimir Borodin (root@simply.name) wrote:
> > 20 янв. 2017 г., в 19:59, Stephen Frost <sfrost@snowman.net> написал(а):
> >>> How are you testing your backups..? Do you have page-level checksums
> >>> enabled on your database?
> >>
> >> Yep, we use checksums. We restore latest backup with recovery_target =
> >> 'immediate' and do COPY tablename TO '/dev/null’ with checking exit code
> >> for each table in each database (in several threads, of course).
> >
> > Right, unfortunately that only checks the heap pages, it won't help with
> > corruption happening in an index file or other files which have a
> > checksum.
>
> That’s fine for us because indexes could be rebuilt. The main idea is the
> guarantee that data would not be lost.

Fair enough, however, if you don't check that the indexes are valid then
you could end up with corruption in the database if you start operating
against such a recovered database.

Consider what happens on a 'primary key' lookup or insert- if the index
doesn't find a conflicting tuple (perhaps because the index is corrupt),
then it will happily allow the INSERT to go through, even though it
should have been prevented.

Indexes are a *really* important component to having a valid database.
If you aren't checking the validity of them, then you're running the
risk of being exposed to corruption in them, either ongoing or when
restoring. If you *always* rebuild your indexes when restoring from a
backup, then you should be fine, of course, but if you're going to do
that then you might consider using pg_dump instead, which would do that
and validate all foreign key references too.

Thanks!

Stephen
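(If one does go the rebuild-everything-after-restore route, a blunt but simple form of it might be the following; the database name is a placeholder and the rebuild takes locks and can run a long time on big tables.)

    # Rebuild every index in the freshly restored database:
    psql -d mydb -c "REINDEX DATABASE mydb;"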
Hello,

>>> Increments in pgbackrest are done on file level which is not really
>>> efficient. We have done parallelism, compression and page-level
>>> increments (9.3+) in barman fork [1], but unfortunately guys from
>>> 2ndquadrant-it don't hurry to work on it.
>>
>> We're looking at page-level incremental backup in pgbackrest also. For
>> larger systems, we've not heard too much complaining about it being
>> file-based though, which is why it hasn't been a priority. Of course,
>> the OP is on 9.1 too, so.
>
> Well, we have forked barman and made everything from the above just
> because we needed ~ 2 PB of disk space for storing backups for our ~ 300
> TB of data. (Our recovery window is 7 days.) And on a 5 TB database it
> took a lot of time to make/restore a backup.

I just have around 11 TB but switched to ZFS-based backups only. I'm
using snapshots for that, which gives some flexibility. I can roll them
back, I can clone one and work with a full copy as a different cluster
(only the differences are stored), and I can send them incrementally to
other servers. This works very well for my use cases, but of course it
doesn't fit everything.

Greetings,
Torsten
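A hedged sketch of such a ZFS workflow (dataset and host names are invented, and the snapshot must cover the entire instance, data and WAL, atomically; see the caveat later in the thread):

    zfs snapshot tank/pgdata@2017-01-23                        # point-in-time snapshot of the cluster
    zfs rollback tank/pgdata@2017-01-23                        # roll the live dataset back if needed
    zfs clone tank/pgdata@2017-01-23 tank/pgdata-testclone     # full copy to start as a separate cluster
    zfs send -i tank/pgdata@2017-01-22 tank/pgdata@2017-01-23 | ssh backuphost zfs receive backup/pgdata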
On 1/20/17 9:06 AM, Stephen Frost wrote:
> All the pages are the same size, so I'm surprised you didn't consider
> just having a format along the lines of: magic+offset+page,
> magic+offset+page, magic+offset+page, etc...

Keep in mind that if you go that route you need to accommodate BLKSZ <> 8192.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
* Jim Nasby (Jim.Nasby@BlueTreble.com) wrote:
> On 1/20/17 9:06 AM, Stephen Frost wrote:
> > All the pages are the same size, so I'm surprised you didn't consider
> > just having a format along the lines of: magic+offset+page,
> > magic+offset+page, magic+offset+page, etc...
>
> Keep in mind that if you go that route you need to accommodate BLKSZ
> <> 8192.

If you want my 2c on that, running with BLKSZ <> 8192 is playing with
fire, or at least running with scissors.

That said, yes, the code should either barf when BLKSZ <> 8192 in a very
clear way early on, or handle it correctly, and be tested with such
configurations.

Thanks!

Stephen
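For what it's worth, the compiled-in block size can be checked rather than assumed:

    pg_controldata "$PGDATA" | grep 'block size'   # e.g. "Database block size: 8192"
    psql -At -c "SHOW block_size;"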
On 1/22/17 11:32 AM, Stephen Frost wrote:
> The 1-second window concern is regarding the validity of a subsequent
> incremental backup.

BTW, there's a simpler scenario here:

Postgres touches file.
rsync notices file has different timestamp, starts copying.
Postgres touches file again.

If those 3 steps happen in the same second, you now have an invalid
backup. There's probably other scenarios as well.

In short, if you're using rsync, it's *critical* that you give it the
--checksum option, which tells rsync to ignore file size and timestamp.

>>>> my latter explanation is: use pg_basebackup, it will do it automatically
>>>> for you.
>>> Yes, if you are unsure about how to perform a safe backup properly,
>>> using pg_basebackup or one of the existing backup tools is, by far, the
>>> best approach. Attempting to roll your own backup system based on rsync
>>> is not something I am comfortable recommending any more because it is
>>> *not* simple to do correctly.
>> OK, that is fine, and actually we are using that.
> You must be sure to use one of the methods with pg_basebackup that keeps
> all of the WAL created during the full backup. That would be one of:
> pg_basebackup -x, pg_basebackup -X stream, or pg_basebackup +
> pg_receivexlog.
>
>> the reason why I explain about start_backup and stop_backup is to give a
>> gradual understanding, hoping that people will get the mechanism behind
>> it.
> I'm more than happy to have people explaining about
> pg_start/stop_backup, but I do have an issue when the explanation is
> incorrect and could cause a user to use a backup method which will
> result in an invalid backup.

The other *critical* thing with PITR backups: you must test EVERY backup
that you take. No test == no backup. There's far, far too many things
that can go wrong, especially if you're rolling your own tool.

The complexities around PITR are why I always recommend also using
pg_dump on a periodic (usually weekly) basis as part of your full DR
strategy. You'll probably never use the pg_dump backups, but (in most
cases) they're a really cheap insurance policy.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
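Roughly, for anyone following along (host and paths are placeholders, and this still has to be wrapped in pg_start_backup/pg_stop_backup plus WAL archiving):

    rsync -a --checksum --delete "$PGDATA"/ backuphost:/backups/pgdata/
    # and, separately, a periodic logical backup as the cheap insurance policy:
    pg_dump -Fc mydb -f /backups/mydb-$(date +%Y%m%d).dump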
On 1/23/17 9:27 AM, Stephen Frost wrote:
> If you want my 2c on that, running with BLKSZ <> 8192 is playing with
> fire, or at least running with scissors.

I've never seen it myself, but I'm under the impression that it's not
unheard of for OLAP environments. Given how sensitive PG is to IO
latency, a larger block size could theoretically mean a big performance
improvement in some scenarios.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)
Greetings,
* julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> CORRECTION:
>
> "you might you pg_start_backup to tell the server not to write into the
> DATADIR"
>
> become
>
> "you might *use* pg_start_backup to tell the server not to write into the
> *BASEDIR*, actually server still writes but only to XLOGDIR "
Just to make sure anyone reading the mailing list archives isn't
confused, running pg_start_backup does *not* make PG stop writing to
BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
data into BASEDIR after pg_start_backup has been called.
The only thing that pg_start_backup does is identify an entry in the WAL
stream, from which point all WAL must be replayed when restoring the
backup. All WAL generated from that point (pg_start_backup point) until
the pg_stop_backup point *must* be replayed when restoring the backup or
the database will not be consistent.
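To make that concrete, an illustrative low-level sequence (not a recommendation to roll your own; every WAL segment produced between the two calls must be archived and kept):

    psql -c "SELECT pg_start_backup('nightly', true);"
    rsync -a --checksum "$PGDATA"/ backuphost:/backups/base/   # PG keeps writing to base/ the whole time
    psql -c "SELECT pg_stop_backup();"
    # the copy is only usable together with the WAL archived between those two calls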
Greetings,

* Torsten Zuehlsdorff (mailinglists@toco-domains.de) wrote:
> I just have around 11 TB but switched to ZFS-based backups only. I'm
> using snapshots for that, which gives some flexibility. I can roll them
> back, I can clone one and work with a full copy as a different cluster
> (only the differences are stored), and I can send them incrementally to
> other servers. This works very well for my use cases, but of course it
> doesn't fit everything.

While that's another approach, it does require that those snapshots are
performed correctly (both by you and by ZFS) and are entirely atomic to
the entire PG instance. For example, I don't believe ZFS snapshots will
be atomic if multiple ZFS filesystems on independent ZFS pools are being
used underneath a single PG instance.

And, as others have also said, always test, test, test.

Thanks!

Stephen
* Jeff Janes (jeff.janes@gmail.com) wrote:
> On Sun, Jan 22, 2017 at 6:57 AM, Stephen Frost <sfrost@snowman.net> wrote:
> > Just to make sure anyone reading the mailing list archives isn't
> > confused, running pg_start_backup does *not* make PG stop writing to
> > BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
> > data into BASEDIR after pg_start_backup has been called.
>
> Correct. Unfortunately it is a very common myth that it does cause
> PostgreSQL to stop writing to the base dir.

I would love a way to dispel that myth. :/

If you have any suggestions of how we could improve the docs, I'd
certainly be happy to take a look and try to help.

> > The only thing that pg_start_backup does is identify an entry in the WAL
> > stream, from which point all WAL must be replayed when restoring the
> > backup. All WAL generated from that point (pg_start_backup point) until
> > the pg_stop_backup point *must* be replayed when restoring the backup or
> > the database will not be consistent.
>
> pg_start_backup also forces full_page_writes to be effectively 'on' for the
> duration of the backup, if it is not already explicitly on (which it
> usually will already be). This affects pg_xlog, of course, not base. But
> it is an essential step for people who run with full_page_writes=off, as it
> ensures that anything in base which got changed mid-copy will be fixed up
> during replay of the WAL.

Agreed.

Thanks!

Stephen
On 23 January 2017 at 17:12, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Just to make sure anyone reading the mailing list archives isn't
>> confused, running pg_start_backup does *not* make PG stop writing to
>> BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
>> data into BASEDIR after pg_start_backup has been called.
>
> Correct. Unfortunately it is a very common myth that it does cause
> PostgreSQL to stop writing to the base dir.

Never heard that one before. Wow. Who's been saying that?

It's taken me years to hunt down all invalid backup memes and terminate them.

Never fails to surprise me how many people don't read the docs.

--
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
* Jeff Janes (jeff.janes@gmail.com) wrote:
> On Mon, Jan 23, 2017 at 7:28 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> > On 1/22/17 11:32 AM, Stephen Frost wrote:
> >> The 1-second window concern is regarding the validity of a subsequent
> >> incremental backup.
> >
> > BTW, there's a simpler scenario here:
> >
> > Postgres touches file.
> > rsync notices file has different timestamp, starts copying.
> > Postgres touches file again.
> >
> > If those 3 steps happen in the same second, you now have an invalid
> > backup. There's probably other scenarios as well.

Ah, yeah, I think the outline I had was why we decided that even a file
with the same timestamp as the start of the backup couldn't be trusted.

> To be clear, you don't have an invalid backup *now*, as replay of the WAL
> will fix it up. You will have an invalid backup next time you take a
> backup, using a copy of the backup you just took now as the rsync
> destination of that future backup.

Correct.

> If you were to actually fire up a copy of the backup and go through
> recovery, then shut it down, and then use that post-recovery copy as the
> destination of the rsync, would that eliminate the risk (barring clock skew
> between systems)?

I believe it would *change* things, but not eliminate the risk- consider
this: what's the timestamp going to be on the files that were modified
through WAL recovery? It would be *after* the backup was done. I
believe (but not sure) that rsync will still copy the file if there's
any difference in timestamp, but it's technically possible that you
could get really unlucky and have the same post-backup timestamp as the
file ends up having when the following backup is done, meaning that the
file isn't copied even though its contents are no longer the same (the
primary server's copy has whatever was written to that file in the same
second that the restored server was writing the WAL replay into the
file).

Admittedly, that's pretty unlikely, but it's not impossible and that's
where you can get into *serious* trouble because it becomes darn near
impossible to figure out what the heck went wrong, and that's just not
cool with backups. Do it properly, or use something that does. This
isn't where you want to be playing fast-and-loose.

> > In short, if you're using rsync, it's *critical* that you give it the
> > --checksum option, which tells rsync to ignore file size and timestamp.
>
> Which unfortunately obliterates much of the point of using rsync for many
> people. You can still save on bandwidth, but not on local IO on each end.

No, it means that rsync is *not* a good tool for doing incremental
backups of PG. Would be great if we could get more people to understand
that.

'cp' is an equally inappropriate and bad tool for doing WAL archiving,
btw. Would be great if our docs were clear on that.

Thanks!

Stephen
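For example, instead of a bare 'cp', postgresql.conf can point at a dedicated archiver; the stanza name below is made up:

    archive_mode = on
    archive_command = 'pgbackrest --stanza=main archive-push %p'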
* Simon Riggs (simon@2ndquadrant.com) wrote:
> On 23 January 2017 at 17:12, Jeff Janes <jeff.janes@gmail.com> wrote:
> >> Just to make sure anyone reading the mailing list archives isn't
> >> confused, running pg_start_backup does *not* make PG stop writing to
> >> BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
> >> data into BASEDIR after pg_start_backup has been called.
> >
> > Correct. Unfortunately it is a very common myth that it does cause
> > PostgreSQL to stop writing to the base dir.
>
> Never heard that one before. Wow. Who's been saying that?

Well, this conversation started because of such a comment, so at least
one person on this thread (though I believe that's been clarified
sufficiently now).

> It's taken me years to hunt down all invalid backup memes and terminate them.

A never-ending and thankless task, so, my thanks to you for your
efforts. :)

> Never fails to surprise me how many people don't read the docs.

+1MM.

Thanks again!

Stephen
Will do that to inform others about this matter.
Julyanto SUTANDANG
Equnix Business Solutions, PT
(An Open Source and Open Mind Company)
www.equnix.co.id
Pusat Niaga ITC Roxy Mas Blok C2/42. Jl. KH Hasyim Ashari 125, Jakarta Pusat
T: +6221 22866662 F: +62216315281 M: +628164858028
Caution: The information enclosed in this email (and any attachments) may be legally privileged and/or confidential and is intended only for the use of the addressee(s). No addressee should forward, print, copy, or otherwise reproduce this message in any manner that would allow it to be viewed by any individual not originally listed as a recipient. If the reader of this message is not the intended recipient, you are hereby notified that any unauthorized disclosure, dissemination, distribution, copying or the taking of any action in reliance on the information herein is strictly prohibited. If you have received this communication in error, please immediately notify the sender and delete this message. Unless it is made by the authorized person, any views expressed in this message are those of the individual sender and may not necessarily reflect the views of PT Equnix Business Solutions.
On Sun, Jan 22, 2017 at 6:57 AM, Stephen Frost <sfrost@snowman.net> wrote:
> Greetings,
>
> * julyanto SUTANDANG (julyanto@equnix.co.id) wrote:
> > CORRECTION:
> >
> > "you might you pg_start_backup to tell the server not to write into the
> > DATADIR"
> >
> > become
> >
> > "you might *use* pg_start_backup to tell the server not to write into the
> > *BASEDIR*, actually server still writes but only to XLOGDIR "
>
> Just to make sure anyone reading the mailing list archives isn't
> confused, running pg_start_backup does *not* make PG stop writing to
> BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
> data into BASEDIR after pg_start_backup has been called.

Correct. Unfortunately it is a very common myth that it does cause
PostgreSQL to stop writing to the base dir.

> The only thing that pg_start_backup does is identify an entry in the WAL
> stream, from which point all WAL must be replayed when restoring the
> backup. All WAL generated from that point (pg_start_backup point) until
> the pg_stop_backup point *must* be replayed when restoring the backup or
> the database will not be consistent.

pg_start_backup also forces full_page_writes to be effectively 'on' for
the duration of the backup, if it is not already explicitly on (which it
usually will already be). This affects pg_xlog, of course, not base.
But it is an essential step for people who run with full_page_writes=off,
as it ensures that anything in base which got changed mid-copy will be
fixed up during replay of the WAL.

Cheers,

Jeff
On Mon, Jan 23, 2017 at 9:43 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 23 January 2017 at 17:12, Jeff Janes <jeff.janes@gmail.com> wrote:
> >> Just to make sure anyone reading the mailing list archives isn't
> >> confused, running pg_start_backup does *not* make PG stop writing to
> >> BASEDIR (or DATADIR, or anything, really). PG *will* continue to write
> >> data into BASEDIR after pg_start_backup has been called.
> >
> > Correct. Unfortunately it is a very common myth that it does cause
> > PostgreSQL to stop writing to the base dir.
>
> Never heard that one before. Wow. Who's been saying that?
>
> It's taken me years to hunt down all invalid backup memes and terminate them.
>
> Never fails to surprise me how many people don't read the docs.

I've seen it on stackexchange, a few times on the pgsql mailing lists,
and talking to people in person. I've never traced it back to some
"authoritative" source making the claim; I think many people just
independently think up "How would I implement pg_start_backup if I were
doing it?", come up with the same false conclusion, and then all
reinforce each other.

I don't think the docs are particularly clear on this. There is the
comment "Some file system backup tools emit warnings or errors if the
files they are trying to copy change while the copy proceeds. When
taking a base backup of an active database, this situation is normal and
not an error", but the reader could think that comment applies to any of
the files in the data directory (in particular, pg_xlog), and could
think that it doesn't apply specifically to the files under base/. In
other words, once they form the wrong understanding, the docs (if read)
don't force them to change it, as they could interpret them in ways that
are consistent with it.

Of course the docs aren't a textbook and aren't trying to fully describe
the theory of operation; they just give people a recipe they can follow.
But people will make inferences from that recipe anyway. I don't know if
it is worth trying to preemptively dispel these mistakes in the docs.

Cheers,

Jeff
Greetings,

* Rick Otten (rottenwindfish@gmail.com) wrote:
> Actually, I think this may be the way Oracle Hot Backups work. It was my
> impression that feature temporarily suspends writes into a specific
> tablespace so you can take a snapshot of it. It has been a few years since
> I've had to do Oracle work though and I could be mis-remembering. People
> may be confusing Oracle and PostgreSQL.

Yes, that thought has occurred to me as well; in some other database
systems you can ask for the system to be quiesced.

Thanks!

Stephen
On 1/20/17 10:06 AM, Stephen Frost wrote:
> Ah, yes, I noticed that you passed over the file twice but wasn't quite
> sure what functools.partial() was doing and a quick read of the docs
> made me think you were doing seeking there.
>
> All the pages are the same size, so I'm surprised you didn't consider
> just having a format along the lines of: magic+offset+page,
> magic+offset+page, magic+offset+page, etc...
>
> I'd have to defer to David on this, but I think he was considering
> having some kind of a bitmap to indicate which pages changed instead
> of storing the full offset as, again, all the pages are the same size.

I have actually gone through a few different ideas (including both of
the above) and haven't settled on anything final yet.

Most of the ideas I've come up with so far are more optimal for backup
performance, but I would rather bias towards restores, which tend to be
more time sensitive. The ideal solution would be something that works
well for both.

--
-David
david@pgmasters.net
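Not the pgbackrest format, just a toy illustration of addressing fixed-size pages by block number, assuming the default 8192-byte block size and made-up relation file names:

    # copy out block 42 of one relation segment
    dd if="$PGDATA"/base/16384/16385 bs=8192 skip=42 count=1 of=/tmp/block42.bin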