Thread: Off-site storage for PITR logs
Hi, I am planning an off-site backup solution for a fairly busy (mostly-write) 8.3 database. The database is currently about 200GB in size. I though about using log shipping and cold standby since it's easy in terms of administration and also offers point in time recovery. The database generates about 3 PITR log files per minute. If my calculations are correct the sites need to be connected with a 7MBit connection and the logs will need about 68GB of storage per day! Does anyone have any suggestions on how to significantly reduce the volume of log files or recommend another off-site backup solution that would require less bandwidth and storage? Thanks. -- Regards, Nicos Panayides IT Manager Magneta Technologies Ltd Tel: +357 22721919, 22317400 Fax: +357 22721917 Web: http://www.magneta.eu
Nicos Panayides wrote: > Hi, > I am planning an off-site backup solution for a fairly busy > (mostly-write) 8.3 database. The database is currently about 200GB in > size. I though about using log shipping and cold standby since it's easy > in terms of administration and also offers point in time recovery. > > The database generates about 3 PITR log files per minute. If my > calculations are correct the sites need to be connected with a 7MBit > connection and the logs will need about 68GB of storage per day! > > Does anyone have any suggestions on how to significantly reduce the > volume of log files or recommend another off-site backup solution that > would require less bandwidth and storage? I would use pg_lesslog to reduce the size of the WAL files: http://pgfoundry.org/projects/pglesslog/ However, there is a bug in pg_lesslog so I wouldn't use it until that is fixed: http://archives.postgresql.org/pgsql-announce/2010-02/msg00005.php Koichi, do you have any update on this? I don't see that a new version has been uploaded, and I also see no mention on the pgfoundry site about the bug. -- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do
Nicos Panayides <nicos@magneta.com.cy> wrote: > The database generates about 3 PITR log files per minute. If my > calculations are correct the sites need to be connected with a > 7MBit connection and the logs will need about 68GB of storage per > day! I'd have put the minimum line speed at 8Mb/second, but I figure 10 bits per byte to allow for parity and packet overhead, and the assumption that I might want to use the line for something else (like monitoring) at the same time; but yeah, those numbers are in the right ballpark. > Does anyone have any suggestions on how to significantly reduce > the volume of log files or recommend another off-site backup > solution that would require less bandwidth and storage? We stream the WAL files through gzip in our archive script. That reduces them to 4MB to 8MB during normal usage, and 12MB to 15MB during our routine database vacuums. If you have relatively idle periods during which the log is pushed by archive_timeout, you should filter through pg_clearxlogtail or pglesslog before using gzip; the former yields a 16kB file during idle periods. (I'm not sure about the other, but it should be capable of getting even smaller.) You'll need to determine how far back it's valuable to have the "point in time recovery" capabilities. We do weekly base backups and keep WAL files to restore forward from the earlier to current, but then keep monthly "archival" backups, where each base backup is kept with just the WAL files needed to successfully restore it. -Kevin
Sorry for that. The bug was incorrect calculation of XNOOP record size in GiST WAL when incremental log is created from the full backup. Because the bug impact is so serious, I'm rebuilding the test environment which is taking long. I'll report the cause of the bug to hackers and bugs. If GiST is not used, WAL compression works correctly but I'm not comfortable to use lesslog with known bug. ---------- Koichi Suzuki 2010/3/13 Bruce Momjian <bruce@momjian.us>: > Nicos Panayides wrote: >> Hi, >> I am planning an off-site backup solution for a fairly busy >> (mostly-write) 8.3 database. The database is currently about 200GB in >> size. I though about using log shipping and cold standby since it's easy >> in terms of administration and also offers point in time recovery. >> >> The database generates about 3 PITR log files per minute. If my >> calculations are correct the sites need to be connected with a 7MBit >> connection and the logs will need about 68GB of storage per day! >> >> Does anyone have any suggestions on how to significantly reduce the >> volume of log files or recommend another off-site backup solution that >> would require less bandwidth and storage? > > I would use pg_lesslog to reduce the size of the WAL files: > > http://pgfoundry.org/projects/pglesslog/ > > However, there is a bug in pg_lesslog so I wouldn't use it until that is > fixed: > > http://archives.postgresql.org/pgsql-announce/2010-02/msg00005.php > > Koichi, do you have any update on this? I don't see that a new version > has been uploaded, and I also see no mention on the pgfoundry site about > the bug. > > -- > Bruce Momjian <bruce@momjian.us> http://momjian.us > EnterpriseDB http://enterprisedb.com > > PG East: http://www.enterprisedb.com/community/nav-pg-east-2010.do >
On Fri, 2010-03-12 at 18:40 +0200, Nicos Panayides wrote: > I am planning an off-site backup solution for a fairly busy > (mostly-write) 8.3 database. The database is currently about 200GB in > size. I though about using log shipping and cold standby since it's easy > in terms of administration and also offers point in time recovery. > > The database generates about 3 PITR log files per minute. If my > calculations are correct the sites need to be connected with a 7MBit > connection and the logs will need about 68GB of storage per day! > > Does anyone have any suggestions on how to significantly reduce the > volume of log files or recommend another off-site backup solution that > would require less bandwidth and storage? Consider turning off full_page_writes which will permanently minimise the amount of WAL traffic, as well as being an integrated and fully-supported option with Postgres. You will need additional non-volatile cache on your disks to ensure that is a safe option, though that will yield performance advantages also. -- Simon Riggs www.2ndQuadrant.com
Hi Simon, thanks for the suggestion. What kind of space savings should I expected from turning off full_page_writes? The servers have battery-backed write-caches for the disk controllers, so it should be safe to give this a try. Does turning off full_page_writes have any other effect on the database? Regards, Nicos Panayides IT Manager Magneta Technologies Ltd Tel: +357 22721919, 22317400 Fax: +357 22721917 Web: http://www.magneta.eu On 14/03/2010 10:50, Simon Riggs wrote: > On Fri, 2010-03-12 at 18:40 +0200, Nicos Panayides wrote: > >> I am planning an off-site backup solution for a fairly busy >> (mostly-write) 8.3 database. The database is currently about 200GB in >> size. I though about using log shipping and cold standby since it's easy >> in terms of administration and also offers point in time recovery. >> >> The database generates about 3 PITR log files per minute. If my >> calculations are correct the sites need to be connected with a 7MBit >> connection and the logs will need about 68GB of storage per day! >> >> Does anyone have any suggestions on how to significantly reduce the >> volume of log files or recommend another off-site backup solution that >> would require less bandwidth and storage? > > Consider turning off full_page_writes which will permanently minimise > the amount of WAL traffic, as well as being an integrated and > fully-supported option with Postgres. You will need additional > non-volatile cache on your disks to ensure that is a safe option, though > that will yield performance advantages also. >
On Mon, 2010-03-15 at 11:27 +0200, Nicos Panayides wrote: > thanks for the suggestion. What kind of space savings should I expected > from turning off full_page_writes? Substantial, though you should measure it and see, since it is workload dependent. -- Simon Riggs www.2ndQuadrant.com
Dear all, I have been testing PITR and I have noticed a 20% increase on the load of the Disk subsystem. Is that a normal thing to expect? Have you come across this or have you noticed this increase? I know some of you guys have been using it for quite some time, would you mind in sharing your experiences, please? Thank you very much Best regards Renato Renato Oliveira Systems Administrator e-mail: renato.oliveira@grant.co.uk Tel: +44 (0)1763 260811 Fax: +44 (0)1763 262410 http://www.grant.co.uk/ Grant Instruments (Cambridge) Ltd Company registered in England, registration number 658133 Registered office address: 29 Station Road, Shepreth, CAMBS SG8 6GB UK P Please consider the environment before printing this email CONFIDENTIALITY: The information in this e-mail and any attachments is confidential. It is intended only for the named recipients(s).If you are not the named recipient please notify the sender immediately and do not disclose the contents toanother person or take copies. VIRUSES: The contents of this e-mail or attachment(s) may contain viruses which could damage your own computer system. WhilstGrant Instruments (Cambridge) Ltd has taken every reasonable precaution to minimise this risk, we cannot accept liabilityfor any damage which you sustain as a result of software viruses. You should therefore carry out your own viruschecks before opening the attachment(s). OpenXML: For information about the OpenXML file format in use within Grant Instruments please visit our http://www.grant.co.uk/Support/openxml.html
Renato Oliveira <renato.oliveira@grant.co.uk> wrote: > I have been testing PITR and I have noticed a 20% increase on the > load of the Disk subsystem. > > Is that a normal thing to expect? It depends on your workload and archive script, but that doesn't seem too shocking. For one thing, turning on archiving causes more to be written to the WAL files, so you should expect some increase just by enabling archiving, even if your archive script just consists of 'exit 0'. Reading the WAL file in the archive script will often be reading from OS cache, but not necessarily. If you're copying the WAL files to a local file system (which often makes sense), that obviously increases I/O. If there's enough delay between the archive script writing the file to a local drive and something (e.g., rsync) picking it up, it might need to read from the disk, and rsync is likely to do some logging. I would be surprised if you couldn't measure any increase, and 20% is certainly within a reasonable range, depending on all the above. If you need to try to minimize the impact, besides considering all of the above, you might want to look at the full_page_writes configuration option. http://www.postgresql.org/docs/8.4/interactive/runtime-config-wal.html#GUC-FULL-PAGE-WRITES -Kevin
On Tue, 2010-03-16 at 10:05 +0000, Renato Oliveira wrote: > I have been testing PITR and I have noticed a 20% increase on the load of the Disk subsystem. > > Is that a normal thing to expect? Have you come across this or have you noticed this increase? > > I know some of you guys have been using it for quite some time, would you mind in sharing your experiences, please? I think you need to explain what things you've been testing and how. There are some well documented bulk data load optimizations that we cannot take advantage of when running archiving, but apart from those the load is actually very low in normal running. -- Simon Riggs www.2ndQuadrant.com
Hi everybody.
We are trying to migrate a gforge (again) with postgres 8.1/2 to postgres 8.4.
We have adapted our schema in order to use tsearch2 from contrib folder for not having to rewrite all functions. We have cleaned most of old tsearch2 stuff, but some tsearch2 functions have dissapeared or at least we don't know how to invoke with postgres 8.4
Theses functions are:
exectsq
rexectsq
concat
I've been unable to locate them at pg_catalog. I've not found reference in tsearch2 documentation for exectsq and rexectsq. But concat should be in place but it's missing!!
Any hint?
Thanks in advance.
We are trying to migrate a gforge (again) with postgres 8.1/2 to postgres 8.4.
We have adapted our schema in order to use tsearch2 from contrib folder for not having to rewrite all functions. We have cleaned most of old tsearch2 stuff, but some tsearch2 functions have dissapeared or at least we don't know how to invoke with postgres 8.4
Theses functions are:
exectsq
rexectsq
concat
I've been unable to locate them at pg_catalog. I've not found reference in tsearch2 documentation for exectsq and rexectsq. But concat should be in place but it's missing!!
Any hint?
Thanks in advance.
=?ISO-8859-1?Q?I=F1igo?= Martinez Lasala <imartinez@vectorsf.com> writes: > We have adapted our schema in order to use tsearch2 from contrib folder > for not having to rewrite all functions. We have cleaned most of old > tsearch2 stuff, but some tsearch2 functions have dissapeared or at least > we don't know how to invoke with postgres 8.4 > Theses functions are: > exectsq > rexectsq > concat None of those functions were ever meant to be invoked directly. Use the corresponding operators (@@ and ||) instead. regards, tom lane
Thank you, Tom. In fact what was failing was the creation of operators || and @@. Both used the procedures concat, exectsq and rexectsq.So we can delete those operator creation sentences. ----- Mensaje original ----- De: "Tom Lane" <tgl@sss.pgh.pa.us> Para: "Iñigo Martinez Lasala" <imartinez@vectorsf.com> CC: "pgsql-admin" <pgsql-admin@postgresql.org> Enviados: Jueves, 18 de Marzo 2010 17:46:46 GMT +01:00 Amsterdam / Berlín / Berna / Roma / Estocolmo / Viena Asunto: Re: [ADMIN] Missing tsearch2 functions =?ISO-8859-1?Q?I=F1igo?= Martinez Lasala <imartinez@vectorsf.com> writes: > We have adapted our schema in order to use tsearch2 from contrib folder > for not having to rewrite all functions. We have cleaned most of old > tsearch2 stuff, but some tsearch2 functions have dissapeared or at least > we don't know how to invoke with postgres 8.4 > Theses functions are: > exectsq > rexectsq > concat None of those functions were ever meant to be invoked directly. Use the corresponding operators (@@ and ||) instead. regards, tom lane
=?utf-8?Q?I=C3=B1igo_Martinez_Lasala?= <imartinez@vectorsf.com> writes: > In fact what was failing was the creation of operators || and @@. Both used the procedures concat, exectsq and rexectsq.So we can delete those operator creation sentences. Actually, it sounds to me like you might want to think about installing the newer version's contrib/tsearch2 first. See http://www.postgresql.org/docs/8.4/static/textsearch-migration.html In any case note that it's not expected that a dump containing the old tsearch2 object definitions will load without any noise. You should expect some errors like this. Just ignore the errors, unless they are preventing loading some object that doesn't belong to the old tsearch2 module. regards, tom lane