Thread: Documenting pglesslog
In thinking about how to communicate to users about reducing continuous archiving storage requirements, I realized we don't mention pglesslog in our official documentation.

The attached patch documents how to use pglesslog and gzip/gunzip to reduce storage requirements. Comments?

Also, I assume pg_lesslog removes the padding we use to make all WAL files 16MB, effectively doing the function of clearxlogtail too, right?

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + If your life is a hard drive, Christ can be your backup. +

Index: doc/src/sgml/backup.sgml
===================================================================
RCS file: /cvsroot/pgsql/doc/src/sgml/backup.sgml,v
retrieving revision 2.121
diff -c -c -r2.121 backup.sgml
*** doc/src/sgml/backup.sgml	9 Nov 2008 17:51:15 -0000	2.121
--- doc/src/sgml/backup.sgml	11 Jan 2009 01:41:12 -0000
***************
*** 1337,1342 ****
--- 1337,1359 ----
      WAL files are part of the same <application>tar</> file.
      Please remember to add error handling to your backup scripts.
     </para>
+
+    <para>
+     If archive storage size is a concern, use <application>pg_compresslog</>,
+     <ulink url="http://pglesslog.projects.postgresql.org"></ulink>, to
+     remove unnecessary <xref linkend="guc-full-page-writes"> and trailing
+     space from the WAL files.  You can then use
+     <application>gzip</application> to further compress the output of
+     <application>pg_compresslog</>:
+ <programlisting>
+ archive_command = 'pg_compresslog %p - | gzip > /var/lib/pgsql/archive/%f'
+ </programlisting>
+     You will then need to use <application>gunzip</> and
+     <application>pg_decompresslog</> during recovery:
+ <programlisting>
+ restore_command = 'gunzip < /mnt/server/archivedir/%f | pg_decompresslog - %p'
+ </programlisting>
+    </para>
    </sect3>

    <sect3 id="backup-scripts">
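A note on the archive_command in the patch: in a plain shell pipeline the exit status is normally that of the last command, so a failing pg_compresslog could go unnoticed as long as gzip succeeds. The following is only a minimal Python sketch of a wrapper that checks the first stage as well, in the spirit of the "add error handling" advice in the patched paragraph. The function name, the temporary-file handling, and the refusal to overwrite are assumptions for illustration; only the `pg_compresslog %p -` invocation is taken from the patch.

```python
import gzip
import os
import shutil
import subprocess
import sys
import tempfile


def archive_segment(filter_cmd, dest_path):
    """Run filter_cmd and gzip its stdout into dest_path.

    Returns 0 only if the command exited with status 0 and the compressed
    file was fully written; returns 1 otherwise.
    """
    if os.path.exists(dest_path):
        return 1  # never overwrite an already-archived segment
    dest_dir = os.path.dirname(dest_path) or "."
    # Write to a temporary file first so a partial archive is never visible.
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as raw, gzip.GzipFile(fileobj=raw, mode="wb") as gz:
            proc = subprocess.Popen(filter_cmd, stdout=subprocess.PIPE)
            shutil.copyfileobj(proc.stdout, gz)
            proc.stdout.close()
            status = proc.wait()
        if status != 0:
            raise OSError("filter command exited with status %d" % status)
        os.rename(tmp_path, dest_path)  # publish only a complete file
        return 0
    except OSError as exc:
        sys.stderr.write("archive failed: %s\n" % exc)
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        return 1
```

Saved as a script (the name archive_wal.py is hypothetical) that ends with `sys.exit(archive_segment(["pg_compresslog", sys.argv[1], "-"], sys.argv[2]))`, this could be called as `archive_command = 'python3 /path/to/archive_wal.py %p /var/lib/pgsql/archive/%f'`. A non-zero exit status tells the server that archiving failed, so it will retry the segment later.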
On Sat, 2009-01-10 at 21:09 -0500, Bruce Momjian wrote:

> Comments?

If this is for backpatching, it makes sense. We should at least wait until sync rep is accepted or rejected and docs written.

In general I don't think we should refer/link to other companies' copyrighted materials in our documentation. That could cause difficulties.

If you're going to do this, then I think you should go through the docs and refer directly to many other commonly used tools that are also on pg_foundry.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Simon Riggs wrote:
> On Sat, 2009-01-10 at 21:09 -0500, Bruce Momjian wrote:
>
> > Comments?
>
> If this is for backpatching, it makes sense. We should at least wait
> until sync rep is accepted or rejected and docs written.

No, it is not for backpatching.

> In general I don't think we should refer/link to other companies'
> copyrighted materials in our documentation. That could cause
> difficulties.

It is BSD licensed. I don't see any copyright issues:

	http://pglesslog.projects.postgresql.org/

> If you're going to do this, then I think you should go through the docs
> and refer directly to many other commonly used tools that are also on
> pg_foundry.

I add items where they fit logically.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
On Sat, 2009-01-10 at 23:38 -0500, Bruce Momjian wrote:

> It is BSD licensed. I don't see any copyright issues:
>
> 	http://pglesslog.projects.postgresql.org/

A licence and copyright are different things. Why do we insist on changing copyright on our code if it is unimportant?

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
On Sun, 2009-01-11 at 03:12 +0000, Simon Riggs wrote:
> On Sat, 2009-01-10 at 21:09 -0500, Bruce Momjian wrote:
>
> > Comments?
>
> If this is for backpatching, it makes sense. We should at least wait
> until sync rep is accepted or rejected and docs written.

Why? Even if sync rep is accepted, pglesslog would still be useful for those who aren't using WAL archiving, right?

> In general I don't think we should refer/link to other companies'
> copyrighted materials in our documentation. That could cause
> difficulties.

What? That seems a bit odd. I see zero problem with linking to the page, especially considering it is an open source project hosted on a PostgreSQL project service.

> If you're going to do this, then I think you should go through the docs
> and refer directly to many other commonly used tools that are also on
> pg_foundry.

Well, having some more information would certainly be useful, but in this particular case I don't know of anything else on pgfoundry that would actually help with the problem Bruce is trying to solve.

Joshua D. Drake

--
PostgreSQL
   Consulting, Development, Support, Training
   503-667-4564 - http://www.commandprompt.com/
   The PostgreSQL Company, serving since 1997
Simon Riggs wrote:
> On Sat, 2009-01-10 at 23:38 -0500, Bruce Momjian wrote:
>
> > It is BSD licensed. I don't see any copyright issues:
> >
> > 	http://pglesslog.projects.postgresql.org/
>
> A licence and copyright are different things. Why do we insist on
> changing copyright on our code if it is unimportant?

Because the BSD copyright has no enforcement, I don't think the copyright holder is important --- look at all the companies that take our code and use it in their commercial products. We rebrand code we accept so it is clear who maintains it; note we take NetBSD code and add our copyright name to theirs and distribute it as our own.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
On Sun, 2009-01-11 at 09:47 -0500, Bruce Momjian wrote:
> Simon Riggs wrote:
> > On Sat, 2009-01-10 at 23:38 -0500, Bruce Momjian wrote:
> >
> > > It is BSD licensed. I don't see any copyright issues:
> >
> > A licence and copyright are different things. Why do we insist on
> > changing copyright on our code if it is unimportant?
>
> Because the BSD copyright has no enforcement

AFAIK there is no such thing as a BSD copyright. There is Copyright and there is a BSD licence, issued by the copyright holder.

In general, IMHO, I don't think it's a good direction to go in to include links to works of other copyright holders.

Specifically, I have nothing but good to say about pglesslog and its authors.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
> In general, IMHO, I don't think it's a good direction to go in to
> include links to works of other copyright holders.

I think it's a great idea. IMHO, one of the major selling points of PostgreSQL is its awesome documentation. However, one of its weaknesses is that contrib modules, pgfoundry projects, etc. are often not mentioned in the parts of the main documentation to which they relate. While I certainly don't want to go in the direction of telling people "Don't worry about the fact that we handle X poorly because there is a 5-year old, unmaintained pgfoundry module that fixes it", giving people references to tools that the community thinks are good and useful seems very helpful to me.

I am completely mystified as to what linking to "other copyright holders" has to do with it. That seems to imply that you fear some sort of legal entanglement, but I can't imagine what it could possibly be. Admittedly, IANAL.

...Robert
Bruce Momjian wrote:
> In thinking about how to communicate to users about reducing continuous
> archiving storage requirements, I realized we don't mention pglesslog in
> our official documentation.
>
> The attached patch documents how to use pglesslog and gzip/gunzip to
> reduce storage requirements. Comments?
>
> Also, I assume pg_lesslog removes the padding we use to make all WAL
> files 16MB, effectively doing the function of clearxlogtail too, right?

Applied.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
Hi,

I have no intention of making pglesslog conflict with the PostgreSQL license. Any advice is welcome on how to make pglesslog available without any license concerns.

I have a question and some ideas.

Bruce's modification points directly to my pgfoundry page. I'm not sure what that means. Does it mean that I have to maintain the page for a while? If pglesslog helps in future releases, can it become part of the PostgreSQL release as a contrib module, so that all the documentation on pgfoundry (although very simple) is included in the release material?

As many hackers know, I've posted other code to speed up PITR after slipping FPW, which works with 8.3 as an external module (pg_readahead). I'm now working on making it work with synchronous replication. Maybe it's a good idea to use pglesslog with pg_readahead. Although I'm not sure whether pg_readahead's integration with synchronous replication will be done within the 8.4 development period, I'm quite ready to post pg_readahead for 8.4, similar to the one for 8.3, which could also be a contrib module.

Looking forward to your input.

--
------
Koichi Suzuki
On Tue, 2009-01-13 at 13:21 +0900, Koichi Suzuki wrote:

> I have no intention of making pglesslog conflict with the PostgreSQL
> license. Any advice is welcome on how to make pglesslog available
> without any license concerns.

I understand. No part of my comments was directed against you or your work.

> I have a question and some ideas.
>
> Bruce's modification points directly to my pgfoundry page. I'm not
> sure what that means. Does it mean that I have to maintain the page for
> a while? If pglesslog helps in future releases, can it become part of
> the PostgreSQL release as a contrib module, so that all the documentation
> on pgfoundry (although very simple) is included in the release material?

I think it would be better to create a Wiki page that is directly controlled by the project, which describes additions to PITR (or other aspects of the project) and contains links. I think everyone accepts that the Wiki can have off-project links.

That way people can submit their work without needing to make off-project links permanent from the docs. If people then change their site content in future we can more easily change the link. For example, Josh can make contributions there as well.

--
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
Koichi Suzuki wrote:
> Hi,
>
> I have no intention of making pglesslog conflict with the PostgreSQL
> license. Any advice is welcome on how to make pglesslog available
> without any license concerns.

I certainly have no concerns.

> I have a question and some ideas.
>
> Bruce's modification points directly to my pgfoundry page. I'm not
> sure what that means. Does it mean that I have to maintain the page for
> a while? If pglesslog helps in future releases, can it become part of
> the PostgreSQL release as a contrib module, so that all the documentation
> on pgfoundry (although very simple) is included in the release material?

I think eventually we should put pglesslog into /contrib, and if we ever do that, we would update your web page. I have not heard any mention of it being moved into /contrib for 8.4 though.

If you would like me to point to another URL, please let me know.

I think there is definitely demand for pglesslog because not only does it truncate dead space from the WAL file, it also removes full page write images, and is best done in archive_command, and hence externally like your tool does.

> As many hackers know, I've posted other code to speed up PITR after
> slipping FPW, which works with 8.3 as an external module
> (pg_readahead). I'm now working on making it work with synchronous
> replication. Maybe it's a good idea to use pglesslog with
> pg_readahead. Although I'm not sure whether pg_readahead's integration
> with synchronous replication will be done within the 8.4 development
> period, I'm quite ready to post pg_readahead for 8.4, similar to the
> one for 8.3, which could also be a contrib module.

Sorry, I don't know enough about pg_readahead.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
pg_readahead is a tool that prefetches data pages before redo, based on the contents of the archived/active WAL segment. For 8.3 and 8.4 without sync rep, it works together with restore_command. pg_readahead analyzes a WAL segment, then schedules and issues posix_fadvise() calls to prefetch data pages quickly before redo.

Discussion and materials can be found at

http://archives.postgresql.org/pgsql-hackers/2008-10/msg01372.php

So far, the external command implementation speeds up PITR by up to six times! Therefore, overall recovery time can be a little longer than that with FPW.

--
------
Koichi Suzuki
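The prefetch step described above (collect the data pages a WAL segment refers to, schedule them, and issue posix_fadvise() hints) can be sketched roughly as below. This is not pg_readahead's code: the function name, the 8 kB block size, and the input format of (path, block number) pairs are assumptions for illustration, and the WAL parsing that would produce those pairs is left out entirely.

```python
import os

BLOCK_SIZE = 8192  # assumed page size for this sketch (PostgreSQL's default)


def prefetch_blocks(block_refs):
    """Ask the kernel to read the given (path, block_number) pairs ahead of use.

    Duplicate references are dropped and the hints are issued in file and
    block order. Returns the number of hints issued (0 on platforms that
    lack posix_fadvise).
    """
    if not hasattr(os, "posix_fadvise"):
        return 0
    by_file = {}
    for path, block in set(block_refs):
        by_file.setdefault(path, []).append(block)
    issued = 0
    for path in sorted(by_file):
        fd = os.open(path, os.O_RDONLY)
        try:
            for block in sorted(by_file[path]):
                # WILLNEED starts an asynchronous read into the page cache.
                os.posix_fadvise(fd, block * BLOCK_SIZE, BLOCK_SIZE,
                                 os.POSIX_FADV_WILLNEED)
                issued += 1
        finally:
            os.close(fd)
    return issued
```

The hints are advisory and return immediately, so redo that later touches the same pages is more likely to find them already in the operating system's cache.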
Koichi Suzuki wrote:
> pg_readahead is a tool that prefetches data pages before redo, based on
> the contents of the archived/active WAL segment. For 8.3 and 8.4 without
> sync rep, it works together with restore_command. pg_readahead
> analyzes a WAL segment, then schedules and issues posix_fadvise() calls
> to prefetch data pages quickly before redo.
>
> Discussion and materials can be found at
>
> http://archives.postgresql.org/pgsql-hackers/2008-10/msg01372.php
>
> So far, the external command implementation speeds up PITR by up to six
> times! Therefore, overall recovery time can be a little longer than
> that with FPW.

Now that 8.4 is using fsync, sounds like something that should be integrated into the core code, rather than as a /contrib.

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com
pg_readahead uses posix_fadvise, which is included in Greg's patch, and I've already posted a pg_readahead patch integrated into the core. Integration with sync rep will be a separate patch, which will be posted in a couple of days.

--
------
Koichi Suzuki