Thread: checkpoint_timeout and archive_timeout
I have a database that only receives updates maybe a few times a week.
It is the storage for some web content where the content is not changed
very often. I set archive_timeout to 5 minutes, because when an update
is made I would like the log that contains that change to get archived
fairly soon, and I'm OK with having 16MB of log for each update, or
sequence of closely timed updates.

However, archive_timeout and checkpoint_timeout have a pernicious
interaction. Each one individually suppresses needless operations,
i.e. not checkpointing if no WAL was written since last checkpoint,
and not log-switching if no WAL was written since the last log-switch.
But in combination, a checkpoint writes WAL that triggers a log-switch,
and a log-switch writes WAL that triggers a checkpoint. So a server
that is completely free of user activity will still generate an endless
stream of WAL files, averaging one file per max(archive_timeout,
checkpoint_timeout). That comes out to one 16MB file per hour (since it
is not possible to set checkpoint_timeout > 1h) which seems a bit much
when absolutely no user-data changes are occurring.

Other than running a streaming standby server, which is rather
excessive for such a simple use case, is there some way to avoid this?
Ideally I could just set checkpoint_timeout to 0 (meaning infinity),
but that is not an allowed setting.

(This is 9.1.4, but the same behavior applies to other versions,
including 9.2beta.)

Thanks,

Jeff
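
For concreteness, the configuration being described amounts to
something like the postgresql.conf excerpt below; apart from the
5-minute archive_timeout, the values and the archive_command are
assumed for illustration rather than taken from Jeff's setup:

wal_level = archive
archive_mode = on
archive_command = 'cp %p /mnt/archive/%f'   # assumed for illustration
archive_timeout = 5min       # force a segment switch soon after any activity
checkpoint_timeout = 1h      # the maximum allowed; 0 ("never") is rejected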
Jeff Janes <jeff.janes@gmail.com> wrote:

> archive_timeout and checkpoint_timeout have a pernicious
> interaction. Each one individually suppresses needless
> operations, i.e. not checkpointing if no WAL was written since
> last checkpoint, and not log-switching if no WAL was written since
> the last log-switch. But in combination, a checkpoint writes WAL
> that triggers a log-switch, and a log-switch writes WAL that
> triggers a checkpoint. So a server that is completely free of
> user activity will still generate an endless stream of WAL files,
> averaging one file per max(archive_timeout, checkpoint_timeout).
> That comes out to one 16MB file per hour (since it is not possible
> to set checkpoint_timeout > 1h) which seems a bit much when
> absolutely no user-data changes are occurring.

Tom actually had a patch to change that, but at the time it would have
left those of us using PITR backup techniques with no good way to
detect certain types of failures. We had a monitoring system that
looked at the last checkpoint time on the warm standby to detect when
it was "stale", and nobody could suggest a good alternative at the
time. So Tom backed off on that patch. Now that there are
alternatives, it may be time to take another look at that.

BTW, that's also why I wrote the pg_clearxlogtail utility (source code
on pgfoundry). We pipe our archives through that and gzip which
changes this to an endless stream of 16KB files. Those three orders of
magnitude can make all the difference. :-)

-Kevin
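
In its simplest form, the pipeline Kevin describes amounts to an
archive_command like the one below (a deliberately naive sketch with an
assumed archive path and no error handling; the hardened script he
actually uses appears later in the thread). pg_clearxlogtail zeroes the
unused tail of a switched-out segment, so gzip then sees mostly zeros:

archive_command = 'pg_clearxlogtail < %p | gzip > /mnt/archive/%f.gz'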
On Thu, Aug 16, 2012 at 9:30 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> So a server that is completely free of
>> user activity will still generate an endless stream of WAL files,
>> averaging one file per max(archive_timeout, checkpoint_timeout).
>> That comes out to one 16MB file per hour (since it is not possible
>> to set checkpoint_timeout > 1h) which seems a bit much when
>> absolutely no user-data changes are occurring.
> ...
>
> BTW, that's also why I wrote the pg_clearxlogtail utility (source
> code on pgfoundry). We pipe our archives through that and gzip
> which changes this to an endless stream of 16KB files. Those three
> orders of magnitude can make all the difference. :-)

Thanks. Do you put the clearxlogtail and the gzip into the
archive_command, or just do a simple copy into the archive and then
have a cron job do the processing in the archives later? I'm not
really sure what the failure modes are for having pipelines built into
the archive_command.

Thanks,

Jeff
Jeff Janes wrote:
> Kevin Grittner wrote:
>> Jeff Janes wrote:
>>> So a server that is completely free of user activity will still
>>> generate an endless stream of WAL files, averaging one file per
>>> max(archive_timeout, checkpoint_timeout). That comes out to one
>>> 16MB file per hour (since it is not possible to set
>>> checkpoint_timeout > 1h) which seems a bit much when absolutely
>>> no user-data changes are occurring.
>> BTW, that's also why I wrote the pg_clearxlogtail utility (source
>> code on pgfoundry). We pipe our archives through that and gzip
>> which changes this to an endless stream of 16KB files. Those three
>> orders of magnitude can make all the difference. :-)
>
> Thanks. Do you put the clearxlogtail and the gzip into the
> archive_command, or just do a simple copy into the archive and then
> have a cron job do the processing in the archives later? I'm not
> really sure what the failure modes are for having pipelines built
> into the archive_command.

We pipe the file into pg_clearxlogtail | gzip and pipe it out to the
archive directory (with a ".gz" suffix), rather than using cp and
processing it later. Well, actually, we pipe it to a directory on the
same mount point as the archive directory and mv it into place, as
part of our scheme to avoid problems with partial files.

-Kevin
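
The pattern Kevin describes (compress to a temporary file on the same
mount point, then mv it into place so a reader never sees a partially
written archive file) boils down to a sketch like this, with assumed
directory names:

walpath=$1                              # %p from archive_command
walfile=$2                              # %f from archive_command
tmpfile=/mnt/archive-tmp/$walfile.gz    # same filesystem as the archive
destfile=/mnt/archive/$walfile.gz

pg_clearxlogtail < "$walpath" | gzip > "$tmpfile" \
  && mv "$tmpfile" "$destfile"          # mv within one filesystem is atomic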
On Tue, Sep 11, 2012 at 5:36 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Jeff Janes wrote:
>> Kevin Grittner wrote:
>
>>> BTW, that's also why I wrote the pg_clearxlogtail utility (source
>>> code on pgfoundry). We pipe our archives through that and gzip
>>> which changes this to an endless stream of 16KB files. Those three
>>> orders of magnitude can make all the difference. :-)
>>
>> Thanks. Do you put the clearxlogtail and the gzip into the
>> archive_command, or just do a simple copy into the archive and then
>> have a cron job do the processing in the archives later? I'm not
>> really sure what the failure modes are for having pipelines built
>> into the archive_command.
>
> We pipe the file into pg_clearxlogtail | gzip and pipe it out to the
> archive directory (with a ".gz" suffix), rather than using cp and
> processing it later. Well, actually, we pipe it to a directory on
> the same mount point as the archive directory and mv it into place,
> as part of our scheme to avoid problems with partial files.

Do you have an example of that which you could share?

I've run into two problems I'm trying to overcome. One is that
pg_clearxlogtail fails on file formats it doesn't recognize but is
asked to archive anyway, such as
'000000010000000200000065.00000020.backup', for example. Perhaps it
could just issue a warning and then pass the unrecognized file through
unchanged, instead of bailing out with a fatal error.

The other is that a pipeline in bash reports success even if an
interior member of it failed. I think I know how to fix that under an
actual shell script (as opposed to a pipeline stuffed into
"archive_command"), but would like to see how other people have dealt
with it.

Thanks,

Jeff
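
For what it's worth, the usual bash answer to the pipeline problem Jeff
mentions is set -o pipefail (or inspecting the PIPESTATUS array stage
by stage). A minimal sketch, with assumed paths:

#!/bin/bash
# Without pipefail, only gzip's exit status would decide success, even
# if pg_clearxlogtail failed earlier in the pipeline.
set -o pipefail
walpath=$1      # %p from archive_command
walfile=$2      # %f from archive_command
pg_clearxlogtail < "$walpath" | gzip > "/mnt/archive/$walfile.gz"
# The script's exit status (the pipeline's, under pipefail) tells the
# archiver whether this segment must be retried.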
Jeff Janes <jeff.janes@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> We pipe the file into pg_clearxlogtail | gzip and pipe it out to
>> the archive directory (with a ".gz" suffix), rather than using cp
>> and processing it later. Well, actually, we pipe it to a
>> directory on the same mount point as the archive directory and mv
>> it into place, as part of our scheme to avoid problems with
>> partial files.
>
> Do you have an example of that which you could share?
>
> I've run into two problems I'm trying to overcome. One is that
> pg_clearxlogtail fails on file formats it doesn't recognize but is
> asked to archive anyway, such as
> '000000010000000200000065.00000020.backup', for example. Perhaps
> it could just issue a warning and then pass the unrecognized file
> through unchanged, instead of bailing out with a fatal error.
>
> The other is that a pipeline in bash reports success even if an
> interior member of it failed. I think I know how to fix that
> under an actual shell script (as opposed to a pipeline stuffed
> into "archive_command"), but would like to see how other people
> have dealt with it.

I make no claims that this couldn't be improved upon, but it has been
humming along without any problems without modification for about five
years now on 80-some machines. Our archive_command is just a call to a
bash script that looks like this:

walPathname=$1
walFilename=$2
tempfile=/var/pgsql/data/wal-files-temp/$walFilename.gz
targetfile=/var/pgsql/data/wal-files/$walFilename.gz
errfile=/home/ccsa/archive.err
emailfile=/home/ccsa/$walFilename.rpt

# Record this run's values for inclusion in any failure email.
cat <<END > runtime
walPathname=$walPathname
walFilename=$walFilename
tempfile=$tempfile
targetfile=$targetfile
errfile=$errfile
emailfile=$emailfile
END

if [ -f $targetfile ] ; then
  echo "Cannot copy $walPathname to $targetfile \
because it already exists" > $errfile
  archivereturnval=3
else
  # Only ordinary WAL segments (24 hex digits) go through
  # pg_clearxlogtail; .backup and .history files are just gzipped.
  if echo $walFilename | grep -Eq '^[0-9A-F]{24}$' - ; then
    pg_clearxlogtail < $walPathname 2> $errfile \
      | gzip > $tempfile 2>> $errfile \
      && mv $tempfile $targetfile 2>> $errfile
    archivereturnval=$?
  else
    gzip < $walPathname > $tempfile 2> $errfile \
      && mv $tempfile $targetfile 2>> $errfile
    archivereturnval=$?
  fi
fi

if [ "$archivereturnval" -eq "0" ] ; then
  # Successful archive.
  # Delete any email file from a possible prior failure.
  rm -f $emailfile $errfile
  exit 0
fi

# Failed to archive.
if [ ! -f $emailfile ] ; then
  # No existing email file -- make one and send it.
  echo "date: $(date +%F_%H%M)" > $emailfile
  echo "to: dba.group@wicourts.gov" >> $emailfile
  echo "subject: Problem with archiving $walFilename" >> $emailfile
  echo " " >> $emailfile
  echo "$archivereturnval return code from archive" >> $emailfile
  echo " " >> $emailfile
  cat $errfile >> $emailfile
  echo " " >> $emailfile
  cat runtime >> $emailfile
  /usr/sbin/sendmail -t < $emailfile
fi

cat $errfile > /dev/stderr
exit $archivereturnval

-Kevin
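
For completeness, a script like this would presumably be hooked in with
something along these lines; the path here is an assumption, and only
the %p %f argument order is implied by the script itself:

archive_command = '/var/pgsql/bin/archivewal.sh %p %f'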