Thread: checkpoint_timeout and archive_timeout
I have a database that only receives updates maybe a few times a week.
It is the storage for some web content where the content is not changed
very often. I set archive_timeout to 5 minutes, because when an update
is made I would like the log that contains that change to get archived
fairly soon, and I'm OK with having 16MB of log for each update, or
sequence of closely timed updates.

However, archive_timeout and checkpoint_timeout have a pernicious
interaction. Each one individually suppresses needless operations,
i.e. not checkpointing if no WAL was written since last checkpoint,
and not log-switching if no WAL was written since the last log-switch.
But in combination, a checkpoint writes WAL that triggers a log-switch,
and a log-switch writes WAL that triggers a checkpoint. So a server
that is completely free of user activity will still generate an endless
stream of WAL files, averaging one file per max(archive_timeout,
checkpoint_timeout). That comes out to one 16MB file per hour (since it
is not possible to set checkpoint_timeout > 1h) which seems a bit much
when absolutely no user-data changes are occurring.

Other than running a streaming standby server, which is rather
excessive for such a simple use case, is there some way to avoid this?
Ideally I could just set checkpoint_timeout to 0 (meaning infinity),
but that is not an allowed setting.

(This is 9.1.4, but the same behavior applies to other versions,
including 9.2beta.)

Thanks,

Jeff
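
For concreteness, the configuration being described amounts to
something like the postgresql.conf excerpt below; apart from the
5-minute archive_timeout, the values and the archive_command are
assumed for illustration rather than taken from Jeff's setup:

wal_level = archive
archive_mode = on
archive_command = 'cp %p /mnt/archive/%f'   # assumed for illustration
archive_timeout = 5min       # force a segment switch soon after any activity
checkpoint_timeout = 1h      # the maximum allowed; 0 ("never") is rejected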
Jeff Janes <jeff.janes@gmail.com> wrote:

> archive_timeout and checkpoint_timeout have a pernicious
> interaction. Each one individually suppresses needless
> operations, i.e. not checkpointing if no WAL was written since
> last checkpoint, and not log-switching if no WAL was written since
> the last log-switch. But in combination, a checkpoint writes WAL
> that triggers a log-switch, and a log-switch writes WAL that
> triggers a checkpoint. So a server that is completely free of
> user activity will still generate an endless stream of WAL files,
> averaging one file per max(archive_timeout, checkpoint_timeout).
> That comes out to one 16MB file per hour (since it is not possible
> to set checkpoint_timeout > 1h) which seems a bit much when
> absolutely no user-data changes are occurring.

Tom actually had a patch to change that, but at the time it would have
left those of us using PITR backup techniques with no good way to
detect certain types of failures. We had a monitoring system that
looked at the last checkpoint time on the warm standby to detect when
it was "stale", and nobody could suggest a good alternative at the
time. So Tom backed off on that patch. Now that there are
alternatives, it may be time to take another look at that.

BTW, that's also why I wrote the pg_clearxlogtail utility (source code
on pgfoundry). We pipe our archives through that and gzip which
changes this to an endless stream of 16KB files. Those three orders of
magnitude can make all the difference. :-)

-Kevin
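
In its simplest form, the pipeline Kevin describes amounts to an
archive_command like the one below (a deliberately naive sketch with an
assumed archive path and no error handling; the hardened script he
actually uses appears later in the thread). pg_clearxlogtail zeroes the
unused tail of a switched-out segment, so gzip then sees mostly zeros:

archive_command = 'pg_clearxlogtail < %p | gzip > /mnt/archive/%f.gz'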
On Thu, Aug 16, 2012 at 9:30 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Jeff Janes <jeff.janes@gmail.com> wrote:
>
>> So a server that is completely free of
>> user activity will still generate an endless stream of WAL files,
>> averaging one file per max(archive_timeout, checkpoint_timeout).
>> That comes out to one 16MB file per hour (since it is not possible
>> to set checkpoint_timeout > 1h) which seems a bit much when
>> absolutely no user-data changes are occurring.
> ...
>
> BTW, that's also why I wrote the pg_clearxlogtail utility (source
> code on pgfoundry). We pipe our archives through that and gzip
> which changes this to an endless stream of 16KB files. Those three
> orders of magnitude can make all the difference. :-)

Thanks. Do you put the clearxlogtail and the gzip into the
archive_command, or just do a simple copy into the archive and then
have a cron job do the processing in the archives later? I'm not
really sure what the failure modes are for having pipelines built into
the archive_command.

Thanks,

Jeff
Jeff Janes wrote:
> Kevin Grittner wrote:
>> Jeff Janes wrote:
>>> So a server that is completely free of user activity will still
>>> generate an endless stream of WAL files, averaging one file per
>>> max(archive_timeout, checkpoint_timeout). That comes out to one
>>> 16MB file per hour (since it is not possible to set
>>> checkpoint_timeout > 1h) which seems a bit much when absolutely
>>> no user-data changes are occurring.
>> BTW, that's also why I wrote the pg_clearxlogtail utility (source
>> code on pgfoundry). We pipe our archives through that and gzip
>> which changes this to an endless stream of 16KB files. Those three
>> orders of magnitude can make all the difference. :-)
>
> Thanks. Do you put the clearxlogtail and the gzip into the
> archive_command, or just do a simple copy into the archive and then
> have a cron job do the processing in the archives later? I'm not
> really sure what the failure modes are for having pipelines built
> into the archive_command.

We pipe the file into pg_clearxlogtail | gzip and pipe it out to the
archive directory (with a ".gz" suffix), rather than using cp and
processing it later. Well, actually, we pipe it to a directory on the
same mount point as the archive directory and mv it into place, as
part of our scheme to avoid problems with partial files.

-Kevin
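
The pattern Kevin describes (compress to a temporary file on the same
mount point, then mv it into place so a reader never sees a partially
written archive file) boils down to a sketch like this, with assumed
directory names:

walpath=$1                              # %p from archive_command
walfile=$2                              # %f from archive_command
tmpfile=/mnt/archive-tmp/$walfile.gz    # same filesystem as the archive
destfile=/mnt/archive/$walfile.gz

pg_clearxlogtail < "$walpath" | gzip > "$tmpfile" \
  && mv "$tmpfile" "$destfile"          # mv within one filesystem is atomic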
On Tue, Sep 11, 2012 at 5:36 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Jeff Janes wrote:
>> Kevin Grittner wrote:
>
>>> BTW, that's also why I wrote the pg_clearxlogtail utility (source
>>> code on pgfoundry). We pipe our archives through that and gzip
>>> which changes this to an endless stream of 16KB files. Those three
>>> orders of magnitude can make all the difference. :-)
>>
>> Thanks. Do you put the clearxlogtail and the gzip into the
>> archive_command, or just do a simple copy into the archive and then
>> have a cron job do the processing in the archives later? I'm not
>> really sure what the failure modes are for having pipelines built
>> into the archive_command.
>
> We pipe the file into pg_clearxlogtail | gzip and pipe it out to the
> archive directory (with a ".gz" suffix), rather than using cp and
> processing it later. Well, actually, we pipe it to a directory on
> the same mount point as the archive directory and mv it into place,
> as part of our scheme to avoid problems with partial files.

Do you have an example of that which you could share?

I've run into two problems I'm trying to overcome. One is that
pg_clearxlogtail fails on file formats it doesn't recognize but is
asked to archive anyway, such as
'000000010000000200000065.00000020.backup', for example. Perhaps it
could just issue a warning and then pass the unrecognized file through
unchanged, instead of bailing out with a fatal error.

The other is that a pipeline in bash reports success even if an
interior member of it failed. I think I know how to fix that under an
actual shell script (as opposed to a pipeline stuffed into
"archive_command"), but would like to see how other people have dealt
with it.

Thanks,

Jeff
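
For what it's worth, the usual bash answer to the pipeline problem Jeff
mentions is set -o pipefail (or inspecting the PIPESTATUS array stage
by stage). A minimal sketch, with assumed paths:

#!/bin/bash
# Without pipefail, only gzip's exit status would decide success, even
# if pg_clearxlogtail failed earlier in the pipeline.
set -o pipefail
walpath=$1      # %p from archive_command
walfile=$2      # %f from archive_command
pg_clearxlogtail < "$walpath" | gzip > "/mnt/archive/$walfile.gz"
# The script's exit status (the pipeline's, under pipefail) tells the
# archiver whether this segment must be retried.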
Jeff Janes <jeff.janes@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> We pipe the file into pg_clearxlogtail | gzip and pipe it out to
>> the archive directory (with a ".gz" suffix), rather than using cp
>> and processing it later. Well, actually, we pipe it to a
>> directory on the same mount point as the archive directory and mv
>> it into place, as part of our scheme to avoid problems with
>> partial files.
>
> Do you have an example of that which you could share?
>
> I've run into two problems I'm trying to overcome. One is that
> pg_clearxlogtail fails on file formats it doesn't recognize but is
> asked to archive anyway, such as
> '000000010000000200000065.00000020.backup', for example. Perhaps
> it could just issue a warning and then pass the unrecognized file
> through unchanged, instead of bailing out with a fatal error.
>
> The other is that a pipeline in bash reports success even if an
> interior member of it failed. I think I know how to fix that
> under an actual shell script (as opposed to a pipeline stuffed
> into "archive_command"), but would like to see how other people
> have dealt with it.

I make no claims that this couldn't be improved upon, but it has been
humming along without any problems without modification for about five
years now on 80-some machines. Our archive_command is just a call to a
bash script that looks like this:

walPathname=$1
walFilename=$2
tempfile=/var/pgsql/data/wal-files-temp/$walFilename.gz
targetfile=/var/pgsql/data/wal-files/$walFilename.gz
errfile=/home/ccsa/archive.err
emailfile=/home/ccsa/$walFilename.rpt

# Record this run's values for inclusion in any failure email.
cat <<END > runtime
walPathname=$walPathname
walFilename=$walFilename
tempfile=$tempfile
targetfile=$targetfile
errfile=$errfile
emailfile=$emailfile
END

if [ -f $targetfile ] ; then
  echo "Cannot copy $walPathname to $targetfile \
because it already exists" > $errfile
  archivereturnval=3
else
  # Only ordinary WAL segments (24 hex digits) go through
  # pg_clearxlogtail; .backup and .history files are just gzipped.
  if echo $walFilename | grep -Eq '^[0-9A-F]{24}$' - ; then
    pg_clearxlogtail < $walPathname 2> $errfile \
      | gzip > $tempfile 2>> $errfile \
      && mv $tempfile $targetfile 2>> $errfile
    archivereturnval=$?
  else
    gzip < $walPathname > $tempfile 2> $errfile \
      && mv $tempfile $targetfile 2>> $errfile
    archivereturnval=$?
  fi
fi

if [ "$archivereturnval" -eq "0" ] ; then
  # Successful archive.
  # Delete any email file from a possible prior failure.
  rm -f $emailfile $errfile
  exit 0
fi

# Failed to archive.
if [ ! -f $emailfile ] ; then
  # No existing email file -- make one and send it.
  echo "date: $(date +%F_%H%M)" > $emailfile
  echo "to: dba.group@wicourts.gov" >> $emailfile
  echo "subject: Problem with archiving $walFilename" >> $emailfile
  echo " " >> $emailfile
  echo "$archivereturnval return code from archive" >> $emailfile
  echo " " >> $emailfile
  cat $errfile >> $emailfile
  echo " " >> $emailfile
  cat runtime >> $emailfile
  /usr/sbin/sendmail -t < $emailfile
fi

cat $errfile > /dev/stderr
exit $archivereturnval

-Kevin
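
For completeness, a script like this would presumably be hooked in with
something along these lines; the path here is an assumption, and only
the %p %f argument order is implied by the script itself:

archive_command = '/var/pgsql/bin/archivewal.sh %p %f'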