Thread: Checkpoints occur too frequently

Checkpoints occur too frequently

From: Simon Riggs
Date: 14 December 2004, 22:09:46

Recent OSDL test reports for 8.0RC1 show that checkpoints occur too
frequently for higher settings of checkpoint_segments and
checkpoint_timeout. This contradicts both the manual and, from examining
the code, the intent of the current implementation.

Early tests showed that checkpoints were occurring too frequently.
checkpoint_timeout was set to 600 and 1800 in two separate tests, yet
checkpoints occurred roughly every 6m 15s. checkpoint_segments was set to
8192 for the last test, yet checkpoints did not occur any less
frequently. The presence of manual CHECKPOINTs has been ruled out by the
test author, Mark Wong.

DEBUG1 messages showed that there is an apparent limit of 255 xlog files
per checkpoint - this cannot be just a reporting bug since the
checkpoint timings are simply not increasing as requested by the
parameter settings. Many checkpoints occur with a spread of values from
0..255, though in one run there are 18 consecutive checkpoints with
exactly 255 files. Across the whole test there were 48 checkpoints, of
which 25 checkpoints recycled exactly 255 files and all others recycled
fewer. This situation could occur by chance, but only with an extremely low
probability, which I estimate to be on the order of 1 in 1 million or
less. Even a single test result higher than 255 would disprove the limit,
but none are available (anyone?).

Results referred to above are shown here:
http://www.osdl.org/projects/dbt2dev/results/dev4-010/207/db/log

checkpoint_segments is limited to the range 0..INT_MAX in the code, so
should not be limited to only 255 files, which, dare I say, is
suspiciously one byte's worth of files.

Not only is this a bug, but it limits high-end performance for those who
would wish to set those parameters higher.

Brief examination of the code reveals no explanation for this
observation, so I'm raising it for general investigation.

Happy for someone to explain how to make checkpoints occur less
frequently... (apart from don't write to the database).

--
Best Regards, Simon Riggs

Re: Checkpoints occur too frequently

From: Tom Lane
Date: 14 December 2004, 23:31:37

Simon Riggs <simon@2ndquadrant.com> writes:
> DEBUG1 messages showed that there is an apparent limit of 255 xlog files
> per checkpoint -

The volume-based checkpoint trigger code is

                if (IsUnderPostmaster &&
                    (openLogId != RedoRecPtr.xlogid ||
                     openLogSeg >= (RedoRecPtr.xrecoff / XLogSegSize) +
                     (uint32) CheckPointSegments))
                {
#ifdef WAL_DEBUG
                    if (XLOG_DEBUG)
                        elog(LOG, "time for a checkpoint, signaling bgwriter");
#endif
                    RequestCheckpoint(false);
                }

which now that I look at it obviously forces a checkpoint whenever
xlogid (the upper half of XLogRecPtr) changes, ie every 4GB of WAL
output.  I suppose on a high-performance platform it's possible that
one would want checkpoints further apart than that, though the idea
of plowing through multiple gigabytes of WAL in order to recover from
a crash is a bit daunting.

It's not immediately obvious how to recast the comparison without
either creating overflow bugs or depending on 64-bit-int arithmetic
being available.  Thoughts?

            regards, tom lane
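
To see how the quoted condition produces the 255-file ceiling, here is a minimal standalone sketch rather than the actual xlog.c code. It assumes the default 16 MB segment size and that one xlogid holds 0xffffffff / XLogSegSize segments; the names checkpoint_due and segments_until_checkpoint are hypothetical, and the IsUnderPostmaster test is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the default 16 MB WAL segment size. */
#define XLOG_SEG_SIZE ((uint32_t) (16 * 1024 * 1024))
/* Assumption: whole segments addressable by a 32-bit byte offset (255). */
#define XLOG_SEGS_PER_FILE (UINT32_C(0xffffffff) / XLOG_SEG_SIZE)

/* Same shape as the quoted trigger condition. */
static bool
checkpoint_due(uint32_t open_log_id, uint32_t open_log_seg,
               uint32_t redo_xlogid, uint32_t redo_xrecoff,
               uint32_t checkpoint_segments)
{
    return open_log_id != redo_xlogid ||
           open_log_seg >= (redo_xrecoff / XLOG_SEG_SIZE) + checkpoint_segments;
}

/*
 * Count how many segment switches happen after a checkpoint whose redo
 * pointer is (redo_xlogid, redo_xrecoff) before the trigger fires.
 */
static uint32_t
segments_until_checkpoint(uint32_t redo_xlogid, uint32_t redo_xrecoff,
                          uint32_t checkpoint_segments)
{
    uint32_t log_id = redo_xlogid;
    uint32_t log_seg = redo_xrecoff / XLOG_SEG_SIZE;
    uint32_t count = 0;

    /* The redo pointer must lie inside a valid segment of its log file. */
    assert(log_seg < XLOG_SEGS_PER_FILE);

    while (!checkpoint_due(log_id, log_seg, redo_xlogid, redo_xrecoff,
                           checkpoint_segments))
    {
        count++;
        if (++log_seg >= XLOG_SEGS_PER_FILE)
        {
            /* Rolling into the next xlogid is what forces the checkpoint. */
            log_seg = 0;
            log_id++;
        }
    }
    return count;
}
```

Under these assumptions, segments_until_checkpoint(0, 0, 8192) yields 255 despite the much larger setting, and a redo pointer 100 segments into the log file yields 155. That is consistent with the reported spread of values up to 255 and with runs of exactly 255 once each checkpoint starts at an xlogid boundary (255 x 16 MB is just under 4GB).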

Re: Checkpoints occur too frequently

From: Simon Riggs
Date: 15 December 2004, 00:27:12

On Tue, 2004-12-14 at 23:31, Tom Lane wrote:
> Simon Riggs <simon@2ndquadrant.com> writes:
> > DEBUG1 messages showed that there is an apparent limit of 255 xlog files
> > per checkpoint -
>
> The volume-based checkpoint trigger code is
>
>                 if (IsUnderPostmaster &&
>                     (openLogId != RedoRecPtr.xlogid ||
>                      openLogSeg >= (RedoRecPtr.xrecoff / XLogSegSize) +
>                      (uint32) CheckPointSegments))
>                 {
> #ifdef WAL_DEBUG
>                     if (XLOG_DEBUG)
>                         elog(LOG, "time for a checkpoint, signaling bgwriter");
> #endif
>                     RequestCheckpoint(false);
>                 }
>
> which now that I look at it obviously forces a checkpoint whenever
> xlogid (the upper half of XLogRecPtr) changes, ie every 4GB of WAL
> output.  I suppose on a high-performance platform it's possible that
> one would want checkpoints further apart than that, though the idea
> of plowing through multiple gigabytes of WAL in order to recover from
> a crash is a bit daunting.
>
> It's not immediately obvious how to recast the comparison without
> either creating overflow bugs or depending on 64-bit-int arithmetic
> being available.  Thoughts?

Thanks for finding it. It was staring me in the face.

I'd say no code changes for 8.0, now we know what's causing it. A doc
patch to show the limit is probably just going to annoy the translators
at this stage also.

Reasons:
- you can recompile using larger XLogSegSize, if you care to
- the real answer is to reduce the xlog volume

--
Best Regards, Simon Riggs

Re: Checkpoints occur too frequently

From: Tom Lane
Date: 15 December 2004, 03:58:09

Simon Riggs <simon@2ndquadrant.com> writes:
> I'd say no code changes for 8.0, now we know what's causing it. A doc
> patch to show the limit is probably just going to annoy the translators
> at this stage also.

We could adjust guc.c to limit checkpoint_segments to the range 1..255
without having to touch any translatable strings.  This isn't a
necessary change but it seems harmless ... any objections?

            regards, tom lane

Re: Checkpoints occur too frequently

From: Tom Lane
Date: 17 December 2004, 00:13:19

Tom Lane <tgl@sss.pgh.pa.us> writes:
> Simon Riggs <simon@2ndquadrant.com> writes:
>> I'd say no code changes for 8.0, now we know what's causing it. A doc
>> patch to show the limit is probably just going to annoy the translators
>> at this stage also.

> We could adjust guc.c to limit checkpoint_segments to the range 1..255
> without having to touch any translatable strings.  This isn't a
> necessary change but it seems harmless ... any objections?

Or we could just fix it.  After thinking a bit more, I realized that
it's not hard to push the forced-checkpoint boundary out to 2^32
segments instead of 255.  That should be enough to still any complaints.

            regards, tom lane
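
The message above does not show the fix itself. As an illustration only, and not necessarily what was committed, one way to recast the comparison using nothing wider than 32-bit unsigned arithmetic is to count segments elapsed since the redo pointer modulo 2^32. The sketch assumes the default 16 MB segment size, 255 segments per xlogid, and that the check runs at every segment switch; the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: the default 16 MB WAL segment size. */
#define XLOG_SEG_SIZE ((uint32_t) (16 * 1024 * 1024))
/* Assumption: 255 segments per xlogid, as in the discussion above. */
#define XLOG_SEGS_PER_FILE (UINT32_C(0xffffffff) / XLOG_SEG_SIZE)

/*
 * Segments elapsed since the redo pointer, modulo 2^32.  Unsigned
 * subtraction wraps, so the result is still right when open_log_seg is
 * smaller than the redo segment but open_log_id is larger.
 */
static uint32_t
segments_since_redo(uint32_t open_log_id, uint32_t open_log_seg,
                    uint32_t redo_xlogid, uint32_t redo_xrecoff)
{
    uint32_t redo_seg = redo_xrecoff / XLOG_SEG_SIZE;

    return (uint32_t) ((open_log_id - redo_xlogid) * XLOG_SEGS_PER_FILE +
                       (open_log_seg - redo_seg));
}

/* Volume-based trigger that no longer fires merely because xlogid changed. */
static bool
checkpoint_due_recast(uint32_t open_log_id, uint32_t open_log_seg,
                      uint32_t redo_xlogid, uint32_t redo_xrecoff,
                      uint32_t checkpoint_segments)
{
    /* A zero distance means a wrapped or corrupt position when ids differ. */
    assert(open_log_id == redo_xlogid ||
           segments_since_redo(open_log_id, open_log_seg,
                               redo_xlogid, redo_xrecoff) != 0);

    return segments_since_redo(open_log_id, open_log_seg,
                               redo_xlogid, redo_xrecoff)
           >= checkpoint_segments;
}
```

Because the counter grows by exactly one per segment switch, it reaches any checkpoint_segments value below 2^32 before it can wrap, so in this sketch the effective ceiling moves from 255 segments to roughly 2^32. With checkpoint_segments = 8192, crossing into xlogid 1 no longer triggers a checkpoint; the trigger fires at xlogid 32, segment 32 (32 x 255 + 32 = 8192).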

Re: Checkpoints occur too frequently

From: Simon Riggs
Date: 18 December 2004, 20:06:25

On Fri, 2004-12-17 at 00:12, Tom Lane wrote:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
> > Simon Riggs <simon@2ndquadrant.com> writes:
> >> I'd say no code changes for 8.0, now we know what's causing it. A doc
> >> patch to show the limit is probably just going to annoy the translators
> >> at this stage also.
>
> > We could adjust guc.c to limit checkpoint_segments to the range 1..255
> > without having to touch any translatable strings.  This isn't a
> > necessary change but it seems harmless ... any objections?
>
> Or we could just fix it.  After thinking a bit more, I realized that
> it's not hard to push the forced-checkpoint boundary out to 2^32
> segments instead of 255.  That should be enough to still any complaints.

Sorry for the delay in replying.

Thanks for considering this further.

If it can be fixed in 8.0, that would be good. If this means any risk or
non-portability, then I would defer.

--
Best Regards, Simon Riggs