On Thu, Oct 6, 2011 at 12:06 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Simon Riggs <simon@2ndQuadrant.com> writes:
>> The current idea is that if there has been no activity then we skip
>> checkpoint. But all it takes is a single WAL record and off we go with
>> another checkpoint. If there hasn't been much WAL activity, there is
>> not much point in having another checkpoint record since there is
>> little if any time to be saved in recovery.
>
>> So why not avoid checkpoints until we have written at least 1 WAL file
>> worth of data?
>
> +1, but I think you need to compare to the last checkpoint's REDO
> pointer, not to the position of the checkpoint record itself.
> Otherwise, the argument falls down if there was a lot of activity
> during the last checkpoint (which is not unlikely in these days of
> spread checkpoints).
>
> Also I think the comment needs more extensive revision than you gave it.
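
For concreteness, here is roughly how I read the test Tom is describing.
This is only a sketch under my own assumptions: WalPos,
WAL_SEGMENT_BYTES and can_skip_timed_checkpoint() are names I made up
for illustration, a WAL position is flattened to a plain byte offset,
and none of this is the actual xlog.c code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in: a WAL position expressed as a flat byte offset. */
typedef uint64_t WalPos;

/* One WAL file at the default 16 MB segment size (constant name is made up). */
#define WAL_SEGMENT_BYTES ((WalPos) 16 * 1024 * 1024)

/*
 * Return true if a timed checkpoint could be skipped because less than one
 * WAL file's worth of data has been written since the last checkpoint's
 * REDO pointer.  The distance is measured from the REDO pointer, not from
 * the checkpoint record, so WAL written while the previous (spread)
 * checkpoint was in progress is counted.
 */
bool
can_skip_timed_checkpoint(WalPos last_redo, WalPos current_insert)
{
    /* Defensive: a position that has not advanced means no new WAL. */
    if (current_insert <= last_redo)
        return true;

    return (current_insert - last_redo) < WAL_SEGMENT_BYTES;
}
```

With a check of this shape, a server that wrote only a few records since
the last REDO point would skip the timed checkpoint, while one that wrote
a full segment's worth during the previous checkpoint would not.
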

If we go with this approach, we presumably also need to update the
documentation, especially for checkpoint_timeout (which will no longer
be a hard limit on the time between checkpoints).

I'm not entirely sure I understand the rationale, though. I mean, if
very little has happened since the last checkpoint, then the
checkpoint will be very cheap. In the totally degenerate case Fujii
Masao is reporting, where absolutely nothing has happened, it should
be basically free. We'll loop through a whole bunch of things, decide
there's nothing to fsync, and call it a day.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company