On 11/02/2011 05:48 PM, Simon Riggs wrote:
> On Wed, Nov 2, 2011 at 6:27 PM, Robert Haas<robertmhaas@gmail.com> wrote:
>
>
>> Again, it's no longer the maximum time between automatic checkpoints.
>>
> You're missing the point that it never was like that. I've not altered
> the situation.
>
Robert's point is more that the existing docs are already broken; this
new patch can only widen the gap between reality and the documentation.
Before, the only people who ran into this had zero activity on the
server, which meant there wasn't any data to be lost, either. Now it's
potentially broader than that.

With some trivial checkpoints containing a small amount of data skipped
now, aren't there some cases where less WAL data will be written than
before? In that case, the user-visible behavior here would be
different. I'd be most concerned about the file-based log shipping case.

I don't think there's any change needed to the "Write Ahead Log" section
of the "Server Configuration" chapter. In the "Reliability and the
Write-Ahead Log" chapter, this text in "WAL Configuration" was already
highlighted as the problem here:

  The server's background writer process automatically performs a
  checkpoint every so often. A checkpoint is created every
  checkpoint_segments log segments, or every checkpoint_timeout seconds,
  whichever comes first. The default settings are 3 segments and 300
  seconds (5 minutes), respectively. It is also possible to force a
  checkpoint by using the SQL command CHECKPOINT.

I think this needs a change like the following, to address the hole that
was already in the docs and cover the new behavior too; this goes just
before "It is also possible to force...":

  In cases where there are few or no writes to the WAL, checkpoints
  will be skipped even if checkpoint_timeout has passed. At least one new
  WAL segment must have been created before an automatic checkpoint
  occurs. The timing of checkpoints and the creation of new WAL segments
  are not related in any other way. If file-based WAL shipping is being
  used and you want to bound how often files are sent to the standby
  server in order to reduce potential data loss, you should adjust the
  archive_timeout parameter rather than the checkpoint ones.

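To make the rule in that wording concrete, here is a rough Python sketch
of the decision as the text describes it. This is only a toy model of
the documented behavior, not the actual server code; the function name
and its parameters are made up for illustration, and the defaults are
the 300 seconds and 3 segments quoted from the docs above.

```python
def automatic_checkpoint_due(seconds_since_last, segments_since_last,
                             checkpoint_timeout=300, checkpoint_segments=3):
    """Toy model of the proposed doc wording; hypothetical, not PostgreSQL code."""
    # Segment-driven checkpoint: checkpoint_segments log segments
    # have been filled since the last checkpoint.
    if segments_since_last >= checkpoint_segments:
        return True
    # Time-driven checkpoint: checkpoint_timeout has passed, but the
    # checkpoint is skipped unless at least one new WAL segment has
    # been created since the last one.
    if seconds_since_last >= checkpoint_timeout:
        return segments_since_last >= 1
    return False
```

In this model an idle server never checkpoints on the timer alone,
which is exactly why the timeout can't be read as a bound on how often
WAL files reach a standby.
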
This area is a confusing one, so some clarification of the relation
between checkpoints and replication is a useful docs improvement.
--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us