Thread: Checkpoints vs restartpoints

Checkpoints vs restartpoints

From

Thomas Munro

Date:

09 June 2015, 23:20:26

Hi

Why do standby servers not simply treat every checkpoint as a
restartpoint?  As I understand it, setting checkpoint_timeout and
checkpoint_segments higher on a standby server effectively instructs
standby servers to skip some checkpoints.  Even with the same settings
on both servers, the standby could still choose to skip a checkpoint
near the checkpoint_timeout limit due to the vagaries of timekeeping
(though I suppose it's very unlikely).  But what could the advantage
of skipping checkpoints be?  Do people deliberately set hot standby
machines up like this to trade a longer crash recovery time for lower
write IO?

I was wondering about this in the context of the recent multixact
work, since such configurations could leave you with different SLRU
files on disk which in some versions might change the behaviour in
interesting ways.

-- 
Thomas Munro
http://www.enterprisedb.com

Re: Checkpoints vs restartpoints

From

Jeff Janes

Date:

10 June 2015, 00:20:28

On Tue, Jun 9, 2015 at 4:20 PM, Thomas Munro <thomas.munro@enterprisedb.com> wrote:

> Hi
>
> Why do standby servers not simply treat every checkpoint as a
> restartpoint? As I understand it, setting checkpoint_timeout and
> checkpoint_segments higher on a standby server effectively instructs
> standby servers to skip some checkpoints. Even with the same settings
> on both servers, the server could still choose to skip a checkpoint
> near the checkpoint_timeout limit due to the vagaries of time keeping
> (though I suppose it's very unlikely). But what could the advantage
> of skipping checkpoints be? Do people deliberately set hot standby
> machines up like this to trade a longer crash recovery time for lower
> write IO?

When a hot standby server is initially being set up using a rather old base backup and an archive directory, it could be applying WAL at a very high rate such that it would replay master checkpoints multiple times a second (when the master has long periods with little write activity and has checkpoints driven by timeouts during those periods). Actually doing restartpoints that often could be annoying. Presumably there would be few dirty buffers to write out, since each checkpoint saw little activity, but you would still have to circle the shared_buffers twice, and fsync whichever files did happen to get some changes.

Cheers,

Jeff
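
To make the catch-up scenario above concrete, here is a toy model in Python. It is only a sketch under stated assumptions, not PostgreSQL code: the function name, the 0.25 s replay spacing, and the rule "take a restartpoint only if checkpoint_timeout has elapsed since the last one" are simplifications chosen for illustration.

```python
def restartpoints_taken(replay_times, checkpoint_timeout):
    """Return the replay times at which a toy standby performs a restartpoint.

    replay_times: times (in seconds) at which master checkpoint records are
                  replayed on the standby (hypothetical input).
    checkpoint_timeout: minimum spacing between restartpoints in this model.
    """
    taken = []
    last = None
    for t in replay_times:
        # A replayed checkpoint is only a candidate; it becomes a
        # restartpoint if enough time has passed since the previous one.
        if last is None or t - last >= checkpoint_timeout:
            taken.append(t)
            last = t
    return taken


# 20 master checkpoint records replayed 0.25 s apart during archive catch-up.
replay_times = [i * 0.25 for i in range(20)]

# Treating every checkpoint as a restartpoint (timeout of zero).
every_checkpoint = restartpoints_taken(replay_times, checkpoint_timeout=0.0)

# Rate-limiting restartpoints with a 300 s timeout.
rate_limited = restartpoints_taken(replay_times, checkpoint_timeout=300.0)

print(len(every_checkpoint), len(rate_limited))
```

In this model the first variant scans and syncs for every replayed checkpoint record, while the second does so once for the whole five-second burst, which is the trade-off described above.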

Re: Checkpoints vs restartpoints

From

Bruce Momjian

Date:

10 June 2015, 00:34:03

On Tue, Jun  9, 2015 at 05:20:23PM -0700, Jeff Janes wrote:
> On Tue, Jun 9, 2015 at 4:20 PM, Thomas Munro <thomas.munro@enterprisedb.com>
> wrote:
> 
>     Hi
> 
>     Why do standby servers not simply treat every checkpoint as a
>     restartpoint?  As I understand it, setting checkpoint_timeout and
>     checkpoint_segments higher on a standby server effectively instructs
>     standby servers to skip some checkpoints.  Even with the same settings
>     on both servers, the server could still choose to skip a checkpoint
>     near the checkpoint_timeout limit due to the vagaries of time keeping
>     (though I suppose it's very unlikely).  But what could the advantage
>     of skipping checkpoints be?  Do people deliberately set hot standby
>     machines up like this to trade a longer crash recovery time for lower
>     write IO?
> 
> 
> When a hot standby server is initially being set up using a rather old base
> backup and an archive directory, it could be applying WAL at a very high rate
> such that it would replay master checkpoints multiple times a second (when the
> master has long periods with little write activity and has checkpoints driven
> by timeouts during those periods).  Actually doing restartpoints that often
> could be annoying.  Presumably there would be few dirty buffers to write out,
> since each checkpoint saw little activity, but you would still have to circle
> the shared_buffers twice, and fsync whichever files did happen to get some
> changes.

Ah, so even though standbys don't have to write WAL, they are fsyncing
shared buffers.  Where is the restart point recorded, in pg_controldata?

--  Bruce Momjian  <bruce@momjian.us>        http://momjian.us EnterpriseDB
http://enterprisedb.com
 + Everyone has their own god. +

Re: Checkpoints vs restartpoints

From

Michael Paquier

Date:

10 June 2015, 01:09:15

On Wed, Jun 10, 2015 at 9:33 AM, Bruce Momjian wrote:
> Ah, so even though standbys don't have to write WAL, they are fsyncing
> shared buffers.  Where is the restart point recorded, in pg_controldata?

Yep. It is the "Latest checkpoint's REDO location" line in the
pg_controldata output, i.e. ControlFile->checkPointCopy.redo. During
recovery, a copy is kept as well in XLogCtlData.lastCheckPoint.
-- 
Michael
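
For anyone who wants to check this on a standby, a minimal Python sketch for pulling the REDO location out of pg_controldata output follows. The helper names and the sample text (including its LSN values) are made up for illustration; only the field label comes from the message above.

```python
def redo_location(controldata_text):
    """Return the "Latest checkpoint's REDO location" value, or None."""
    for line in controldata_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Latest checkpoint's REDO location":
            return value.strip()
    return None


def lsn_to_int(lsn):
    """Convert an LSN written as 'hi/lo' (two hex numbers) to an integer."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


# Hypothetical excerpt of pg_controldata output; the values are invented.
sample = """\
Latest checkpoint location:           0/3000060
Latest checkpoint's REDO location:    0/3000028
"""

redo = redo_location(sample)
print(redo, lsn_to_int(redo))
```

Running the parser on the standby's real pg_controldata output before and after a burst of replay shows whether a restartpoint has moved the REDO location forward.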

Re: Checkpoints vs restartpoints

From

Andres Freund

Date:

10 June 2015, 08:12:31

On 2015-06-10 11:20:19 +1200, Thomas Munro wrote:
> I was wondering about this in the context of the recent multixact
> work, since such configurations could leave you with different SLRU
> files on disk which in some versions might change the behaviour in
> interesting ways.

Note that triggering a restartpoint every time a checkpoint is replayed
wouldn't realistically fix this. Restartpoints are performed in the
background (by the checkpointer), not in the startup process itself.
Performing them in the startup process would be prohibitive
performance-wise, because each checkpoint would stop replication
progress for seconds to tens of minutes.

- Andres
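
The division of labour described above can be sketched with two Python threads. This is a toy model under loud assumptions: the class, the record strings, and the sleep standing in for buffer writes and fsyncs are all invented, and PostgreSQL uses separate processes rather than threads.

```python
import threading
import time


class ToyStandby:
    def __init__(self, flush_seconds):
        self.flush_seconds = flush_seconds
        self.lock = threading.Lock()
        self.candidate = None      # most recent replayed checkpoint record
        self.restartpoints = []    # checkpoints actually used as restartpoints
        self.replayed = 0
        self.done = threading.Event()

    def startup_process(self, records):
        """Replay records; a checkpoint record is only remembered, not flushed."""
        for rec in records:
            if rec.startswith("CHECKPOINT"):
                with self.lock:
                    self.candidate = rec
            self.replayed += 1
        self.done.set()

    def checkpointer(self):
        """Background loop: turn the latest candidate into a restartpoint."""
        while True:
            finished = self.done.is_set()
            with self.lock:
                rec, self.candidate = self.candidate, None
            if rec is not None:
                time.sleep(self.flush_seconds)  # stand-in for write + fsync
                self.restartpoints.append(rec)
            elif finished:
                return
            else:
                time.sleep(0.001)


records = []
for i in range(50):
    records += ["INSERT", "INSERT", f"CHECKPOINT {i}"]

standby = ToyStandby(flush_seconds=0.01)
background = threading.Thread(target=standby.checkpointer)
background.start()
standby.startup_process(records)   # replay never waits for a flush
background.join()

print(standby.replayed, len(standby.restartpoints))
```

Because replay only records the latest candidate, checkpoints that arrive while a flush is in progress are simply superseded, so the number of restartpoints can be lower than the number of replayed checkpoints.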