
Re: Load Distributed Checkpoints, revised patch - Mailing list pgsql-patches

From	Simon Riggs
Subject	Re: Load Distributed Checkpoints, revised patch
Date	June 16, 2007 06:13:15
Msg-id	1181985016.17734.67.camel@silverbirch.site
In response to	Load Distributed Checkpoints, revised patch (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses	Re: Load Distributed Checkpoints, revised patch Re: Load Distributed Checkpoints, revised patch
List	pgsql-patches


On Fri, 2007-06-15 at 11:34 +0100, Heikki Linnakangas wrote:

> - What units should we use for the new GUC variables? From an
> implementation point of view, it would be simplest if
> checkpoint_write_rate is given as pages/bgwriter_delay, similarly to
> bgwriter_*_maxpages. I never liked those *_maxpages settings, though; a
> more natural unit from the user's perspective would be KB/s.

checkpoint_maxpages would seem like a better name; we've already had
those _maxpages settings for 3 releases, so changing that is not really
an option (at so late a stage). The units don't matter much, because in
practice you tune it by nudging it up a little and seeing whether that
helps.
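
For anyone who does want to think in KB/s, the conversion from pages per
bgwriter round is simple arithmetic. A minimal sketch, assuming the default
8 KB block size and the default bgwriter_delay of 200 ms, and ignoring the
time the writes themselves take (so the real rate would be somewhat lower).
The function names are made up for illustration and are not part of the
patch:

```python
import math

BLCKSZ_KB = 8  # assumed: default PostgreSQL block size (8192 bytes)


def pages_per_round_to_kb_per_sec(pages_per_round, bgwriter_delay_ms=200):
    """Upper bound on write rate if `pages_per_round` pages go out each round."""
    rounds_per_sec = 1000 / bgwriter_delay_ms
    return pages_per_round * BLCKSZ_KB * rounds_per_sec


def kb_per_sec_to_pages_per_round(kb_per_sec, bgwriter_delay_ms=200):
    """Smallest whole number of pages per round that reaches `kb_per_sec`."""
    return math.ceil(kb_per_sec * bgwriter_delay_ms / 1000 / BLCKSZ_KB)


if __name__ == "__main__":
    print(pages_per_round_to_kb_per_sec(100))   # 100 pages every 200 ms
    print(kb_per_sec_to_pages_per_round(4000))  # back to pages per round
```

Note that the same pages-per-round value means a different KB/s figure as
soon as bgwriter_delay changes, which is the main argument for a rate unit.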

Can we avoid having another parameter? There must be some protection in
there to check that a checkpoint lasts for no longer than
checkpoint_timeout, so it makes most sense to vary the checkpoint in
relation to that parameter.
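
To make the idea concrete, here is a rough sketch of pacing the writes
against checkpoint_timeout instead of a separate rate setting. It is not
taken from the patch: the function name and the target_fraction argument
(the share of checkpoint_timeout the checkpoint is allowed to use) are
assumptions made for illustration only.

```python
import math


def pages_due_this_round(total_pages, pages_written, elapsed_s,
                         checkpoint_timeout_s, target_fraction=0.5):
    """Pages to write now so the checkpoint stays on a linear schedule.

    The schedule finishes all of `total_pages` after
    `checkpoint_timeout_s * target_fraction` seconds (hypothetical knob).
    """
    deadline_s = checkpoint_timeout_s * target_fraction
    # Pages that should be on disk by now; capped once the deadline passes.
    should_have_written = min(
        total_pages, math.ceil(total_pages * elapsed_s / deadline_s))
    # Never negative: if we are ahead of schedule, write nothing this round.
    return max(0, should_have_written - pages_written)
```

If the writer is behind, the result grows until it catches up; if it is
ahead, the result is zero and the round can be spent sleeping. Past the
deadline everything that remains is due at once, which gives the protection
against a checkpoint outlasting checkpoint_timeout.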

> - The signaling between RequestCheckpoint and bgwriter is a bit tricky.
> Bgwriter now needs to deal with immediate checkpoint requests, like those
> coming from explicit CHECKPOINT or CREATE DATABASE commands, differently
> from those triggered by checkpoint_segments. I'm afraid there might be
> race conditions when a CHECKPOINT is issued at the same instant as
> checkpoint_segments triggers one. What might happen then is that the
> checkpoint is performed lazily, spreading the writes, and the CHECKPOINT
> command has to wait for that to finish which might take a long time. I
> have not been able to convince myself either that the race condition
> exists or that it doesn't.

Is there a mechanism for requesting immediate/non-immediate checkpoints?

pg_start_backup() should be a normal checkpoint I think. No need for
backup to be an intrusive process.
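
One way such a request mechanism could avoid the race is to make the
"immediate" flag sticky and have the writer re-check it every round, so a
CHECKPOINT that arrives while a lazy checkpoint is running speeds that one
up instead of waiting behind it. The sketch below only models this idea; it
is not how the patch does it, and the class and method names are invented.

```python
import threading


class CheckpointRequests:
    """Toy model of checkpoint requests shared by backends and the writer."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = False
        self._immediate = False

    def request(self, immediate):
        """Called by backends; an immediate request is never downgraded."""
        with self._lock:
            self._pending = True
            self._immediate = self._immediate or immediate

    def begin(self):
        """Called by the writer; True if a checkpoint should start now."""
        with self._lock:
            started = self._pending
            self._pending = False
            return started

    def is_immediate(self):
        """Checked by the writer each round to decide whether to throttle."""
        with self._lock:
            return self._immediate

    def finish(self):
        """Clear the flag unless another request arrived in the meantime."""
        with self._lock:
            if not self._pending:
                self._immediate = False
```

A request that arrives mid-checkpoint leaves a pending request behind, so
the following checkpoint also runs immediately. That is conservative, but
it means a waiting CHECKPOINT command is never stuck behind spread-out
writes.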

> - To coordinate the writes with checkpoint_segments, we need to
> read the WAL insertion location. To do that, we need to acquire the
> WALInsertLock. That means that in the worst case, WALInsertLock is
> acquired every bgwriter_delay when a checkpoint is in progress. I don't
> think that's a problem, it's only held for a very short duration, but I
> thought I'd mention it.

I think that is a problem. Do we need to know it so exactly that we look
at WALInsertLock? Maybe use info_lck to request the latest page, since
that is less heavily contended and we need never wait across I/O.
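
The trade-off can be pictured as two locks: a busy one that guards the
exact insert position, and a cheap one that guards a copy refreshed only
when the insert position crosses a page boundary. The sketch below is a toy
model of that arrangement, not PostgreSQL's actual data structures; the two
lock attributes merely stand in for WALInsertLock and info_lck, and the
class and method names are invented.

```python
import threading


class WalPosition:
    """Exact position under a hot lock, page-granular copy under a cheap one."""

    def __init__(self, page_size=8192):
        self.page_size = page_size
        self.insert_lock = threading.Lock()  # stands in for WALInsertLock
        self.info_lock = threading.Lock()    # stands in for info_lck
        self._exact = 0
        self._published = 0

    def insert(self, nbytes):
        """Backend path: advance the exact position, publish on page change."""
        with self.insert_lock:
            old_page = self._exact // self.page_size
            self._exact += nbytes
            new_page = self._exact // self.page_size
            if new_page != old_page:
                with self.info_lock:
                    self._published = new_page * self.page_size

    def approximate_position(self):
        """Writer path: never touches the insert lock, may lag by a page."""
        with self.info_lock:
            return self._published
```

The reader's answer can be stale by up to one page, which should be plenty
of precision for deciding how far along a checkpoint ought to be.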

> - How should we deal with changing GUC variables that affect LDC, on the
> fly when a checkpoint is in progress? The attached patch finishes the
> in-progress checkpoint ASAP, and reloads the config after that. We could
> reload the config immediately, but making the new settings effective
> immediately is not trivial.

No need to do this during a checkpoint; there'll be another along
shortly anyhow.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com
