Re: Load Distributed Checkpoints, revised patch - Mailing list pgsql-patches

From Simon Riggs
Subject Re: Load Distributed Checkpoints, revised patch
Date
Msg-id 1181985016.17734.67.camel@silverbirch.site
Whole thread Raw
In response to Load Distributed Checkpoints, revised patch  (Heikki Linnakangas <heikki@enterprisedb.com>)
Responses Re: Load Distributed Checkpoints, revised patch
Re: Load Distributed Checkpoints, revised patch
List pgsql-patches
On Fri, 2007-06-15 at 11:34 +0100, Heikki Linnakangas wrote:

> - What units should we use for the new GUC variables? From
> implementation point of view, it would be simplest if
> checkpoint_write_rate is given as pages/bgwriter_delay, similarly to
> bgwriter_*_maxpages. I never liked those *_maxpages settings, though, a
> more natural unit from users perspective would be KB/s.

checkpoint_maxpages would seem like a better name; we've already had
those _maxpages settings for 3 releases, so changing that is not really
an option (at so late a stage). We don't really care about units because
the way you use it is to nudge it up a little and see if that works
etc..

Can we avoid having another parameter? There must be some protection in
there to check that a checkpoint lasts for no longer than
checkpoint_timeout, so it makes most sense to vary the checkpoint in
relation to that parameter.

> - The signaling between RequestCheckpoint and bgwriter is a bit tricky.
> Bgwriter now needs to deal immediate checkpoint requests, like those
> coming from explicit CHECKPOINT or CREATE DATABASE commands, differently
> from those triggered by checkpoint_segments. I'm afraid there might be
> race conditions when a CHECKPOINT is issued at the same instant as
> checkpoint_segments triggers one. What might happen then is that the
> checkpoint is performed lazily, spreading the writes, and the CHECKPOINT
> command has to wait for that to finish which might take a long time. I
> have not been able to convince myself neither that the race condition
> exists or that it doesn't.

Is there a mechanism for requesting immediate/non-immediate checkpoints?

pg_start_backup() should be a normal checkpoint I think. No need for
backup to be an intrusive process.

> - to coordinate the writes with with checkpoint_segments, we need to
> read the WAL insertion location. To do that, we need to acquire the
> WALInsertLock. That means that in the worst case, WALInsertLock is
> acquired every bgwriter_delay when a checkpoint is in progress. I don't
> think that's a problem, it's only held for a very short duration, but I
> thought I'd mention it.

I think that is a problem. Do we need to know it so exactly that we look
at WALInsertLock? Maybe use info_lck to request the latest page, since
that is less heavily contended and we need never wait across I/O.

> - How should we deal with changing GUC variables that affect LDC, on the
> fly when a checkpoint is in progress? The attached patch finishes the
> in-progress checkpoint ASAP, and reloads the config after that. We could
> reload the config immediately, but making the new settings effective
> immediately is not trivial.

No need to do this during a checkpoint, there'll be another along
shortly anyhow.

--
  Simon Riggs
  EnterpriseDB   http://www.enterprisedb.com



pgsql-patches by date:

Previous
From: "Jaime Casanova"
Date:
Subject: Re: Maintaining cluster order on insert
Next
From: "Simon Riggs"
Date:
Subject: Re: Load Distributed Checkpoints, revised patch