Home > mailing lists

Re: Hot standby, recovery infra - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Hot standby, recovery infra
Date	January 28, 2009 08:48:42
Msg-id	1233146554.2327.2343.camel@ebony.2ndQuadrant Whole thread Raw
In response to	Hot standby, recovery infra (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses	Re: Hot standby, recovery infra
List	pgsql-hackers

Tree view

On Wed, 2009-01-28 at 12:04 +0200, Heikki Linnakangas wrote:
> I've been reviewing and massaging the so called recovery infra patch.

Thanks.

> I feel quite good about this patch now. Given the amount of code
> churn, it requires testing, and I'll read it through one more time
> after sleeping over it. 

There's nothing major I feel we should discuss.

The way restartpoints happen is a useful improvement, thanks.

> Simon, do you see anything wrong with this?

Few minor points

* I think we are now renaming the recovery.conf file too early. The
comment says "We have already restored all the WAL segments we need from
the archive, and we trust that they are not going to go away even if we
crash." We have, but the files overwrite each other as they arrive, so
if the last restartpoint is not in the last restored WAL file then it
will only exist in the archive. The recovery.conf is the only place
where we store the information on where the archive is and how to access
it, so by renaming it out of the way we will be unable to crash recover
until the first checkpoint is complete. So the way this was in the
original patch is the correct way to go, AFAICS.

* my original intention was to deprecate log_restartpoints and would
still like to do so. log_checkpoints does just as well for that. Even
less code than before...

* comment on BgWriterShmemInit() refers to CHECKPOINT_IS_STARTUP, but
the actual define is CHECKPOINT_STARTUP. Would prefer the "is" version
because it matches the IS_SHUTDOWN naming.

* In CreateCheckpoint() the if test on TruncateSubtrans() has been
removed, but the comment has not been changed (to explain why).

* PG_CONTROL_VERSION bump should be just one increment, to 844. I
deliberately had it higher to help spot mismatches earlier, and to avoid
needless patch conflicts.

So it looks pretty much ready for commit very soon.

We should continue to measure performance of recovery in the light of
these changes. I still feel that fsyncing the control file on each
XLogFileRead() will give a noticeable performance penalty, mostly
because we know doing exactly the same thing in normal running caused a
performance penalty. But that is easily changed and cannot be done with
any certainty without wider feedback, so no reason to delay code commit.

-- Simon Riggs           www.2ndQuadrant.comPostgreSQL Training, Services and Support

pgsql-hackers by date:

From: Stephen Frost
Date: 28 January 2009, 07:54:54
Subject: Re: pg_upgrade project status

From: Gregory Stark
Date: 28 January 2009, 08:56:51
Subject: Re: posix_fadvise v22

Re: Hot standby, recovery infra - Mailing list pgsql-hackers

Previous

Next