
Re: Hot standby, recovery infrastructure - Mailing list pgsql-hackers

From	Simon Riggs
Subject	Re: Hot standby, recovery infrastructure
Date	January 27, 2009 11:03:53
Msg-id	1233068760.2327.2155.camel@ebony.2ndQuadrant
In response to	Hot standby, recovery infrastructure (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses	Re: Hot standby, recovery infrastructure
List	pgsql-hackers


On Tue, 2009-01-27 at 15:59 +0200, Heikki Linnakangas wrote:
> Regarding this comment:
> 
> > +   /*
> > +    * Prior to 8.4 we wrote a Shutdown Checkpoint at the end of recovery.
> > +    * This could add minutes to the startup time, so we want bgwriter
> > +    * to perform it. This then frees the Startup process to complete so we can
> > +    * allow transactions and WAL inserts. We still write a checkpoint, but
> > +    * it will be an online checkpoint. Online checkpoints have a redo
> > +    * location that can be prior to the actual checkpoint record. So we want
> > +    * to derive that redo location *before* we let anybody else write WAL,
> > +    * otherwise we might miss some WAL records if we crash.
> > +    */
> 
> Does this describe a failure case or something that would cause 
> corruption? The tone of the message implies so, but I don't see anything 
> wrong with deriving the redo location for the first checkpoint the usual 
> way.
> 
> I believe the case of "missing some WAL records" refers to the 
> possibility that someone connects to the database and does a WAL logged 
> change before the first checkpoint starts. But if we then crash before 
> the checkpoint finishes, we'll start crash recovery from the previous 
> restartpoint/checkpoint as usual, and replay that WAL record as well. 
> And if the first checkpoint finishes, the redo ptr of that checkpoint is 
> after that WAL record, 

Sorry, this is another one of those "yes I thought that at first"
moments.

> and those changes are safely on disk.

They may not be. They might have happened after BufferSync marks all
dirty buffers BM_CHECKPOINT_NEEDED and yet before we write the physical
checkpoint record.

The idea of the checkpoint is to confirm the recovery is complete and
make sure the starting point for crash recovery isn't somewhere in the
archive.

We must record the logical start before we allow any changes to be
written, otherwise we might miss the intermediate changes.

Just think standard-online-checkpoint and it all fits.
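
To make the ordering argument concrete, here is a minimal toy sketch in C. It is not PostgreSQL code: LSNs are plain integers, there is a single buffer, and the names ToyBuffer and change_survives_crash are hypothetical, chosen only for illustration. It compares taking the redo location before any new WAL is allowed with a mistaken variant, assumed here purely for contrast, that takes it only when the checkpoint record itself is written.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: integer LSNs, one buffer, no real concurrency. */
typedef struct {
    bool dirty;             /* modified in memory, not yet on disk */
    bool checkpoint_needed; /* stand-in for BM_CHECKPOINT_NEEDED */
    int  change_lsn;        /* LSN of the WAL record that dirtied it */
} ToyBuffer;

/*
 * Returns 1 if a change made while the checkpoint is in progress
 * survives a crash right after the checkpoint record is written,
 * 0 if the change is lost.
 */
int change_survives_crash(int capture_redo_early)
{
    int next_lsn = 100;                  /* next WAL insert position */
    ToyBuffer buf = { false, false, 0 };
    int redo = -1;

    if (capture_redo_early)
        redo = next_lsn;                 /* logical start, before any new WAL */

    /* Step 1: the checkpoint marks the buffers that are dirty right now. */
    buf.checkpoint_needed = buf.dirty;

    /* Step 2: a backend inserts WAL and dirties the buffer after marking. */
    buf.change_lsn = next_lsn++;
    buf.dirty = true;

    /* Step 3: the checkpoint flushes only the buffers it marked. */
    if (buf.checkpoint_needed)
        buf.dirty = false;

    /* Step 4: the physical checkpoint record is written. */
    int ckpt_record_lsn = next_lsn++;
    if (!capture_redo_early)
        redo = ckpt_record_lsn;          /* mistaken variant: taken too late */

    /* The redo location never lies after the checkpoint record. */
    assert(redo <= ckpt_record_lsn);

    /* Crash here: recovery replays WAL starting at redo. */
    bool on_disk  = !buf.dirty;
    bool replayed = buf.change_lsn >= redo;
    return (on_disk || replayed) ? 1 : 0;
}
```

In this sketch, change_survives_crash(1) returns 1 because replay starts at or before the intermediate change, while change_survives_crash(0) returns 0: the buffer was never flushed and its WAL record sits before the redo location.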

-- Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support
