Re: Deriving Recovery Snapshots - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Deriving Recovery Snapshots
Msg-id 1224673546.27145.218.camel@ebony.2ndQuadrant
In response to Re: Deriving Recovery Snapshots  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Deriving Recovery Snapshots
List pgsql-hackers
On Wed, 2008-10-22 at 12:29 +0300, Heikki Linnakangas wrote:
> Simon Riggs wrote:
> > On Thu, 2008-10-16 at 18:52 +0300, Heikki Linnakangas wrote:
> >> Simon Riggs wrote:
> >>> * The backend slot may not be reused for some time, so we should take
> >>> additional actions to keep state current and true. So we choose to log a
> >>> snapshot from the master into WAL after each checkpoint. This can then
> >>> be used to cleanup any unobserved xids. It also provides us with our
> >>> initial state data, see later.
> >> We don't need to log a complete snapshot, do we? Just oldestxmin should 
> >> be enough.
> > 
> > Possibly, but you're thinking that once we're up and running we can use
> > less info.
> > 
> > Trouble is, you don't know when/if the standby will crash/be shutdown.
> > So we need regular full snapshots to allow it to re-establish full
> > information at regular points. So we may as well drop the whole snapshot
> > to WAL every checkpoint. To do otherwise would mean more code and less
> > flexibility.
> 
> Surely it's less code to write the OldestXmin to the checkpoint record, 
> rather than a full snapshot, no? And to read it off the checkpoint record.

You may be missing my point.

We need an initial state to work from.

I am proposing we write a full snapshot after each checkpoint because it
allows us to start recovery again from that point. If we wrote only
OldestXmin, as you suggest, it would reduce the size of the WAL record,
but it would prevent us from restarting at that point.
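
To illustrate the difference, here is a minimal sketch in C. It is not
the patch code: the names Xid, LoggedSnapshot, xid_in_progress and
guess_from_oldest_xmin are hypothetical, and xid wraparound is ignored
for brevity. It shows that a logged snapshot can classify every xid at
that point, whereas OldestXmin alone leaves everything at or above it
unknown.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* hypothetical; ignores xid wraparound */

/* Hypothetical shape of a full snapshot logged after a checkpoint. */
typedef struct
{
    Xid         xmin;           /* oldest xid still running */
    Xid         xmax;           /* first xid not yet assigned */
    const Xid  *running;        /* xids in progress at snapshot time */
    size_t      nrunning;
} LoggedSnapshot;

/* With the full snapshot, every xid can be classified. */
static bool
xid_in_progress(const LoggedSnapshot *s, Xid xid)
{
    if (xid < s->xmin)
        return false;           /* finished before the snapshot */
    if (xid >= s->xmax)
        return true;            /* not started yet: treat as in progress */
    for (size_t i = 0; i < s->nrunning; i++)
        if (s->running[i] == xid)
            return true;
    return false;               /* between xmin and xmax, but finished */
}

typedef enum { XID_FINISHED, XID_UNKNOWN } XidGuess;

/* With OldestXmin only, anything at or above it cannot be classified. */
static XidGuess
guess_from_oldest_xmin(Xid oldest_xmin, Xid xid)
{
    return (xid < oldest_xmin) ? XID_FINISHED : XID_UNKNOWN;
}
```

In this sketch, a standby restarting from the checkpoint could answer
xid_in_progress() for any xid straight away, while
guess_from_oldest_xmin() has no answer for the range that matters.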

Also, passing only OldestXmin would not work in the presence of
long-running statements. Passing the snapshot allows us to see much
sooner that FATAL errors have occurred.
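
A rough sketch of that cleanup, again in C and under assumptions: the
function prune_tracked_xids and its arguments are hypothetical, and xid
wraparound is ignored. The standby keeps the set of xids it believes
are running; when a snapshot arrives, any tracked xid below the
snapshot's xmax that the master no longer lists as running must have
ended without the standby observing it (for example after a FATAL
error), so it can be dropped.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;           /* hypothetical; ignores xid wraparound */

static bool
xid_list_contains(const Xid *xids, size_t n, Xid x)
{
    for (size_t i = 0; i < n; i++)
        if (xids[i] == x)
            return true;
    return false;
}

/*
 * Compact tracked[] in place, keeping only xids that are either listed
 * as running in the snapshot or not yet covered by it (>= snap_xmax).
 * Returns the new number of tracked xids.
 */
static size_t
prune_tracked_xids(Xid *tracked, size_t ntracked,
                   const Xid *snap_running, size_t nsnap, Xid snap_xmax)
{
    size_t      kept = 0;

    for (size_t i = 0; i < ntracked; i++)
    {
        Xid         x = tracked[i];

        if (x >= snap_xmax || xid_list_contains(snap_running, nsnap, x))
            tracked[kept++] = x;
    }
    return kept;
}
```

If xid 100 belongs to a long-running statement, OldestXmin stays at 100
and says nothing about xid 103; the snapshot's running list is what
lets 103 be removed.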

BTW, the way I have coded it means that if we skip writing a checkpoint
on a quiet system, then we also skip writing the snapshot.

-- 
 Simon Riggs           www.2ndQuadrant.com
 PostgreSQL Training, Services and Support


