Re: Deriving Recovery Snapshots - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Deriving Recovery Snapshots
Date
Msg-id 48FF38DC.9080206@enterprisedb.com
In response to Re: Deriving Recovery Snapshots  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Deriving Recovery Snapshots
List pgsql-hackers
Simon Riggs wrote:
> On Wed, 2008-10-22 at 12:29 +0300, Heikki Linnakangas wrote:
>> Simon Riggs wrote:
>>> On Thu, 2008-10-16 at 18:52 +0300, Heikki Linnakangas wrote:
>>>> Simon Riggs wrote:
>>>>> * The backend slot may not be reused for some time, so we should take
>>>>> additional actions to keep state current and true. So we choose to log a
>>>>> snapshot from the master into WAL after each checkpoint. This can then
>>>>> be used to cleanup any unobserved xids. It also provides us with our
>>>>> initial state data, see later.
>>>> We don't need to log a complete snapshot, do we? Just oldestxmin should 
>>>> be enough.
>>> Possibly, but you're thinking that once we're up and running we can use
>>> less info.
>>>
>>> Trouble is, you don't know when/if the standby will crash/be shutdown.
>>> So we need regular full snapshots to allow it to re-establish full
>>> information at regular points. So we may as well drop the whole snapshot
>>> to WAL every checkpoint. To do otherwise would mean more code and less
>>> flexibility.
>> Surely it's less code to write the OldestXmin to the checkpoint record, 
>> rather than a full snapshot, no? And to read it off the checkpoint record.
> 
> You may be missing my point.
> 
> We need an initial state to work from.
> 
> I am proposing we write a full snapshot after each checkpoint because it
> allows us to start recovery again from that point. If we wrote only
> OldestXmin as you suggest it would optimise the size of the WAL record
> but it would prevent us from restarting at that point.

Well, you'd just need to treat any xid > oldestxmin that is not marked as
finished in clog as unobserved, which doesn't seem too bad. Not that
storing the full list of in-progress xids is that bad either, though.
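
To make the oldestxmin idea concrete, here is a rough sketch in Python rather than backend C. Everything in it is an assumption made for illustration: the function name, the dict standing in for clog, and the next_xid upper bound are hypothetical, and xid wraparound is ignored. It is not the proposed patch, only the rule described above: any xid above oldestxmin that clog does not mark as finished counts as unobserved.

```python
from enum import Enum


class XidStatus(Enum):
    """Simplified stand-in for clog transaction states (assumption)."""
    IN_PROGRESS = 0
    COMMITTED = 1
    ABORTED = 2


def derive_unobserved_xids(oldest_xmin, next_xid, clog):
    """Return the xids above oldest_xmin and below next_xid that the
    clog stand-in does not mark as finished.

    clog is a hypothetical dict mapping xid -> XidStatus; an xid with no
    entry is treated as still in progress. The strict lower bound follows
    the wording "anything > oldestxmin". Wraparound is not handled.
    """
    unobserved = []
    for xid in range(oldest_xmin + 1, next_xid):
        status = clog.get(xid, XidStatus.IN_PROGRESS)
        if status is XidStatus.IN_PROGRESS:
            unobserved.append(xid)
    return unobserved


# Hypothetical state at the checkpoint record where recovery starts.
clog_at_checkpoint = {
    101: XidStatus.COMMITTED,
    103: XidStatus.ABORTED,
}
print(derive_unobserved_xids(100, 106, clog_at_checkpoint))
```

With this input, xids 102, 104 and 105 come back as unobserved, since they are above oldestxmin and have no finished status. The sketch says nothing about subtransactions, which is a separate question.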

Hmm. What about in-progress subtransactions that have overflowed the
shared mem cache? Can we rely on subtrans being up to date, up to the
checkpoint record that recovery starts from?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

