On 10 November 2012 18:16, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
>> I'm inclined to think that we need to fix this by getting rid of
>> RestoreBkpBlocks per se, and instead having the per-WAL-record restore
>> routines dictate when each full-page image is restored (and whether or
>> not to release the buffer lock immediately). That's not going to be a
>> small change unfortunately :-(
>
> Here's a WIP patch that attacks it in this way. I've only gone as far as
> fixing btree_xlog_split() for the moment, since the main point is to see
> whether the coding change seems reasonable. At least in this function,
> it seems that locking considerations are handled correctly as long as no
> full-page image is used, so it's pretty straightforward to tweak it to
> handle the case with full-page image(s) as well. I've not looked
> through any other replay functions yet --- some of them may need more
> invasive hacking. But so far this looks pretty good.
Looks fine, but we need a top-level comment in btree code explaining
why/how it follows the very well explained rules in the README.
> Note that this patch isn't correct yet even for btree_xlog_split(),
> because I've not removed the RestoreBkpBlocks() call in btree_redo().
> All the btree replay routines will have to get fixed before it's
> testable at all.
I was just about to write you back you missed that...
> One thing that seems a bit annoying is the use of zero-based backup
> block indexes in RestoreBackupBlock, when most (not all) of the callers
> are referencing macros with one-based names (XLR_BKP_BLOCK_1 etc).
> That's a bug waiting to happen. We could address it by changing
> RestoreBackupBlock to accept a one-based argument, but what I'm thinking
> of doing instead is getting rid of the one-based convention entirely;
> that is, remove XLR_BKP_BLOCK_1 through XLR_BKP_BLOCK_4 and instead have
> all code use the XLR_SET_BKP_BLOCK() macro, which is zero-based. One
> advantage of doing that is that it'll force inspection of all code
> related to this.
I wouldn't do that in a back branch, but I can see why its a good idea.
Thanks for fixing.
-- Simon Riggs http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services