Re: Hot standby and b-tree killed items - Mailing list pgsql-hackers

From Simon Riggs
Subject Re: Hot standby and b-tree killed items
Date
Msg-id 1229679524.4793.481.camel@ebony.2ndQuadrant
In response to Hot standby and b-tree killed items  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Hot standby and b-tree killed items  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Fri, 2008-12-19 at 10:49 +0200, Heikki Linnakangas wrote:

> Whenever a B-tree index scan fetches a heap tuple that turns out to be 
> dead, the B-tree item is marked as killed by calling _bt_killitems. When 
> the page gets full, all the killed items are removed by calling 
> _bt_vacuum_one_page.
> 
> That's a problem for hot standby. If any of the killed b-tree items 
> point to a tuple that is still visible to a running read-only 
> transaction, we have the same situation as with vacuum, and have to 
> either wait for the read-only transaction to finish before applying the 
> WAL record or kill the transaction.
> 
> It looks like there's some cosmetic changes related to that in the 
> patch, the signature of _bt_delitems is modified, but there's no actual 
> changes that would handle that situation. I didn't see it on the TODO on 
> the hot standby wiki either. Am I missing something, or the patch?

ResolveRedoVisibilityConflicts() describes the current patch's position
on this point, which on review is wrong, I agree.

It looks like I assumed that _bt_delitems is only called during VACUUM,
which I knew it wasn't. I know I was going to split XLOG_BTREE_VACUUM
into two record types at one point, one for delete, one for vacuum. In
the end I didn't. Anyhow, it's wrong.

We have infrastructure in place to make this work correctly; we just need
to add a latestRemovedXid field to xl_btree_vacuum. So that part is easily
solved.
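
To make the idea concrete, here is a minimal sketch of how a latestRemovedXid field in the record could drive the conflict check during redo. It is not the patch's code: the struct layout, the StandbySnapshot type, and the names xl_btree_vacuum_sketch, xid_precedes and redo_conflicts_with are all hypothetical stand-ins, and the comparison rule (a snapshot conflicts when its xmin does not follow latestRemovedXid) is an assumption modelled on how the heap cleanup case is described in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified stand-ins, NOT the real PostgreSQL definitions. */
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Hypothetical record layout: a block number plus the proposed field. */
typedef struct xl_btree_vacuum_sketch
{
    uint32_t      block;            /* index page whose items are removed */
    TransactionId latestRemovedXid; /* newest xid among the dead heap tuples
                                     * that the removed items pointed to */
} xl_btree_vacuum_sketch;

/* Hypothetical view of one read-only transaction on the standby. */
typedef struct StandbySnapshot
{
    TransactionId xmin;             /* oldest xid this snapshot treats as running */
} StandbySnapshot;

/* Wraparound-aware "a is older than b" for normal xids. */
static bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Assumed rule: replaying the record conflicts with a snapshot if a removed
 * tuple could still be visible to it, i.e. latestRemovedXid is not older
 * than the snapshot's xmin. The caller would then wait for or cancel that
 * transaction before applying the record.
 */
static bool
redo_conflicts_with(const xl_btree_vacuum_sketch *rec,
                    const StandbySnapshot *snap)
{
    assert(rec != NULL && snap != NULL);

    /* No xid recorded: nothing that a snapshot could still see was removed. */
    if (rec->latestRemovedXid == InvalidTransactionId)
        return false;

    return !xid_precedes(rec->latestRemovedXid, snap->xmin);
}
```

With a record carrying latestRemovedXid = 100, a standby snapshot with xmin = 90 would be reported as conflicting, while one with xmin = 101 would not, so only the older transaction has to be waited for or killed.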

Thanks for spotting it. More like that please!

-- Simon Riggs           www.2ndQuadrant.com
PostgreSQL Training, Services and Support


