Simon Riggs wrote:
> On Fri, 2009-01-16 at 22:09 +0200, Heikki Linnakangas wrote:
>
>>>> RecentGlobalXmin is just a hint, it lags behind the real oldest xmin
>>>> that GetOldestXmin() would return. If another backend has a more recent
>>>> RecentGlobalXmin value, and has killed more recent tuples on the page,
>>>> the latestRemovedXid written here is too old.
>>> What do you think we should do instead?
>> Dunno. Maybe call GetOldestXmin().
>
> We are discussing btree deletes, not btree vacuums.
Pardon my ignorance, but what's the difference?
> If we are doing
> btree delete then we have an unreleased snapshot therefore we also have
> a non-zero xmin. How can another backend have a later RecentGlobalXmin
> or result from GetOldestXmin() than we do?
Sure it can, for example:
1. Transaction 1 begins in backend A
2. Transaction 2 begins in backend B, xmin = 1
3. Transaction 1 ends
4. Transaction 3 begins in backend C, xmin = 2
5. Backend C gets snapshot, TransactionXmin = 2, RecentGlobalXmin = 1
6. Transaction 2 ends.
7. Transaction 4 begins in backend A, gets snapshot TransactionXmin = 2,
RecentGlobalXmin = 2
8. Transaction 4 kills tuple, using its RecentGlobalXmin of 2
9. Transaction 3 splits the page, emits a delete xlog record, setting
latestRemovedXid to its RecentGlobalXmin of 1, which is older than the
value transaction 4 used in step 8
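
To make the numbers concrete, here is a toy Python model of the last two
steps, using the RecentGlobalXmin values the backends obtained in steps 5
and 7. It is only a sketch, not PostgreSQL code: Backend, can_kill and
emit_delete_record are made-up names, the strict "<" comparison is an
assumption about how the hint is applied, and the deleting xid of 1 is
picked purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    name: str
    recent_global_xmin: int  # cached hint, fixed when the snapshot was taken


def can_kill(backend: Backend, deleter_xid: int) -> bool:
    # Toy rule (assumption): a tuple is dead to everyone if the
    # transaction that deleted it precedes the backend's cached hint.
    return deleter_xid < backend.recent_global_xmin


def emit_delete_record(backend: Backend) -> dict:
    # Toy version of the behaviour under discussion: the record takes
    # latestRemovedXid from the emitting backend's own cached hint.
    return {"latestRemovedXid": backend.recent_global_xmin}


# Hint values as given in steps 5 and 7 of the example.
backend_c = Backend("C", recent_global_xmin=1)  # transaction 3
backend_a = Backend("A", recent_global_xmin=2)  # transaction 4

deleter_xid = 1  # hypothetical: the tuple was deleted by transaction 1

killed_by_a = can_kill(backend_a, deleter_xid)  # backend A kills the tuple
record = emit_delete_record(backend_c)          # backend C emits the record
covered = deleter_xid < record["latestRemovedXid"]

print(killed_by_a, record["latestRemovedXid"], covered)
```

Under those assumptions backend A is allowed to kill the tuple, but the
record written by backend C carries a bound that does not cover it.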

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com