Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: HOT chain validation in verify_heapam() |
Date | |
Msg-id | 20221110014607.dlqzedl33csz36x2@awork3.anarazel.de |
In response to | Re: HOT chain validation in verify_heapam() (Peter Geoghegan <pg@bowt.ie>) |
List | pgsql-hackers |
Hi,

On 2022-11-09 17:32:46 -0800, Peter Geoghegan wrote:
> > The xmin horizon is very coarse grained. Just because it is 7 doesn't mean
> > that xid 10 is still running. All it means is that one backend or slot has an
> > xmin or xid of 7.
>
> Of course that's true. But I wasn't talking about the general case --
> I was talking about your "xmin 10, xmax 5 -> xmin 5, xmax invalid"
> update chain case specifically, with its "skewered" OldestXmin of 7.

The sequence below produces such an OldestXmin:

> > s1: acquire xid 5
> > s2: acquire xid 7
> > s3: acquire xid 10
> >
> > s3: insert
> > s3: commit
> > s1: update
> > s1: commit
> >
> > s2: get a new snapshot, xmin 7 (or just hold no snapshot)
> >
> > At this point the xmin horizon is 7. The first tuple's xmin can't be
> > frozen. The second tuple's xmin can be.
>
> Basically what I'm saying about OldestXmin is that it ought to "work
> transitively", from the updater to the inserter that inserted the
> now-updated tuple. That is, the OldestXmin should either count both
> XIDs that appear in the update chain, or neither XID.

It doesn't work that way. The above sequence shows one case where it doesn't.

> > > I believe you're right that an update chain that looks like this one
> > > is possible. However, I don't think it's possible for
> > > OldestXmin/FreezeLimit to take on a value like that (i.e. a value that
> > > "skewers" the update chain like this, the value 7 from your example).
> > > We ought to be able to rely on an OldestXmin value that can never let
> > > such a situation emerge. Right?
> >
> > I don't see anything that'd guarantee that currently, nor do I immediately see a
> > possible way to get there.
> >
> > What do you think prevents such an OldestXmin?
>
> ComputeXidHorizons() computes VACUUM's OldestXmin (actually it
> computes h->data_oldest_nonremovable values) by scanning the proc
> array. And counts PGPROC.xmin from each running xact. So ultimately
> the inserter and updater are tied together by that. It's either an
> OldestXmin that includes both, or one that includes neither.

> Here are some facts that I think we both agree on already:
>
> 1. It is definitely possible to have an update chain like your "xmin
> 10, xmax 5 -> xmin 5, xmax invalid" example.
>
> 2. It is definitely not possible to "freeze xmax" by setting its value
> to FrozenTransactionId or something similar -- there is simply no code
> path that can do that, and never has been. (The term "freeze xmax" is
> a bit ambiguous, though it usually means set xmax to
> InvalidTransactionId.)
>
> 3. There is no specific reason to believe that there is a live bug here.

I don't think there's a live bug here. I think the patch isn't dealing
correctly with that issue though.

> Putting all 3 together: doesn't it seem quite likely that the way that
> we compute OldestXmin is the factor that prevents "skewering" of an
> update chain? What else could possibly be preventing corruption here?
> (Theoretically it might never have been discovered, but that seems
> pretty hard to believe.)

I don't see how that follows. The existing code is just ok with that. In fact
we have explicit code trying to exploit this:

    /*
     * If the DEAD tuple is at the end of the chain, the entire chain is
     * dead and the root line pointer can be marked dead. Otherwise just
     * redirect the root to the correct chain member.
     */
    if (i >= nchain)
        heap_prune_record_dead(prstate, rootoffnum);
    else
        heap_prune_record_redirect(prstate, rootoffnum, chainitems[i]);

Greetings,

Andres Freund
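
For readers following the thread, the s1/s2/s3 schedule quoted in the message can be replayed with a small toy model. This is only a sketch under stated assumptions: `ToyBackend`, `toy_oldest_xmin()`, `toy_precedes_horizon()` and `toy_replay_schedule()` are hypothetical names invented for illustration, the horizon is taken to be simply the smallest xid or xmin advertised by any backend, and XID wraparound is ignored. It is not the logic of ComputeXidHorizons(), only an illustration of why a horizon of 7 can fall between the two XIDs of the update chain.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Xid;
#define INVALID_XID 0

/* Hypothetical stand-in for a proc array entry; not the real PGPROC. */
typedef struct ToyBackend
{
    Xid xid;    /* XID of the running transaction, else INVALID_XID */
    Xid xmin;   /* xmin of the snapshot held, else INVALID_XID */
} ToyBackend;

/*
 * Assumed simplification: the horizon is the smallest xid or xmin that any
 * backend currently advertises, or next_xid if nobody advertises anything.
 * Plain integer comparison is used, so wraparound is ignored.
 */
static Xid
toy_oldest_xmin(const ToyBackend *procs, size_t n, Xid next_xid)
{
    Xid result = next_xid;

    for (size_t i = 0; i < n; i++)
    {
        if (procs[i].xid != INVALID_XID && procs[i].xid < result)
            result = procs[i].xid;
        if (procs[i].xmin != INVALID_XID && procs[i].xmin < result)
            result = procs[i].xmin;
    }
    return result;
}

/* True if xid is older than the horizon (again ignoring wraparound). */
static bool
toy_precedes_horizon(Xid xid, Xid horizon)
{
    assert(xid != INVALID_XID);
    return xid < horizon;
}

/* Replays the quoted schedule and returns the horizon it ends with. */
static Xid
toy_replay_schedule(void)
{
    ToyBackend procs[3] = {
        {5, INVALID_XID},   /* s1: acquire xid 5 */
        {7, INVALID_XID},   /* s2: acquire xid 7 */
        {10, INVALID_XID},  /* s3: acquire xid 10 */
    };

    procs[2].xid = INVALID_XID; /* s3: insert (tuple xmin 10), then commit */
    procs[0].xid = INVALID_XID; /* s1: update (xmax 5, new xmin 5), commit */
    procs[1].xmin = 7;          /* s2: get a new snapshot, xmin 7 */

    return toy_oldest_xmin(procs, 3, 11);
}
```

In this model the only backend still advertising anything is s2, so the xmin 10 of the first tuple does not precede the horizon while the xid 5 of the updater (the first tuple's xmax and the second tuple's xmin) does. Nothing ties the inserter and the updater to the same side of the horizon.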