Re: HOT chain validation in verify_heapam() - Mailing list pgsql-hackers
From | Peter Geoghegan |
---|---|
Subject | Re: HOT chain validation in verify_heapam() |
Date | |
Msg-id | CAH2-Wz=j_5nrVXF666prXr_ntEH1n_H2U5sqz0xu4fz46yMBjg@mail.gmail.com Whole thread Raw |
In response to | Re: HOT chain validation in verify_heapam() (Andres Freund <andres@anarazel.de>) |
Responses |
Re: HOT chain validation in verify_heapam()
|
List | pgsql-hackers |
On Mon, Nov 14, 2022 at 12:58 PM Andres Freund <andres@anarazel.de> wrote: > I wonder if we ought to add an error check to heap_prepare_freeze_tuple() > against this scenario. We're working towards being more aggressive around > freezing, which will make it more likely to hit corner cases around this. In theory my work on freezing doesn't change the basic rules about how freezing works, and doesn't know anything about HOT, so it shouldn't introduce any new risk. Even still, I agree that this seems like something to do in the scope of the same work, just in case. Plus it's just important. It would be possible to have exhaustive heap_prepare_freeze_tuple checks in assert-only builds -- we can exhaustively check the final array of prepared freeze plans that we collected for a given heap page, and check it against the page exhaustively right before freezing is executed. That's not perfect, but it would be a big improvement. Right now I am not entirely sure what I would need to check in such a mechanism. I am legitimately unsure of what the rules are in light of this new information. > Which'd still be fine if the whole chain were already "fully dead". One way I > think this can happen is <= PG 13's HEAPTUPLE_DEAD handling in > lazy_scan_heap(). You mean the tupgone thing? Perhaps it would have avoided this particular problem, or one like it. But it had so many other problems that I don't see why it matters now. > At least as long as correctness requires not ending up in endless loops - > indeed. We end up with lazy_scan_prune() endlessly retrying. Without a chance > to interrupt. Shouldn't there at least be a CFI somewhere? Probably, but that wouldn't change the fact that it's a bug when this happens. Obviously it's more important to avoid such a bug than it is to ameliorate it. > I think the problem partially is that the proposed verify_heapam() code is too > "aggressive" considering things to be part of the same hot chain - which then > means we have to be very careful about erroring out. > > The attached isolationtester test triggers: > "unfrozen tuple was updated to produce a tuple at offset %u which is frozen" > "updated version at offset 3 is also the updated version of tuple at offset %u" > > Despite there afaict not being any corruption. Worth noting that this happens > regardless of hot/non-hot updates being used (uncomment s3ci to see). Why don't you think that there is corruption? The terminology here is tricky. It's possible that the amcheck patch makes a very good point here, even without necessarily complaining about a state that leads to obviously wrong behavior. It's also possible that there really is wrong behavior, at least in my mind -- I don't know what your remarks about no corruption are really based on. I feel like I'm repeating myself more than I should, but: why isn't it as simple as "HOT chain traversal logic is broken by frozen xmin in the obvious way, therefore all bets are off"? Maybe you're right about the proposed new functionality getting things wrong with your adversarial isolation test, but I seem to have missed the underlying argument. Are you just talking about regular update chains here, not HOT chains? Something else? > One important protection right now is that vacuumlazy.c uses a more > pessimistic horizon than pruneheap.c. Even if visibility determinations within > pruning recompute the horizon, vacuumlazy.c won't freeze based on the advanced > horizon. I don't quite know where we we'd best put a comment with a warning > about this fact. We already have comments discussing the relationship between OldestXmin and vistest (as well as rel_pages) in heap_vacuum_rel(). That seems like the obvious place to put something like this, at least to me. -- Peter Geoghegan
pgsql-hackers by date: