Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin |
Date | |
Msg-id | 20240625120315.hik4rvciesuu5ahd@alap3.anarazel.de Whole thread Raw |
In response to | Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin (Robert Haas <robertmhaas@gmail.com>) |
Responses |
Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
|
List | pgsql-hackers |
Hi, On 2024-06-24 16:35:50 -0400, Robert Haas wrote: > On Mon, Jun 24, 2024 at 3:23 PM Melanie Plageman > <melanieplageman@gmail.com> wrote: > > Are you more concerned about having a single horizon for pruning or > > about having a horizon that does not move backwards after being > > established at the beginning of vacuuming the relation? > > I'm not sure I understand. The most important thing here is fixing the > bug. But if we have a choice of how to fix the bug, I'd prefer to do > it by having the pruning code test one horizon that is always correct, > rather than (as I think the patch does) having it test against two > horizons because as a way of covering possible discrepancies between > those values. I think that's going in the wrong direction. We *want* to prune more aggressively if we can (*), the necessary state is represented by the vistest. That's a different thing than *having* to prune tuples beyond a certain xmin (the cutoff determined by vacuum.c/vacuumlazy.c). The problem we're having here is that the two states can get out of sync due to the vistest "moving backwards", because of hot_standby_feedback (and perhaps also an issue around aborts). To prevent that we can a) make sure that we always take the hard cutoff into account b) prevent vistest from going backwards (*) we really ought to become more aggressive, by removing intermediary row versions when they're not visible to anyone, but not yet old enough to be below the horizon. But that realistically will only be possible in *some* cases, e.g. when the predecessor row version is on the same page. > > I had always thought it was because the vacuuming backend's > > GlobalVisState will get updated periodically throughout vacuum and so, > > assuming the oldest running transaction changes, our horizon for > > vacuum would change. But, in writing this repro, it is actually quite > > hard to get GlobalVisState to update. Our backend's RecentXmin needs > > to have changed. And there aren't very many places where we take a new > > snapshot after starting to vacuum a relation. One of those is at the > > end of index vacuuming, but that can only affect the pruning horizon > > if we have to do multiple rounds of index vacuuming. Is that really > > the case we are thinking of when we say we want the pruning horizon to > > move forward during vacuum? > > I thought the idea was that the GlobalVisTest stuff would force a > recalculation now and then, but maybe it doesn't actually do that? It forces an accurate horizon to be determined the first time it would require it to determine visibility. The "first time" is determined by RecentXmin not having changed. The main goal of the vistest stuff was to not have to determine an accurate horizon in GetSnapshotData(). Determining an accurate horizon requires accessing each backends ->xmin, which causes things to scale badly, as ->xmin changes so frequently. The cost of determining the accurate horizon is irrelevant for vacuums, but it's not at all irrelevant for on-access pruning. > Suppose process A begins a transaction, acquires an XID, and then goes > idle. Process B now begins a giant vacuum. At some point in the middle > of the vacuum, A ends the transaction. Are you saying that B's > GlobalVisTest never really notices that this has happened? Not at the moment, but we should add heuristics like that. Greetings, Andres Freund
pgsql-hackers by date: