Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin - Mailing list pgsql-hackers

From Andres Freund
Subject Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
Date
Msg-id 20240625120315.hik4rvciesuu5ahd@alap3.anarazel.de
Whole thread Raw
In response to Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: Vacuum ERRORs out considering freezing dead tuples from before OldestXmin
List pgsql-hackers
Hi,

On 2024-06-24 16:35:50 -0400, Robert Haas wrote:
> On Mon, Jun 24, 2024 at 3:23 PM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> > Are you more concerned about having a single horizon for pruning or
> > about having a horizon that does not move backwards after being
> > established at the beginning of vacuuming the relation?
> 
> I'm not sure I understand. The most important thing here is fixing the
> bug. But if we have a choice of how to fix the bug, I'd prefer to do
> it by having the pruning code test one horizon that is always correct,
> rather than (as I think the patch does) having it test against two
> horizons because as a way of covering possible discrepancies between
> those values.

I think that's going in the wrong direction. We *want* to prune more
aggressively if we can (*), the necessary state is represented by the
vistest. That's a different thing than *having* to prune tuples beyond a
certain xmin (the cutoff determined by vacuum.c/vacuumlazy.c). The problem
we're having here is that the two states can get out of sync due to the
vistest "moving backwards", because of hot_standby_feedback (and perhaps also
an issue around aborts).

To prevent that we can
a) make sure that we always take the hard cutoff into account
b) prevent vistest from going backwards

(*) we really ought to become more aggressive, by removing intermediary row
    versions when they're not visible to anyone, but not yet old enough to be
    below the horizon. But that realistically will only be possible in *some*
    cases, e.g. when the predecessor row version is on the same page.



> > I had always thought it was because the vacuuming backend's
> > GlobalVisState will get updated periodically throughout vacuum and so,
> > assuming the oldest running transaction changes, our horizon for
> > vacuum would change. But, in writing this repro, it is actually quite
> > hard to get GlobalVisState to update. Our backend's RecentXmin needs
> > to have changed. And there aren't very many places where we take a new
> > snapshot after starting to vacuum a relation. One of those is at the
> > end of index vacuuming, but that can only affect the pruning horizon
> > if we have to do multiple rounds of index vacuuming. Is that really
> > the case we are thinking of when we say we want the pruning horizon to
> > move forward during vacuum?
> 
> I thought the idea was that the GlobalVisTest stuff would force a
> recalculation now and then, but maybe it doesn't actually do that?

It forces an accurate horizon to be determined the first time it would require
it to determine visibility. The "first time" is determined by RecentXmin not
having changed.

The main goal of the vistest stuff was to not have to determine an accurate
horizon in GetSnapshotData(). Determining an accurate horizon requires
accessing each backends ->xmin, which causes things to scale badly, as ->xmin
changes so frequently.

The cost of determining the accurate horizon is irrelevant for vacuums, but
it's not at all irrelevant for on-access pruning.


> Suppose process A begins a transaction, acquires an XID, and then goes
> idle. Process B now begins a giant vacuum. At some point in the middle
> of the vacuum, A ends the transaction. Are you saying that B's
> GlobalVisTest never really notices that this has happened?

Not at the moment, but we should add heuristics like that.


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Michail Nikolaev
Date:
Subject: Re: Issues with ON CONFLICT UPDATE and REINDEX CONCURRENTLY
Next
From: Dagfinn Ilmari Mannsåker
Date:
Subject: Re: Proposal: Document ABI Compatibility