Hi,
On 2020-09-08 18:13:58 -0400, Tom Lane wrote:
> Andres Freund <andres@anarazel.de> writes:
> > On 2020-09-08 15:24:54 -0400, Tom Lane wrote:
> >> Andres Freund <andres@anarazel.de> writes:
> >>> But now I do wonder why we need to know whether the command is top level
> >>> or not? Why isn't the correct thing to instead look at what the current
> >>> backend's xmin is? Seems like you could just replace
> >>> *oldestXmin = XidFromFullTransactionId(ReadNextFullTransactionId());
> >>> with
> >>> *oldestXmin = MyProc->xmin;
> >>> Assert(TransactionIdIsValid(*oldestXmin));
> 
> >> Ummm ... since VACUUM doesn't run inside a transaction, it won't be
> >> advertising an xmin will it?
> 
> > We do run it in a transaction though:
> 
> Ah.  But still, this is not the semantics I want for VACUUM, because the
> process xmin will involve other processes' behavior.  The point of the
> change as far as I'm concerned is to ensure vacuuming of dead temp rows
> independent of anything else in the system.
I was thinking about this a bit more, and I think the answer might be to
use Min(latestCompletedXid, MyProc->xid). That would, as far as I can
tell, never miss something vacuumable in a temporary table, doesn't
require to know whether we're running as the top-level command.
The reason for preferring latestCompletedXid over nextXid is that the
former is protected by ProcArrayLock and already accessed in
GetSnapshotData(), so we can cheaply compute the horizons as part of
pruning.
I think that cannot miss something vacuumable in a temp table for VACUUM
because that would have to have been left over by an already completed
transaction (by us, before the VACUUM).
In addition this allows HOT pruning etc on temp tables to be just as
aggressive as VACUUM is.
I wrote a patch to do so for [1], but it seemed topically more relevant
here. Running tests in a loop, no failures after the first few
iterations.
Greetings,
Andres Freund
[1] https://postgr.es/m/20200921212003.wrizvknpkim2whzo%40alap3.anarazel.de