Re: New strategies for freezing, advancing relfrozenxid early - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: New strategies for freezing, advancing relfrozenxid early
Msg-id CAH2-Wzmu8B+exBbV49Hm2Wstf_nmsRXdaFZ60k+Du7D=G4OGKw@mail.gmail.com
In response to Re: New strategies for freezing, advancing relfrozenxid early  (Jeremy Schneider <schnjere@amazon.com>)
List pgsql-hackers
On Thu, Aug 25, 2022 at 3:35 PM Jeremy Schneider <schnjere@amazon.com> wrote:
> We should be careful here. IIUC, the current autovac behavior helps
> bound the "spread" or range of active multixact IDs in the system, which
> directly determines the number of distinct pages that contain those
> multixacts. If the proposed change herein causes the spread/range of
> MXIDs to significantly increase, then it will increase the number of
> blocks and increase the probability of thrashing on the SLRUs for these
> data structures.

As a general rule VACUUM will tend to do more eager freezing with the
patch set compared to HEAD, though it should never do less eager
freezing. Not even in corner cases -- never.

With the patch, VACUUM pretty much uses the most aggressive possible
XID-wise/MXID-wise cutoffs in almost all cases (though only when we
actually decide to freeze a page at all, which is now a separate
question). The fourth patch in the patch series introduces a very
limited exception, where we use the same cutoffs that we'll always use
on HEAD (FreezeLimit + MultiXactCutoff) instead of the aggressive
variants (OldestXmin and OldestMxact). The exception applies to an
xmax containing a MultiXact, though not just *any* such xmax: it has to
be a Multi that contains *some* XIDs that *need* to go away during the
ongoing VACUUM, and others that *cannot* go away. Oh, and there usually
has to be a need to keep two or more XIDs for this to happen -- if
there is only one XID then we can usually swap xmax with that XID
without any fuss.
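
To make the rule above concrete, here is a minimal sketch in C of how
the choice between the two pairs of cutoffs might be pictured. This is
not the patch's code: the struct names, choose_cutoffs(), and the idea
of passing in precomputed member counts are all hypothetical, and the
"two or more XIDs to keep" test reflects only the "usually" stated
above. The only names taken from the discussion are the four cutoffs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Xid;   /* stand-in for TransactionId / MultiXactId */

/* Hypothetical container for the four cutoffs named in the text. */
typedef struct Cutoffs
{
    Xid oldest_xmin;       /* aggressive XID cutoff (OldestXmin) */
    Xid oldest_mxact;      /* aggressive MXID cutoff (OldestMxact) */
    Xid freeze_limit;      /* HEAD-style XID cutoff (FreezeLimit) */
    Xid multixact_cutoff;  /* HEAD-style MXID cutoff (MultiXactCutoff) */
} Cutoffs;

/* Hypothetical result: the pair of cutoffs to apply to one xmax. */
typedef struct ChosenCutoffs
{
    Xid xid_cutoff;
    Xid mxid_cutoff;
} ChosenCutoffs;

/* Wraparound-aware "a precedes b", using modulo-2^32 arithmetic. */
static bool
xid_precedes(Xid a, Xid b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Pick cutoffs for one xmax.  n_must_go and n_must_stay are assumed to
 * be computed elsewhere: the number of Multi members that need to go
 * away during this VACUUM, and the number that cannot go away.  Pass
 * 0 and 0 when xmax is not a Multi.
 */
static ChosenCutoffs
choose_cutoffs(const Cutoffs *c, int n_must_go, int n_must_stay)
{
    ChosenCutoffs chosen;

    assert(n_must_go >= 0 && n_must_stay >= 0);
    /* The aggressive cutoffs are never older than the HEAD-style ones. */
    assert(!xid_precedes(c->oldest_xmin, c->freeze_limit));
    assert(!xid_precedes(c->oldest_mxact, c->multixact_cutoff));

    if (n_must_go > 0 && n_must_stay >= 2)
    {
        /* Limited exception: fall back to the cutoffs used on HEAD. */
        chosen.xid_cutoff = c->freeze_limit;
        chosen.mxid_cutoff = c->multixact_cutoff;
    }
    else
    {
        /* Common case: most aggressive possible cutoffs. */
        chosen.xid_cutoff = c->oldest_xmin;
        chosen.mxid_cutoff = c->oldest_mxact;
    }
    return chosen;
}
```

Under these assumptions, a Multi with one member that must go and two
that must stay gets the HEAD-style pair, while a Multi with only one
member left to keep (or an xmax that is not a Multi at all) gets the
aggressive pair. The two assertions on the cutoffs encode why this can
never freeze less eagerly than HEAD: the fallback is exactly what HEAD
would have used anyway.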

> PS. see also
> https://www.postgresql.org/message-id/247e3ce4-ae81-d6ad-f54d-7d3e0409a950@ardentperf.com

I think that the problem you describe here is very real, though I
suspect that it needs to be addressed by making opportunistic cleanup
of Multis happen more reliably. Running VACUUM more often just isn't
practical once a table reaches a certain size. In general, any kind of
processing that is time sensitive probably shouldn't be happening
solely during VACUUM -- it's just too risky. VACUUM might take a
relatively long time to get to the affected page. It might not even be
that long in wall clock time or whatever -- just too long to reliably
avoid the problem.

-- 
Peter Geoghegan


