effective_multixact_freeze_max_age issue - Mailing list pgsql-hackers

From Peter Geoghegan
Subject effective_multixact_freeze_max_age issue
Date
Msg-id CAH2-Wzn2hODG0KQNvtFsT-Ka=MrUNHRpQXZ8tRiiC-fudQ_kEw@mail.gmail.com
Whole thread Raw
Responses Re: effective_multixact_freeze_max_age issue
List pgsql-hackers
The effective_multixact_freeze_max_age mechanism added by commit
53bb309d2d forces aggressive VACUUMs to take place earlier, as
protection against wraparound of the MultiXact member space SLRU.
There was also a follow-up bugfix several years later -- commit
6bda2af039 -- which made sure that the MXID-wise cutoff used to
determine which MXIDs to freeze in vacuumlazy.c could never exceed
oldestMxact (VACUUM generally cannot freeze MultiXacts that are still
seen as running by somebody according to oldestMxact).

I would like to talk about making the
effective_multixact_freeze_max_age stuff more robust, particularly in
the presence of a long held snapshot that holds things up even as SLRU
space for MultiXact members dwindles. I have to admit that I always
found this part of vacuum_set_xid_limits() confusing. I suspect that
the problem has something to do with how we calculate vacuumlazy.c's
multiXactCutoff (as well as FreezeLimit): vacuum_set_xid_limits() just
subtracts a freezemin value from GetOldestMultiXactId(). This is
confusingly similar (though different in important ways) to the
handling for other related cutoffs that happens nearby. In particular,
we start from ReadNextMultiXactId() (not from GetOldestMultiXactId())
for the cutoff that determines if the VACUUM is going to be
aggressive. I think that this can be fixed -- see the attached patch.

Of course, it wouldn't be safe to allow vacuum_set_xid_limits() to
hand off a multiXactCutoff to vacuumlazy.c that is (for whatever
reason) less than GetOldestMultiXactId()/oldestMxact (the bug fixed by
6bda2af039 involved just such a scenario). But that doesn't seem like
much of a problem to me. We can just handle it directly, as needed.
The attached patch handles it as follows:

    /* Compute multiXactCutoff, being careful to generate a valid value */
    *multiXactCutoff = nextMXID - mxid_freezemin;
    if (*multiXactCutoff < FirstMultiXactId)
        *multiXactCutoff = FirstMultiXactId;
    /* multiXactCutoff must always be <= oldestMxact */
    if (MultiXactIdPrecedes(*oldestMxact, *multiXactCutoff))
        *multiXactCutoff = *oldestMxact;

That is, we only need to make sure that the "multiXactCutoff must
always be <= oldestMxact" invariant holds once, by checking for it
directly. The same thing happens with OldestXmin/FreezeLimit. That
seems like a simpler foundation. It's also a lot more logical. Why
should the cutoff for freezing be held back by a long running
transaction, except to the extent that it is strictly necessary to do
so to avoid wrong answers (wrong answers seen by the long running
transaction)?

This allows us to simplify the code that issues a WARNING about
oldestMxact/OldestXmin inside vacuum_set_xid_limits(). Why not
actually test oldestMxact/OldestXmin directly, without worrying about
the limits (multiXactCutoff/FreezeLimit)? That also seems more
logical; there is more to be concerned about than freezing being
blocked when OldestXmin gets very old. Though we still rely on the
autovacuum_freeze_max_age GUC to represent "a wildly unreasonable
number of XIDs for OldestXmin to be held back by", just because that's
still convenient.

-- 
Peter Geoghegan

Attachment

pgsql-hackers by date:

Previous
From: Arne Roland
Date:
Subject: Re: missing indexes in indexlist with partitioned tables
Next
From: Nathan Bossart
Date:
Subject: Re: optimize lookups in snapshot [sub]xip arrays