Hi,
I'm still working on this patch. I found a bunch of issues. Non of them
super critical and some pre-existing; but nonetheless I don't feel like
it's ready to push yet. So we'll have the first alpha without those
fixes :(
The biggest issues so far are:
* There's, both in the posted patch and as-is in all the branches, a lot
of places that aren't actually safe against a concurrent truncation. A
bunch of places grab oldestMultiXactId with an lwlock held, release
it, and then make decisions based on that.
A bunch of places (including the find_multixact_start callsites) is
actually vulnerable to that, for a bunch of others it's less likely to
be a problem. All callers of GetMultiXactIdMembers() are vulnerable,
but with the exception of pg_get_multixact_members() they'll never
pass in a value that's older than the new oldest member value.
That's a problem for the current branches. Afaics that can lead to a
useless round of emergency autovacs via
SetMultiXactIdLimit()->SetOffsetVacuumLimit().
SetOffsetVacuumLimit() can protect easily agains that by taking the
new MultiXactTruncationLock lock. We could do the same for
pg_get_multixact_members() - afaics the only caller that'll look up too
old values otherwise - but I don't think it matters, you'll get a
slightly obscure error if you access a too old xact that's just being
truncated away and that is that.
* There was no update of the in-memory oldest* values when replaying a
truncation. That's problematic if a standby is promoted after
replaying a truncation record, but before a following checkpoint
record. This would be fixed by an emergency autovacuum, but that's
obviously not nice. Trivial to fix.
* The in-memory oldest values were updated *after* the truncation
happened. It's unlikely to matter in reality, but it's safer to
update them before, so a concurrent GetMultiXactIdMembers() of stuff
from before the truncation will get the proper error.
* PerformMembersTruncation() probably confused Alvaro because it wasn't
actually correct - there's no need to have the segment containing old
oldestOffset (in contrast to oldestOffsetAlive) survive. Except
leaking a segment that's harmless, but obviously not desirable.
Additionally I'm changing some stuff, some requested by review:
* xl_multixact_truncate's members are now called
(start|end)Trunc(Off|Memb)
* (start|end)TruncOff have the appropriate type now
* typo fixes
* comment improvements
* pgindent
New version attached.
Greetings,
Andres Freund