I took another look at the code coverage situation around freezing
following pushing the page-level freezing patch earlier today. I
spotted an issue that I'd missed up until now: certain sanity checks
in heap_prepare_freeze_tuple() call TransactionIdDidCommit() more
often than really seems necessary.
Theoretically this is an old issue that dates back to commit
699bf7d05c, as opposed to an issue in the page-level freezing patch.
But that's not really true in a real practical sense. In practice the
calls to TransactionIdDidCommit() will happen far more frequently
following today's commit 1de58df4fe (since we're using OldestXmin as
the cutoff that gates performing freeze_xmin/freeze_xmax processing --
not FreezeLimit).
No regressions related to clog lookups by VACUUM were apparent from my
performance validation of the page-level freezing work, but I suspect
that the increase in TransactionIdDidCommit() calls will cause
noticeable regressions with the right/wrong workload and/or
configuration. The page-level freezing work is expected to reduce clog
lookups in VACUUM in general, which is bound to have been a
confounding factor.
I see no reason why we can't just condition the XID sanity check calls
to TransactionIdDidCommit() (for both the freeze_xmin and the
freeze_xmax callsites) on a cheap tuple hint bit precheck not being
enough. We only need an expensive call to TransactionIdDidCommit()
when the precheck doesn't immediately indicate that the tuple xmin
looks committed when that's what the sanity check expects to see (or
that the tuple's xmax looks aborted, in the case of the callsite where
that's what we expect to see).
Attached patch shows what I mean. A quick run of the standard
regression tests (with custom instrumentation) shows that this patch
eliminates 100% of all relevant calls to TransactionIdDidCommit(), for
both the freeze_xmin and the freeze_xmax callsites.
--
Peter Geoghegan