Re: First draft of the PG 15 release notes - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: First draft of the PG 15 release notes
Date
Msg-id CAH2-Wzk-oOpKObMKJ=Df4WBERX0ja14ymu3h5JrDt_FtmvH1yQ@mail.gmail.com
Whole thread Raw
In response to Re: First draft of the PG 15 release notes  (Bruce Momjian <bruce@momjian.us>)
Responses Re: First draft of the PG 15 release notes
List pgsql-hackers
On Tue, Jun 28, 2022 at 1:35 PM Bruce Momjian <bruce@momjian.us> wrote:
> Okay, text updated, thanks.  Applied patch attached.

I have some notes on these items:

1. "Allow vacuum to be more aggressive in setting the oldest frozenxid
(Peter Geoghegan)"

2. "Add additional information to VACUUM VERBOSE and autovacuum
logging messages (Peter Geoghegan)"

The main enhancement to VACUUM for Postgres 15 was item 1, which
taught VACUUM to dynamically track the oldest remaining XID (and the
oldest remaining MXID) that will remain in the table at the end of the
same VACUUM operation. These final/oldest XID/MXID values are what we
now use to set relfrozenxid and relminmxid in pg_class. Previously we
just set relfrozenxid/relminmxid to whatever XID/MXID value was used
to determine which XIDs/MXIDs needed to be frozen. These values often
indicated more about VACUUM implementation details (like the
vacuum_freeze_min_age GUc's value) than the actual true contents of the
table at the end of the most recent VACUUM.

It might be worth explaining the shift directly in the release notes.
The new approach is simpler and makes a lot more sense -- why should
the relfrozenxid be closely tied to freezing? We don't necessarily
have to freeze any tuple to advance relfrozenxid right up to the
removable cutoff/OldestXmin used by VACUUM. For example,
anti-wraparound VACUUMs that run against static tables now set
relfrozenxid/relminmxid to VACUUM's removable cutoff/OldestXmin
directly, without freezing anything (after the first time). Same with
tables that happen to have every row deleted -- only the actual
unfrozen XIDs/MXIDs left in the table matter, and if there happen to
be none at all then we can use the same relfrozenxid as we would for
a CREATE TABLE. All depends on what the workload allows.

There will also be a real practical benefit for users that allocate a
lot of MultiXactIds: We'll now have pg_class.relminmxid values that
are much more reliable indicators of what is really going on in the
table, MultiXactId-wise. I expect that this will make it much less
likely that anti-wraparound VACUUMs will run needlessly against the
largest tables, where there probably wasn't ever one single
MultiXactId. In other words, the implementation will have more
accurate information at the level of each table, and won't .

I think that very uneven consumption of MultiXactIds at the table
level is probably common in real databases. Plus VACUUM can usually
remove a non-running MultiXact from a tuple's xmax, regardless of
whether or not the mxid happens to be before the
vacuum_multixact_freeze_min_age-based MXID cutoff -- VACUUM has
always just set xmax to InvalidXid in passing when it's possible to do so
easily. MultiXacts are inherently pretty short-lived information about
row lockers at a point in time. We don't really need to keep them
around for very long. We may now be able to truncate the two MultiXact
related SLRUs much more frequently with some workloads.

Finally, note that the new VACUUM VERBOSE output (which is now pretty
much the same as the autovacuum log output) shows when and how
relfrozenxid/relminmxid have advanced. This should make it relatively
easy to observe these effects where they exist.

Thanks

--
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Logging query parmeters in auto_explain
Next
From: David Rowley
Date:
Subject: tuplesort Generation memory contexts don't play nicely with index builds