On Sat, May 2, 2015 at 7:08 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 1, 2015 at 6:51 AM, Thomas Munro
> <thomas.munro@enterprisedb.com> wrote:
>> Those other places are for capping the effective table and tuple
>> multixact freeze ages for manual vacuums, so that manual vacuums (say
>> in nightly cronjobs) get a chance to run wraparound scans before
>> autovacuum kicks in at a less convenient time. So, yeah, I think we
>> want to incorporate member wraparound prevention into that logic, and
>> I will add that in the next version of the patch.
>
> +1. On a quick read-through of the patch, the biggest thing that
> jumped out at me was that it only touches the autovacuum logic.
Here's a new version which sets up the multixact parameters in
ExecVacuum for regular VACUUM commands, just as it does for
autovacuum, if needed. When computing
max_multixact_age_to_avoid_member_wrap for a manual vacuum, it uses
lower constants, so that any manually scheduled vacuums get a chance
to deal with some of this problem before autovacuum has to. Here are
the arbitrary constants currently used: at 50% member address space
usage, autovacuum starts wraparound scans of the tables with the
oldest active multixacts, and then of younger ones as usage
increases, until at 75% usage it vacuums with
multixact_freeze_table_age = 0; for manual VACUUM those numbers are
halved, so that it gets a good head start.
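To make that interpolation concrete, here's a minimal sketch in C.
The names (MAX_MEMBERS, member_freeze_cap) and the idea of passing
the 50%/75% thresholds as start/full parameters are my own
illustration, not the identifiers actually used in the patch:

```c
#include <assert.h>

#define MAX_MEMBERS 0x100000000ULL	/* 2^32 member address space */

/*
 * Return a cap on multixact freeze table age, linearly interpolated
 * from default_age down to 0 as member space usage rises from
 * 'start' to 'full'.  A negative result means no cap is needed yet.
 */
static int
member_freeze_cap(unsigned long long members_used, int default_age,
				  double start, double full)
{
	double		usage = (double) members_used / (double) MAX_MEMBERS;

	if (usage < start)
		return -1;				/* no member-space pressure yet */
	if (usage >= full)
		return 0;				/* freeze_table_age = 0: scan everything */

	/* scale the cap toward zero as usage approaches 'full' */
	return (int) (default_age * (full - usage) / (full - start));
}
```

For autovacuum the call would use start = 0.50 and full = 0.75; for
a manual VACUUM the same function would be called with the halved
thresholds, i.e. start = 0.25 and full = 0.375.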
Halving the thresholds for manual vacuums may be too much of a head
start, but if we give them only a small head start, it seems to me
that you'd finish up with slim odds of actually getting the
wraparound scans done by your scheduled vacuum job. A head start of
25% of the member address space, before autovacuum starts on the
tables with the oldest relminmxids, creates a target big enough that
you might hit it with your vacuum cronjob. Also, as far as I can
tell, manual vacuums will
only help you get the wraparound scans on your *connectable*
databases done at a time that suits you better; you'll still be
dependent on autovacuum to deal with non-connectable databases, in
other words template0. In practice I guess that if you run
vacuumdb -a at midnight, you'll see all the pg_database.datminmxid
values advance except template0's. Then, some time later, most
likely during a busy period producing many multixact members, member
space usage will finally hit 50%, autovacuum will very quickly
process template0, the cluster-wide oldest mxid will finally
advance, and segment files will be deleted at the next checkpoint.
Or am I missing something?
Also attached is the output of the monitor.sh script posted upthread,
while running explode_mxact_members.c. It looks better than the last
results to me: whenever usage reaches 50%, autovacuum advances things
such that usage drops right back to 0% (because it now uses
multixact_freeze_min_age = 0), and the system will happily chug on
forever. What this test doesn't adequately show is that if you had
a lot of different tables and databases with different relminmxid
values, they'd be vacuumed at different times. I should probably
come up with a way to demonstrate that...
--
Thomas Munro
http://www.enterprisedb.com