Robert, Tom, Stephen,
So, first, a description of the specific problem I've encountered at two
sites. I'm working on another email suggesting workarounds and
solutions, but that's going to take a bit longer.
Observation
-----------
This problem occured on two database systems which shared the following
characteristics:
1) They were running with default autovacuum & vacuum settings, except
that one database had 5 workers instead of 3.
2) They have large partitioned tables, in which the partitions are
time-based and do not receive UPDATES after a certain date. Each
partition was larger than RAM.
3) The databases are old enough, and busy enough, to have been through
XID wraparound at least a couple of times.
Users reported that the database system became unresponsive, which was
surprising since both of these DBs had been placed on hardware which was
engineered for at least 100% growth over the current database size. On
investigation, we discovered the following things:
a) Each database had autovacuum_max_workers (one DB 5, one DB 3) doing
anti-wraparound vacuum on several partitions simultaneously.
b) The I/O created by the anti-wraparound vacuum was tying up the system.
c) terminating any individual autovacuum process didn't help, as it
simply caused autovac to start on a different partition.
So, first question was: why was autovacuum wanting to anti-wrapround
vacuum dozens of tables at the same time? A quick check showed that all
of these partitions had nearly identical XID ages (as in less than
100,000 transactions apart), which all had exceeded
autovacuum_max_freeze_age. How did this happen? I'm still not sure.
One thought is: this is an artifact of the *previous* wraparound vacuums
on each database. On cold partitions with old dead rows which have
been through wraparound vacuum several times, this tends to result in
the cold partitions converging towards having the same relfrozenxid over
time; I'm still working on the math to prove this. Alternately, it's
possible that a schema change to the partitioned tables gave them all
the same effective relfrozenxid at some point in the past; both
databases are still in development.
So there are two parts to this problem, each of which needs a different
solution:
1. Databases can inadvertently get to the state where many tables need
wraparound vacuuming at exactly the same time, especially if they have
many "cold" data partition tables.
2. When we do hit wraparound thresholds for multiple tables, autovacuum
has no hesitation about doing autovacuum_max_workers worth of wraparound
vacuum simultaneously, even when that exceeds the I/O capactity of the
system.
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com