On Sat, 2020-05-16 at 12:19 -0500, Don Seiler wrote:
> PG 9.6 (9.6.8). Got an alert today when template0 and template1 were both over 50%
> towards TXID wraparound. I could vacuum template1 but couldn't vacuum template0
> without first allowing connections. This is what it looked like before:
>
> # SELECT datname
> , age(datfrozenxid)
> , current_setting('autovacuum_freeze_max_age')
> FROM pg_database
> ORDER BY 2 DESC;
> datname | age | current_setting
> --------------------+------------+-----------------
> foo_db | 1022106099 | 200000000
> template0 | 1000278345 | 200000000
> postgres | 643729 | 200000000
> template1 | 643729 | 200000000
> (4 rows)
>
> I've since allowed connections and ran "vacuumdb --freeze" on it and then immediately
> disabled the connections to it again. But I'm curious how template0 would be growing
> in age like this. Even now I see the template0 age growing.
That is indeed strange.
Did you see any weird messages when you vacuumed "template0"?
Did "datfrozenxid" shrink after the operation?
"foo_db" seems to be the bigger problem.
Perhaps autovacuum never handled "template0" because it concluded (rightly) that
it has to deal with "foo_db" first.
> I can say that these DB
> has previously been altered for locale changes as well.
Would you care to explain that? You changed "template0"? How?
> I'm also running a long "vacuum freeze" on foo_db that will take a few days after
> seeing that autovacuum on a big table had been running on it since Feb 2 and making
> no progress, with over 850M dead tuples according to pg_stat_all_tables.
> I estimate 3-4 more days to go on that one. Once that's done I'll be scheduling
> manual vacuum jobs. Just wondering if that would somehow affect regular template0
> cleanup though.
As I said, perhaps.
What are your non-default autovacuum settings? Perhaps you should speed up autovacuum
by reducing "autovacuum_vacuum_cost_delay" to 2ms or less, and by increasing
"maintenance_work_mem".
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com