Thread: pg_stat_progress_vacuum comes up empty ...?

pg_stat_progress_vacuum comes up empty ...?

From
Michael Harris
Date:
18 July 2019, 05:04:22
Hello,

We have a database cluster which recently got very close to XID Wraparound. To get
it back under control I've been running a lot of aggressive manual vacuums.

However, I have noticed a few anomolies. When I try to check the status of vacuum commands:

qtodb_pmxtr=# select * from  pg_stat_progress_vacuum;
 pid | datid | datname | relid | phase | heap_blks_total | heap_blks_scanned | heap_blks_vacuumed | index_vacuum_count
|max_dead_tuples | num_dead_tuples 

-----+-------+---------+-------+-------+-----------------+-------------------+--------------------+--------------------+-----------------+-----------------
(0 rows)

Yet there definitely are plenty running:

qtodb_pmxtr=# select pid, state, current_timestamp-query_start as duration,query from pg_stat_activity where
datname='qtodb_pmxtr'and query~'VACUUM' ; 
  pid  | state  |    duration     |                                                                   query

-------+--------+-----------------+-------------------------------------------------------------------------------------------------------------------------------------------
 40615 | active | 13:46:35.081203 | autovacuum: VACUUM qn.mtrds_cnestmeas_oo_18032
 40617 | active | 00:46:35.270232 | autovacuum: VACUUM qn.mtrds_cantu100_oo_18046
 40622 | active | 00:00:04.55167  | autovacuum: VACUUM qn.mtrds_cbeekops_on_17684 (to prevent wraparound)
 25685 | active | 00:00:04.568989 | VACUUM FREEZE pg_toast.pg_toast_228072029;
 25686 | active | 00:00:02.716111 | VACUUM FREEZE pg_toast.pg_toast_228072943;
 25687 | active | 00:00:03.788131 | VACUUM FREEZE pg_toast.pg_toast_228069006;
 25688 | active | 00:00:02.531885 | VACUUM FREEZE pg_toast.pg_toast_228067023;
 25689 | active | 00:00:02.098389 | VACUUM FREEZE pg_toast.pg_toast_228071980;
 25690 | active | 00:00:00.621036 | VACUUM FREEZE pg_toast.pg_toast_228071852;
 25691 | active | 00:00:11.424717 | VACUUM FREEZE pg_toast.pg_toast_228069597;
 25692 | active | 00:00:03.359416 | VACUUM FREEZE pg_toast.pg_toast_228073892;
 25693 | active | 00:00:04.569248 | VACUUM FREEZE pg_toast.pg_toast_228068022;
 25694 | active | 00:00:20.151786 | VACUUM FREEZE pg_toast.pg_toast_228068878;
 25695 | active | 00:00:00.517688 | VACUUM FREEZE pg_toast.pg_toast_228068478;
 25696 | active | 00:00:23.746402 | VACUUM FREEZE pg_toast.pg_toast_228067431;
 25697 | active | 00:00:10.759025 | VACUUM FREEZE pg_toast.pg_toast_228072997;
 25698 | active | 00:00:14.281798 | VACUUM FREEZE pg_toast.pg_toast_228074613;
 25699 | active | 00:00:05.631052 | VACUUM FREEZE pg_toast.pg_toast_228074247;
 25700 | active | 00:00:00.056749 | VACUUM FREEZE pg_toast.pg_toast_228071681;
 28008 | active | 00:00:00        | select pid, state, current_timestamp-query_start as duration,query from
pg_stat_activitywhere datname='qtodb_pmxtr' and query~'VACUUM' ; 
(20 rows)

Why don't any of these (manual OR auto) show up in the pg_stat_progress_vacuum?

Another concern: the normal autovacuums seem to be stalling. The table
qn.mtrds_cnestmeas_oo_18032 should surely not take more than 13 hours to
vacuum, since it is only 160KB in size ...!

qtodb_pmxtr=# select pg_size_pretty(pg_relation_size('qn.mtrds_cnestmeas_oo_18032'::regclass));
 pg_size_pretty
----------------
 160 kB
(1 row)

We have autovacuum_cost_delay set to 0.

I also don't understand why only one autovac worker is working on the
wraparound issue, as there are thousands of tables with oldest xid > autovacuum_freeze_max_age.
I would have thought it would be prioritizing those.

I'm worried that something is wrong with autovacuum on this database, which might
be responsible for it getting into this state to begin with. Other similar databases we
have, running the same application and with similar configuration, are managing to
keep up with the xid freezing nicely.

The database was on 9.6, but was recently upgraded to 11.4.

Any advice welcome!

Cheers
Mike.