When vacuum is running for hours on a table with many indexes, we can see what phase it is in and how many index vacuum cycles have happened (index_vacuum_count), but not how much time each phase has taken. That makes it hard to tell what is actually slow.
The attached patch adds three columns to pg_stat_progress_vacuum that report the total time spent in heap vacuuming, index vacuuming, and index cleanup. The values are accumulated across overflow cycles when the dead TID store fills up.
These columns make it easy to see whether the bottleneck is heap scanning or index processing, analyze why vacuum is slow, and adjust maintenance_work_mem accordingly to reduce TID store overflow cycles. They also make it possible to estimate how long vacuum will take to finish: polling these timing columns along with the scanned-block counters gives a good estimate of the remaining time.
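For instance, a monitoring query along these lines could show progress and per-phase time side by side (heap_vacuum_time, index_vacuum_time, and index_cleanup_time are the column names assumed from the patch description; the other columns already exist in pg_stat_progress_vacuum):

```sql
-- Sketch of a monitoring query against the proposed columns; assumes the
-- patch exposes the three accumulated phase times alongside the existing
-- pg_stat_progress_vacuum fields.
SELECT p.pid,
       p.relid::regclass AS table_name,
       p.phase,
       round(100.0 * p.heap_blks_scanned / nullif(p.heap_blks_total, 0), 1)
           AS pct_scanned,
       p.index_vacuum_count,
       p.heap_vacuum_time,
       p.index_vacuum_time,
       p.index_cleanup_time
FROM pg_stat_progress_vacuum AS p;
```

Running this periodically during a long vacuum would make it immediately obvious which of the three phases is accumulating the time.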
We could put this behind a track_vacuum_timing GUC, similar to track_io_timing and track_wal_io_timing, if the time-capturing calls turn out to be costly. I'd prefer not to add another GUC, but I'm open to thoughts on this.
For example, I tested with a 3M-row table (3.9 GB), 4 btree indexes, and maintenance_work_mem=256kB (kept artificially low to force TID-store overflows). About 34 minutes in, the vacuum was only 13% done [1]. index_vacuum_time was 2006s vs. a heap_vacuum_time of 2.1s, so index processing is where all the time goes: 13,188 overflow cycles × 4 indexes = 52,752 index scans so far. In production, on tables with more indexes, these columns would show exactly where vacuum time is going and whether raising maintenance_work_mem would help.
With the help of the attached SQL function, the estimated time to finish this vacuum comes out to about 3.5 hours [2].
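The attached function is not reproduced here, but the basic idea can be sketched as a single query: scale the elapsed time by the fraction of heap blocks still unscanned. This is a simple linear extrapolation, not the attached function, and it assumes the per-block cost stays roughly constant, which will be off when the remaining work is dominated by index scans.

```sql
-- Rough remaining-time estimate (simple linear extrapolation, not the
-- attached function): elapsed * (blocks remaining / blocks scanned).
-- Uses the transaction start from pg_stat_activity as the vacuum's
-- elapsed time, which is only accurate for a dedicated VACUUM session.
SELECT (now() - a.xact_start)
       * (p.heap_blks_total - p.heap_blks_scanned)
       / nullif(p.heap_blks_scanned, 0) AS est_remaining
FROM pg_stat_progress_vacuum AS p
JOIN pg_stat_activity AS a USING (pid);
```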