Home > mailing lists

Add heap and index vacuum timings to pg_stat_progress_vacuum - Mailing list pgsql-hackers

From	Bharath Rupireddy
Subject	Add heap and index vacuum timings to pg_stat_progress_vacuum
Date	May 4 08:41:32
Msg-id	CALj2ACVD08XgDcQpTSegKDiyJHwJ6eK89X+1WksWqUJxT86A_A@mail.gmail.com Whole thread
List	pgsql-hackers

Tree view

Hi,

When vacuum is running for hours on a table with many indexes, we can see what phase it is in and how many index vacuum cycles have happened (index_vacuum_count), but not how much time each phase has taken. That makes it hard to tell what is actually slow.

The attached patch adds three columns to pg_stat_progress_vacuum that report the total time spent in heap vacuuming, index vacuuming, and index cleanup. The values are accumulated across overflow cycles when the dead TID store fills up.

These columns make it easy to see whether the bottleneck is heap scanning or index processing, analyze why vacuum is slow, and adjust maintenance_work_mem accordingly to reduce TID store overflow cycles. They also make it possible to estimate how long
vacuum takes to finish - polling these timing columns along with scanned blocks gives a good estimated time.

We could put this behind a track_vacuum_timing GUC similar to track_io_timing and track_wal_timing of the time capturing calls seem costly. I'd prefer not to add another GUC, but I'm open to thoughts on this.

For example, I tested with a 3M-row table (3.9 GB), 4 btree indexes, and maintenance_work_mem=256kB (kept low to force TID-store overflows). About 34 minutes in, only 13% through the vacuum [1]. index_vacuum_time is 2006s vs heap_vacuum_time of 2.1s - index processing is where all the time goes. 13,188 overflow cycles × 4 indexes = 52,752 index scans so far. In production, tables with more indexes, these columns show exactly where vacuum time is going and whether raising maintenance_work_mem would help.

With the help of the attached sql function, the estimated time to finish the vacuum is about 3.5 hours [2].

Thoughts?

[1]
SELECT * FROM pg_stat_progress_vacuum;
pid | 18764
relid | 16384
phase | vacuuming indexes
heap_blks_total | 500000
heap_blks_scanned | 66202
heap_blks_vacuumed | 66202
index_vacuum_count | 13188
max_dead_tuple_bytes | 262144
dead_tuple_bytes | 0
num_dead_item_ids | 0
indexes_total | 4
indexes_processed | 4
delay_time | 0
mode | manual
started_by | user
heap_vacuum_time | 2.098345
index_vacuum_time | 2005.931383
index_cleanup_time | 0

[2]
SELECT * FROM estimate_vacuum_completion();
pid | 18764
relname | vac_timing_test
pct_done | 13.24
time_remaining | 03:35:39

Bharath Rupireddy
Amazon Web Services: https://aws.amazon.com

Attachment

pgsql-hackers by date:

From: shveta malik
Date: 04 May, 08:21:07
Subject: Re: Include schema-qualified names in publication error messages.

From: shveta malik
Date: 04 May, 08:48:37
Subject: Re: Proposal: Conflict log history table for Logical Replication

Add heap and index vacuum timings to pg_stat_progress_vacuum - Mailing list pgsql-hackers

Attachment

Previous

Next