Resume vacuum and autovacuum from interruption and cancellation - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Resume vacuum and autovacuum from interruption and cancellation
Date
Msg-id CAD21AoBqfMVWdk7Odh4A4OpF-m5GytRjXME5E8cEGXvhSJb8zw@mail.gmail.com
Responses Re: Resume vacuum and autovacuum from interruption and cancellation  (Masahiko Sawada <sawada.mshk@gmail.com>)
List pgsql-hackers
Hi all,

A long-running vacuum is sometimes cancelled by an administrator, and
autovacuums can be cancelled by concurrent processes. Even if the
vacuum is retried after cancellation, it always restarts from the first
block of the table, so it may vacuum again the blocks that were vacuumed
last time. We have the visibility map to skip scanning all-visible
blocks, but when the table is large and frequently modified, we are
more likely to reclaim more garbage from the blocks we did not process
last time than by scanning again from the first block.

So I'd like to propose making vacuum save its progress and resume
vacuuming based on it. The mechanism I have in mind is simple: vacuum
periodically reports the current block number to the stats collector.
If the table has indexes, it reports the block number after each heap
vacuum; if the table has no indexes, it reports every certain number of
blocks (e.g. 1024 blocks = 8MB). The value is visible in a new column,
vacuum_resume_block, of pg_stat_all_tables. I'm going to add one VACUUM
command option, RESUME, and one new reloption, vacuum_resume. If the
option is true, vacuum fetches the block number from the stats
collector before starting and starts vacuuming from that block. I
wonder if we could make it true by default for autovacuum, but it must
be false for an aggressive vacuum.

If we start vacuuming from a block other than the first, we can update
neither relfrozenxid nor relminmxid. And we might not even be able to
update the relation statistics.
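
The reason relfrozenxid cannot advance after a resumed vacuum can be
shown with a toy model in Python. This is a simplified sketch under
stated assumptions: each page holds exactly one XID, XID wraparound is
ignored, and safe_relfrozenxid and its parameters are hypothetical
names, not PostgreSQL functions.

```python
from typing import List


def safe_relfrozenxid(page_xids: List[int], start_block: int,
                      old_relfrozenxid: int, freeze_cutoff: int) -> int:
    """Return the value relfrozenxid may safely take after the scan.

    Toy model: one XID per page, no wraparound.
    """
    if start_block > 0:
        # Blocks before start_block were not scanned and may still hold
        # unfrozen XIDs older than anything seen in this pass.
        return old_relfrozenxid
    # Full scan: every XID older than the cutoff has been frozen, so the
    # oldest XID that can remain is the smallest one at or above the cutoff.
    remaining = [xid for xid in page_xids if xid >= freeze_cutoff]
    return min(remaining, default=freeze_cutoff)


pages = [100, 500, 900, 950]  # one XID per page (toy data)
after_full_scan = safe_relfrozenxid(pages, 0, old_relfrozenxid=100, freeze_cutoff=600)
after_resumed_scan = safe_relfrozenxid(pages, 2, old_relfrozenxid=100, freeze_cutoff=600)
```

Here the full scan can advance the value to 900, while the scan resumed
from block 2 has to keep 100, because pages 0 and 1 were never visited
and still contain XIDs 100 and 500.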

Comments and feedback are very welcome.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


