Re: Resume vacuum and autovacuum from interruption and cancellation - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Resume vacuum and autovacuum from interruption and cancellation
Date
Msg-id CAD21AoDivb+R92NgkSJ1dKvwOPv3ddrP2nmAiu4S_iRU3oL-5A@mail.gmail.com
Whole thread Raw
In response to Resume vacuum and autovacuum from interruption and cancellation  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: Resume vacuum and autovacuum from interruption and cancellation  (Rafia Sabih <rafia.pghackers@gmail.com>)
List pgsql-hackers
On Wed, Jun 12, 2019 at 1:30 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> Hi all,
>
> Long-running vacuum could be sometimes cancelled by administrator. And
> autovacuums could be cancelled by concurrent processes. Even if it
> retries after cancellation, since it always restart from the first
> block of table it could vacuums blocks again that we vacuumed last
> time. We have visibility map to skip scanning all-visible blocks but
> in case where the table is large and often modified, we're more likely
> to reclaim more garbage from blocks other than we processed last time
> than scanning from the first block.
>
> So I'd like to propose to make vacuums save its progress and resume
> vacuuming based on it. The mechanism I'm thinking is simple; vacuums
> periodically report the current block number to the stats collector.
> If table has indexes, reports it after heap vacuum whereas reports it
> every certain amount of blocks (e.g. 1024 blocks = 8MB) if no indexes.
> We can see that value on new column vacuum_resume_block of
> pg_stat_all_tables. I'm going to add one vacuum command option RESUME
> and one new reloption vacuum_resume. If the option is true vacuums
> fetch the block number from stats collector before starting and start
> vacuuming from that block. I wonder if we could make it true by
> default for autovacuums but it must be false when aggressive vacuum.
>
> If we start to vacuum from not first block, we can update neither
> relfrozenxid nor relfrozenxmxid. And we might not be able to update
> even relation statistics.
>

Attached the first version of patch. And registered this item to the
next commit fest.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachment

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: POC: Cleaning up orphaned files using undo logs
Next
From: Julien Rouhaud
Date:
Subject: Re: Add parallelism and glibc dependent only options to reindexdb