Re: autovacuum next steps - Mailing list pgsql-hackers

From Galy Lee
Subject Re: autovacuum next steps
Date
Msg-id 45DA5E8F.2060503@oss.ntt.co.jp
Whole thread Raw
In response to Re: autovacuum next steps  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-hackers
Gregory Stark wrote:
> If we could have autovacuum interrupt a vacuum in mid-sweep, perform a cycle
> of vacuums on smaller tables, then resume, that problem would go away. That
> sounds too difficult though, but perhaps we could do something nearly as good.

I think to make vacuum has this interrupted-resume capability is quite 
useful for large table.

It can provide more flexibility for autovacuum to create a good schedule 
scheme. Sometimes it takes a whole day to vacuum the large table 
(Hundreds-GB table may qualify); setting the cost_delay make it even 
lasts for several days. If the system has maintenance time, vacuum task 
of the large table can be split to fit into the maintenance time by 
interrupted-resume feature.

> One option that I've heard before is to have vacuum after a single iteration
> (ie, after it fills maintenance_work_mem and does the index cleanup and the
> second heap pass), remember where it was and pick up from that point next
> time.
> 

Even a single iteration may take a long time, so it is not so much 
useful to have a break in the boundary of the iteration. I think it is 
not so difficult to get vacuum to remember where it leaves and start 
from where it leaves last time. The following is a basic idea.

A typical vacuum process mainly have the following phases: Phase 1. scan heap Phase 2. scan and sweep index Phase 3.
sweepheap Phase 4. update FSM Phase 5. truncate CLOG
 

Where vacuum is interrupted, we can just save the collected information 
into the disk, and restore it later when vacuum restarts. When vacuum 
process is interrupted, we can remember the dead tuple list and the 
block number it has scanned in phase 1; the indexes it has cleanup in 
phase 2; the tuples it has swept in phase 3. Before exiting from vacuum, 
we can also merge the free space information into FSM.

We are working on this feature now.  I will propose it latter to discuss 
with you.

Best Regards
Galy Lee
-- 
NTT Open Source Software Center


pgsql-hackers by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: ToDo: add documentation for operator IS OF
Next
From: Andrew Dunstan
Date:
Subject: Re: \copy (query) delimiter syntax error