Hi
Thanks for all the feedback and good ideas on restartable vacuum.
Here is a new design overview of it based on previous discussions.
There are several ideas for dealing with a long-running VACUUM
in a defined maintenance window. One idea: when maintenance
time is running out, we can do either of the following:
- send a smart stop request to VACUUM, so that it stops at a suitable point. (Reaching that point might take a long time.)
- change the cost delay settings of VACUUM on the fly to make it less aggressive.
Both are discussed below.
Restartable vacuum design overview
----------------------------------
* Where to stop:
There are two approaches to stopping a VACUUM:
(1) Tell VACUUM where to stop when it starts.
VACUUM can be told at start time to stop at a suitable point (by SQL syntax
like: VACUUM SOME). The optional stop point is after one full
fill-workmem-clean-index-clean-deadtuple cycle. VACUUM stops when it has
finished such a cycle.
(2) Interrupt VACUUM while it is running.
VACUUM checks for a smart stop request at the normal vacuum delay points. If
such a request is detected, a flag is set to tell VACUUM to stop at a suitable
point. VACUUM then stops at the end of one full
fill-workmem-clean-index-clean-deadtuple cycle.
However, I cannot figure out a simple way to request a running VACUUM to stop
in (2), because the backend signals have all been used up. (1) is simple to
implement, because it does not require communication with the running VACUUM.
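To make the stop point concrete, here is a minimal sketch of the cycle-boundary check in plain C. It is a toy simulation, not PostgreSQL source: VacuumSim, vacuum_sim_run and all field names are made up for illustration, and it assumes every block holds the same number of dead tuples. The stop flag stands for either "VACUUM SOME" or a smart stop request, and it is only looked at after a full cycle has finished.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names throughout; this is not PostgreSQL source code. */
typedef struct VacuumSim {
    unsigned nblocks;         /* total heap blocks in the table */
    unsigned dead_per_block;  /* dead tuples per block (simplifying assumption) */
    unsigned workmem_tuples;  /* dead tuple ids that fit in work memory */
    bool     stop_requested;  /* set by "VACUUM SOME" or a smart stop request */
} VacuumSim;

/*
 * Scan the heap starting at start_block. Returns the first block that has
 * not been processed yet (equal to nblocks when the whole table is done)
 * and stores the number of completed cycles in *cycles.
 */
unsigned vacuum_sim_run(const VacuumSim *v, unsigned start_block,
                        unsigned *cycles)
{
    unsigned blk = start_block;
    unsigned collected = 0;

    assert(start_block <= v->nblocks);
    /* One block must fit in work memory, or no progress is possible. */
    assert(v->dead_per_block <= v->workmem_tuples);

    *cycles = 0;
    while (blk < v->nblocks) {
        /* Phase 1: fill work memory with dead tuple ids. */
        while (blk < v->nblocks &&
               collected + v->dead_per_block <= v->workmem_tuples) {
            collected += v->dead_per_block;
            blk++;
        }

        /* Phases 2 and 3: clean the indexes, then the dead heap tuples. */
        collected = 0;
        (*cycles)++;

        /* The stop request is honoured only at a cycle boundary. */
        if (v->stop_requested)
            break;
    }
    return blk;
}
```

With 100 blocks, 10 dead tuples per block and room for 250 tuple ids, one cycle covers 25 blocks, so a run with the stop flag set returns 25 after a single cycle.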
* How to stop:
When VACUUM is stopping,
- it saves the block number that it has reached to pg_class;
- it also updates the free space information in the FSM. (This might be posted as a separate patch.)
* How to restart:
When VACUUM is restarting, it reads the stored block from pg_class to restart the interrupted scan.
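The save and restart steps could look roughly like the sketch below. RelStats is only an in-memory stand-in for a pg_class row, and vacuum_restart_block is a hypothetical column name, not an existing one. Falling back to block 0 when the saved block is beyond the current end of the table is my own assumption about how a stale value should be handled.

```c
#include <assert.h>

/* In-memory stand-in for a pg_class row; all names are hypothetical. */
typedef struct RelStats {
    unsigned relpages;              /* current number of heap blocks */
    unsigned vacuum_restart_block;  /* 0 means "start from the beginning" */
} RelStats;

/* Called when VACUUM stops: remember the next block to scan. */
void vacuum_record_progress(RelStats *rel, unsigned next_block)
{
    assert(next_block <= rel->relpages);
    /* A finished pass clears the restart point. */
    rel->vacuum_restart_block =
        (next_block >= rel->relpages) ? 0 : next_block;
}

/* Called when VACUUM starts: decide where the scan should begin. */
unsigned vacuum_restart_point(const RelStats *rel)
{
    /* Assumption: if the table shrank since the stop, start over. */
    if (rel->vacuum_restart_block >= rel->relpages)
        return 0;
    return rel->vacuum_restart_block;
}
```

A VACUUM that stopped after block 24 would record 25, and the next VACUUM would begin its heap scan at block 25 instead of block 0.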
"Change VACUUM cost delay settings on-the-fly" feature
------------------------------------------------------
When the end of the maintenance window comes, we might notify VACUUM to
use a set of less aggressive cost delay settings.
I don't have a clear idea of how to implement this feature yet. Maybe
we need a message-passing mechanism between backends to exchange the
cost delay settings, like the patch here: http://archives.postgresql.org/pgsql-patches/2006-04/msg00047.php
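As a rough illustration of what such a mechanism might do, the sketch below has VACUUM re-read its cost settings at every delay point whenever a generation counter changes. This is only one possible shape and is not taken from the linked patch. SharedCostSlot, publish_cost_settings and vacuum_delay_point_sketch are invented names, the two fields are loosely modelled on vacuum_cost_delay and vacuum_cost_limit, and the locking a real shared-memory slot would need is left out.

```c
#include <assert.h>

/* All names here are hypothetical; locking is deliberately omitted. */
typedef struct CostSettings {
    int cost_delay_ms;  /* sleep length; 0 disables cost-based delay */
    int cost_limit;     /* accumulated cost that triggers a sleep */
} CostSettings;

typedef struct SharedCostSlot {  /* would live in shared memory */
    unsigned     generation;     /* bumped on every change */
    CostSettings settings;
} SharedCostSlot;

typedef struct VacuumCostState { /* private to the running VACUUM */
    unsigned     seen_generation;
    CostSettings active;
    int          cost_balance;
} VacuumCostState;

/* Another backend publishes new settings for the running VACUUM. */
void publish_cost_settings(SharedCostSlot *slot, CostSettings s)
{
    slot->settings = s;
    slot->generation++;
}

/*
 * Called at each vacuum delay point. Returns the number of milliseconds
 * the caller should sleep (0 means no sleep).
 */
int vacuum_delay_point_sketch(VacuumCostState *st,
                              const SharedCostSlot *slot, int page_cost)
{
    assert(page_cost >= 0);

    /* Pick up settings that changed since the last delay point. */
    if (slot->generation != st->seen_generation) {
        st->active = slot->settings;
        st->seen_generation = slot->generation;
    }

    if (st->active.cost_delay_ms <= 0)
        return 0;  /* cost-based delay is disabled */

    st->cost_balance += page_cost;
    if (st->cost_balance >= st->active.cost_limit) {
        st->cost_balance = 0;
        return st->active.cost_delay_ms;
    }
    return 0;
}
```

The point of the generation counter is that the running VACUUM never has to be signalled: it notices the new settings by itself at the next delay point.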
Another simple way to achieve this is to store the settings for each
maintenance window in a system catalog. There have been some previous
discussions about implementing maintenance windows, but they have not
gone any further. So it seems better to implement this feature after
maintenance windows are implemented.
Implementation plan
-------------------
Changing the VACUUM cost delay settings on the fly requires either an
internal message-passing mechanism or the implementation of maintenance
windows, so now may not be a good time to rush it. But I hope the
*restartable VACUUM feature* can be accepted for 8.3.
I look forward to your comments and suggestions.
Best Regards
Galy Lee <lee.galy _at_ ntt.oss.co.jp>
NTT Open Source Software Center