On Wed, 20 Aug 2003 15:39:26 -0400, Tom Lane <tgl@sss.pgh.pa.us>
wrote:
>But I think the real point here is that there's no reason to think that
>doing tuple deletion on-the-fly in foreground transactions is superior
>to doing it in background with a vacuum process. You're taking what
>should be noncritical maintenance work and moving it into the critical
>paths of your foreground applications. Not only that, but you're
>probably doing more total work per tuple --- VACUUM "batches" its work
>in more ways than just the index cleanup aspect, IIRC.
Yes, I agree with that, 100%. But that doesn't mean we couldn't do
better. AFAICS Vivek's problem is that it is hard enough to keep a
good part of the working set in the cache, and still his disks are
saturated. Now a VACUUM not only adds one more process to disk I/O
contention, but also makes sure that the working set pages are *not*
in memory which leads to higher I/O rates after the VACUUM.
I can imagine several use cases where only a small part of a large
relation is subject to DELETEs/UPDATEs. Maybe Vivek's application
falls into this category.
If we teach VACUUM to not read pages that don't contain any dead
tuples, this could be a significant improvement. I'm envisioning a
data structure (reclaimable space map, RSM) similar to the FSM.
Whenever a backend encounters a dead tuple it inserts a reference to
its page into the RSM. Dead tuple detection is no problem, it is
already implemented for marking dead index tuples. VACUUM, when run
in a new mode ("decent"), only checks pages that are listed in the
RSM. To get full advantage of not doing unnecessary page reads, we'll
also need to redesign the index bulk delete routines.
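To make the idea concrete, here is a minimal in-memory sketch in C of
what such an RSM might look like. All names (RSM, rsm_note_dead,
vacuum_decent) and the per-relation bitmap representation are my own
assumptions for illustration; a real implementation would live in
shared memory alongside the FSM and be crash-safe, which this is not.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical sketch of a reclaimable space map (RSM): one flag per
 * heap page of a relation, set when a backend notices a dead tuple
 * on that page.  Illustrative only -- not PostgreSQL code. */
#define REL_PAGES 64

typedef struct
{
    bool has_dead[REL_PAGES];   /* page contains at least one dead tuple */
    int  entries;               /* number of flagged pages */
} RSM;

static void
rsm_init(RSM *rsm)
{
    memset(rsm, 0, sizeof *rsm);
}

/* Called by a backend whenever it encounters a dead tuple; recording
 * the same page twice is a no-op, so the RSM stays small. */
static void
rsm_note_dead(RSM *rsm, int pageno)
{
    if (!rsm->has_dead[pageno])
    {
        rsm->has_dead[pageno] = true;
        rsm->entries++;
    }
}

/* "VACUUM DECENT": visit only the pages listed in the RSM and skip
 * everything else.  Returns the number of heap pages actually read,
 * which is the whole point -- it is bounded by the number of dirty
 * pages, not by the size of the relation. */
static int
vacuum_decent(RSM *rsm)
{
    int pages_read = 0;

    for (int p = 0; p < REL_PAGES; p++)
    {
        if (rsm->has_dead[p])
        {
            /* ... reclaim the dead tuples on page p here ... */
            rsm->has_dead[p] = false;
            rsm->entries--;
            pages_read++;
        }
    }
    return pages_read;
}
```

The key property is visible in vacuum_decent(): a relation with dead
tuples on only a handful of its pages costs a handful of page reads,
instead of a full sequential scan that also evicts the working set.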
The autovacuum daemon will watch the RSM and when the number of
entries is above a configurable threshold, it will start a
VACUUM DECENT ...
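The daemon's trigger condition would be trivial; roughly something
like the following (the function name and parameters are made up for
the sake of the example):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical autovacuum check: fire a targeted vacuum once the RSM
 * lists more pages with reclaimable space than a configurable
 * threshold allows. */
static bool
needs_decent_vacuum(int rsm_entries, int threshold)
{
    return rsm_entries > threshold;
}
```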
Servus
Manfred