I just had a thought that could potentially improve vacuum performance
a great deal. What about some kind of TID (or does vacuum use CID?)
index? That would let vacuum visit only the pages it needs to visit.
Actually, I guess TID/CID wouldn't even need to be involved; the only
information needed is whether any tuples on a page have been marked
deleted. Something as simple as a bitmap with one bit per page could
work. Storing TID info might make vacuum somewhat more efficient, but
my guess is it's probably not worth the extra effort.
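
To make the bitmap idea concrete, here is a minimal sketch in Python
rather than backend C. Everything in it is hypothetical and only for
illustration: the DeadPageMap class, the toy heap (a list of pages, each
a list of (payload, is_dead) pairs), and the delete_tuple and vacuum
helpers. It ignores visibility rules and index cleanup, which a real
vacuum would have to handle.

```python
class DeadPageMap:
    """One bit per heap page; a set bit means the page may hold dead tuples."""

    def __init__(self, n_pages: int) -> None:
        self.n_pages = n_pages
        self.bits = bytearray((n_pages + 7) // 8)

    def mark(self, page: int) -> None:
        self.bits[page >> 3] |= 1 << (page & 7)

    def clear(self, page: int) -> None:
        self.bits[page >> 3] &= ~(1 << (page & 7)) & 0xFF

    def is_marked(self, page: int) -> bool:
        return bool(self.bits[page >> 3] & (1 << (page & 7)))

    def marked_pages(self):
        """Yield the page numbers whose bit is set, in ascending order."""
        for page in range(self.n_pages):
            if self.is_marked(page):
                yield page


def delete_tuple(heap, page_map, page_no, slot):
    """Mark one tuple dead (toy model) and flag its page in the bitmap."""
    payload, _ = heap[page_no][slot]
    heap[page_no][slot] = (payload, True)
    page_map.mark(page_no)


def vacuum(heap, page_map):
    """Visit only flagged pages, drop their dead tuples, clear their bits."""
    visited = []
    for page_no in page_map.marked_pages():
        heap[page_no] = [t for t in heap[page_no] if not t[1]]
        page_map.clear(page_no)
        visited.append(page_no)
    return visited
```

With a three-page heap and a single delete on page 1, vacuum() touches
only page 1 and leaves the other two pages unread.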

This might not help much for tables that see a lot of random update
activity, since dead tuples would end up on most pages anyway, but I
think it would be very useful for large tables where pages with dead
tuples are likely to be a small percentage of the total number of pages.
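
A rough back-of-the-envelope calculation shows why a bitmap is cheap for
the large-table case. The 8192-byte page size is PostgreSQL's default
block size; the table size and the fraction of pages with dead tuples
used below are made-up numbers, and bitmap_cost is a hypothetical helper.

```python
PAGE_SIZE = 8192  # PostgreSQL's default block size (a build-time option)


def bitmap_cost(table_bytes: int, dead_page_fraction: float):
    """Return (total pages, bitmap size in bytes, pages vacuum must visit)."""
    n_pages = -(-table_bytes // PAGE_SIZE)   # ceiling division
    bitmap_bytes = (n_pages + 7) // 8        # one bit per page
    pages_to_visit = round(n_pages * dead_page_fraction)
    return n_pages, bitmap_bytes, pages_to_visit


# Hypothetical example: a 100 GiB table with dead tuples on 2% of its pages.
n_pages, bitmap_bytes, pages_to_visit = bitmap_cost(100 * 2**30, 0.02)
```

For that example the table has 13,107,200 pages, the bitmap takes
1,638,400 bytes (about 1.6 MB), and vacuum would need to read 262,144
pages instead of all of them.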

Maintaining this information on a per-transaction basis might prove
difficult to do without causing concurrency issues. Luckily, I think
it could probably be maintained in the background without much
difficulty.

One possibility is to check for dead tuples as pages are written to disk
(actually, by definition, there would have to be dead tuples at that
point, I would think). If memory serves, writing these pages is now
handled by a background process, so this shouldn't cause contention
issues.
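
As a sketch of the write-time approach, under the same toy model: each
dirty page is a list of (payload, is_dead) pairs, and the bitmap is a
plain Python integer. The flush_dirty_pages function is hypothetical and
is not how the backend writes buffers. It scans each page rather than
assuming every dirty page has dead tuples, since a page dirtied only by
inserts would have none.

```python
def flush_dirty_pages(dirty_pages, dead_page_bits):
    """Update the dead-page bitmap while 'writing out' dirty pages.

    dirty_pages: dict mapping page number -> list of (payload, is_dead)
    dead_page_bits: int used as a bitmap, bit N set = page N has dead tuples
    Returns the updated bitmap.
    """
    for page_no, tuples in dirty_pages.items():
        if any(is_dead for _, is_dead in tuples):
            dead_page_bits |= 1 << page_no     # flag page for vacuum
        else:
            dead_page_bits &= ~(1 << page_no)  # nothing to clean up here
        # A real writer would write the page to disk at this point.
    return dead_page_bits


# Page 0 was flagged earlier but is now clean; page 2 has one dead tuple.
dirty = {0: [("a", False)], 2: [("b", True), ("c", False)]}
bits = flush_dirty_pages(dirty, 0b001)
```

Because the bitmap is only touched by whatever process writes the page,
ordinary transactions never have to update it themselves.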
--
Jim C. Nasby, Database Consultant decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"