Re: Piggybacking vacuum I/O - Mailing list pgsql-hackers

From: Jim C. Nasby
Subject: Re: Piggybacking vacuum I/O
Msg-id: 20070122224945.GR64372@nasby.net
In response to: Piggybacking vacuum I/O (Heikki Linnakangas <heikki@enterprisedb.com>)
List: pgsql-hackers
On Mon, Jan 22, 2007 at 02:51:47PM +0000, Heikki Linnakangas wrote:
> I've been looking at the way we do vacuums.
> 
> The fundamental performance issue is that a vacuum generates 
> nheapblocks+nindexblocks+ndirtyblocks I/Os. Vacuum cost delay helps to 
> spread the cost like part payment, but the total is the same. In an I/O 
> bound system, the extra I/O directly leads to less throughput.
> 
> Therefore, we need to do less I/O. Dead space map helps by allowing us 
> to skip blocks that don't need vacuuming, reducing the # of I/Os to 
> 2*ndirtyblocks+nindexblocks. That's great, but it doesn't help us if the 
> dead tuples are spread uniformly.
> 
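To make the arithmetic concrete, here is a small Python sketch; the block counts are invented for illustration, not taken from the post:

```python
# Hypothetical table sizes; the counts are invented for illustration.
nheapblocks = 100_000   # total heap pages
nindexblocks = 30_000   # total index pages
ndirtyblocks = 5_000    # heap pages that contain dead tuples

# Plain vacuum: scan the whole heap, scan the indexes, then
# rewrite the dirty heap pages.
plain_ios = nheapblocks + nindexblocks + ndirtyblocks

# With a dead space map: read and rewrite only the dirty heap
# pages, plus the full index scan.
dsm_ios = 2 * ndirtyblocks + nindexblocks

print(plain_ios, dsm_ios)  # 135000 40000
```

With dead tuples concentrated on 5% of the pages, the dead space map cuts the I/O to well under a third; but as Heikki notes below, the win shrinks as the dead tuples spread out.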
> If we could piggyback the vacuum I/Os to the I/Os that we're doing 
> anyway, vacuum wouldn't ideally have to issue any I/O of its own. I've 
> tried to figure out a way to do that.
> 
> Vacuum is done in 3 phases:
> 
> 1. Scan heap
> 2. Vacuum index
> 3. Vacuum heap


> Instead of doing a sequential scan, we could perform the 1st phase by 
> watching the buffer pool, scanning blocks for dead tuples when they're 
> in memory and keeping track of which pages we've seen. When all pages 
> have been seen, the tid list is sorted and 1st phase is done.
> 
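A toy sketch of that piggybacked 1st phase, in Python rather than PostgreSQL's C; the hook name `on_page_enter` and the page representation are invented for illustration:

```python
# Toy model: instead of a sequential scan, hook page reads and collect
# dead-tuple TIDs until every page of the relation has been seen once.
class PiggybackHeapScan:
    def __init__(self, nblocks):
        self.pending = set(range(nblocks))  # pages not yet inspected
        self.dead_tids = []                 # (block, offset) pairs

    def on_page_enter(self, blockno, page):
        """Called whenever a heap page is read into the buffer pool.
        A page is modeled as a list of slots; None stands in for a
        dead tuple."""
        if blockno in self.pending:
            self.pending.discard(blockno)
            self.dead_tids.extend(
                (blockno, off) for off, tup in enumerate(page)
                if tup is None)

    @property
    def done(self):
        return not self.pending

    def finish(self):
        # All pages have been seen: sort the TID list, phase 1 is done.
        self.dead_tids.sort()
        return self.dead_tids
```

The real difficulty this glosses over is that cold pages may never enter the buffer pool on their own, so a real implementation would still need to fetch the stragglers eventually.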
> In theory, the index vacuum could also be done that way, but let's 
> assume for now that indexes would be scanned like they are currently.
> 
> The 3rd phase can be performed similarly to the 1st phase. Whenever a 
> page enters the buffer pool, we check the tid list and remove any 
> matching tuples from the page. When the list is empty, vacuum is complete.
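The 3rd phase described above could be sketched the same way; again this is illustrative Python, with invented names, not the actual buffer-manager interface:

```python
# Toy model of phase 3: keep the sorted TID list, and whenever a heap
# page enters the buffer pool, strip the listed dead tuples from it.
# Heap vacuum is complete once the TID list is drained.
def make_heap_vacuumer(dead_tids):
    # Group TIDs by block for O(1) lookup per page read.
    by_block = {}
    for blk, off in dead_tids:
        by_block.setdefault(blk, set()).add(off)

    def on_page_enter(blockno, page):
        """Remove any listed dead tuples from the page just read in.
        Returns True once the whole TID list is empty."""
        offsets = by_block.pop(blockno, None)
        if offsets:
            for off in offsets:
                page[off] = None  # mark the slot free (stand-in)
        return not by_block

    return on_page_enter
```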

Is there any real reason to demarcate the start and end of a vacuum? Why
not just make it a continuous process? One possibility is to keep a
separate list of TIDs for each phase, though that could prove tricky
with multiple indexes.

> A variation of the scheme would be to keep scanning pages that are in 
> cache until the tid list reaches a predefined size, instead of keeping 
> track of which pages have already been seen. That would deal better with 
> tables with hot and cold spots, but it couldn't advance relfrozenxid 
> because there would be no guarantee that all pages are visited. Also, we 
> could start the 1st phase of the next vacuum while we're still in the 
> 3rd phase of the previous one.

What if we tracked freeze status on a per-page basis? Perhaps track the
minimum XID that's on each page. That would allow us to ensure that we
freeze pages that are approaching XID wrap.
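Per-page freeze tracking could look roughly like this sketch (Python, purely illustrative; the horizon constant is invented, and real XID comparisons wrap modulo 2^32, which is ignored here):

```python
# Hypothetical freeze horizon; the real vacuum_freeze_min_age default is
# far larger, and real XID arithmetic is circular (ignored in this toy).
FREEZE_HORIZON = 1_000

def pages_needing_freeze(page_min_xid, current_xid):
    """page_min_xid maps block number -> oldest XID on that page.
    Return the blocks whose oldest XID has aged past the horizon,
    so vacuum can visit (and freeze) just those pages."""
    return sorted(blk for blk, min_xid in page_min_xid.items()
                  if current_xid - min_xid >= FREEZE_HORIZON)
```

The point of the per-page minimum is that a single old tuple pins the whole page: once that page's minimum crosses the horizon, the page must be visited regardless of how recent its other tuples are.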
-- 
Jim Nasby                                            jim@nasby.net
EnterpriseDB      http://enterprisedb.com      512.569.9461 (cell)

