Re: Proposal for background vacuum full/cluster - Mailing list pgsql-hackers

From Jim C. Nasby
Subject Re: Proposal for background vacuum full/cluster
Date
Msg-id 20050421044947.GD58835@decibel.org
Whole thread Raw
In response to Re: Proposal for background vacuum full/cluster  (Paul Tillotson <pntil@shentel.net>)
Responses Re: Proposal for background vacuum full/cluster
List pgsql-hackers
On Wed, Apr 20, 2005 at 08:10:23PM -0400, Paul Tillotson wrote:
> Jim C. Nasby wrote:
> 
> >I talked to a few people on IRC about this and they didn't think I was
> >nuts, so maybe this is something practical...
> >
> >In a nutshell, my idea is to use the normal transactional/XID code to
> >relocate tuples in the heap. Think of doing an UPDATE field=field if you
> >could tell update what page to put the new tuple on. 
> >
> Be careful not to fire UPDATE triggers on the tuple while doing so.

To clarify, I'm not suggesting this actually be coded as a bunch of
updates. I used that as an example only.

> Keep in mind that the transaction that does the update can't also vacuum 
> it's own tuples.  You'd have to end one transaction, then wait until 
> every transaction running while the updater ran finishes, then start the 
> transaction that vacuums.  Obviously your command would need to be able 
> to start and end transactions.  (Meaning that it can't be a user defined 
> function, and it probably can't be a normal self-contained command in 
> postgres.)

Yes. Note how I called it a background vacuum full/cluster.

> Are you thinking of coding this, or just suggesting it for others?  I 
> was thinking of coding something like this but found that I didn't 
> understand enough of the internals of how the vacuum command actually 
> works to be able to write this.  I'd be willing to devote perhaps a few 
> hours a week to it if you want to help me.

I certainly don't have enough knowledge right now to code this, but I'd
be willing to help any way I can.

> P. S.
> 
> The last time I thought about it, I decided that the best solution is 
> probably one that works just like vacuum full except that it scans the 
> table in reverse order.  It would do something like this:
> 
> - Wait for exclusive lock.
That's exactly what I want to avoid. The reality of cluster and vacuum
full is that many (if not most) installs can't use them because of how
they disrupt the system. I'd like a version that doesn't do that.

> - Start at the end of the table -- call this page I.
> - If page I is completely empty, shrink the heap and go to step 1 again.
>         (Page I is not empty now.)
> - *Scan forward in the table until you find a page that is empty.  (Call 
> it J)
>         If no such page is found, there is no more free space in the 
> table.  Exit.
> - Move the tuples from page I to page J.
> - Drop the exclusive lock. and go to step one. 

Same basic idea. I haven't gone into specific details because I want to
see how feasable it is. And since I can't code it myself the best I can
hope for is a TODO; and IMO I shouldn't try and tell whoever takes that
TODO how exactly to make this work.
-- 
Jim C. Nasby, Database Consultant               decibel@decibel.org 
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


pgsql-hackers by date:

Previous
From: "Jim C. Nasby"
Date:
Subject: Re: Proposal for background vacuum full/cluster
Next
From: Tom Lane
Date:
Subject: Re: Postgres: pg_hba.conf, md5, pg_shadow, encrypted passwords