Re: Feedback on getting rid of VACUUM FULL - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: Feedback on getting rid of VACUUM FULL
Date
Msg-id 1253135245.778.62.camel@hvost1700
Whole thread Raw
In response to Re: Feedback on getting rid of VACUUM FULL  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Wed, 2009-09-16 at 23:53 +0300, Heikki Linnakangas wrote:
> Hannu Krosing wrote:
> > On Wed, 2009-09-16 at 21:23 +0300, Heikki Linnakangas wrote:
> >> 2) Another utility that does something like UPDATE ... WHERE ctid > ? to
> >> move tuples to lower pages. It will be different from current VACUUM
> >> FULL in some ways. It won't require a table lock, for example, but it
> >> won't be able to move update chains as nicely. But it would be trivial
> >> to write one, so I think we should offer that as a contrib module.
> > 
> > I have not checked, but I suspect pg_reorg may already be doing
> > something similar http://pgfoundry.org/forum/forum.php?forum_id=1561
> 
> Hmm, AFAICT pg_reorg is much more complex, writing stuff to a temp table
> and swapping relfilenodes afterwards. More like the VACUUM REWRITE
> that's been discussed.
> 
> For the kicks, I looked at what it would take to write a utility like
> that. It turns out to be quite trivial, patch attached. It uses the same
> principle as VACUUM FULL, scans from the end, moving tuples to
> lower-numbered pages until it can't do it anymore. It requires a small
> change to heap_update(), to override the preference to store the new
> tuple on the same page as the old one, but other than that, it's all in
> the external module.

Exactly as I hoped :D

One thing that would be harder to do, and which CLUSTER currently does
is introducing empty space within pages, based on fillfactor.

Doing that would need a similar, though reversed strategy. But it is
probably not something that is often needed, as a an update on page with
no free space would eventually do almost the same.

> To test:
> 
> -- Create and populate test table
> CREATE TABLE foo (id int4 PRIMARY KEY);
> INSERT INTO foo SELECT a FROM generate_series(1,100000) a;
> 
> -- Delete a lot of tuples from the beginning. This creates the hole that
> we want to compact out.
> DELETE FROM foo WHERE id < 90000;
> 
> -- Vacuum to remove the dead tuples
> VACUUM VERBOSE foo;
> 
> -- Run the utility to "move" the tuples
> SELECT vacuumfull('foo');
> 
> -- Vacuum table again to remove the old tuple versions of the moved rows
> and truncate the file.
> 
> VACUUM VERBOSE foo;

Now, if you could just make vacuumfull('foo'); run in multiple
transactions (say one per N tuples moved, or even per N seconds spent)
to make it friendlier for OLTP workloads, which then dont have to wait
for the whole thing to finish in order to proceed with update of a moved
tuple (and also to deal with deadloks from trying to move an updated
tuple) then I'd claim we have a much better VACUUM FULL :)

-- 
Hannu Krosing   http://www.2ndQuadrant.com
PostgreSQL Scalability and Availability   Services, Consulting and Training




pgsql-hackers by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: Feedback on getting rid of VACUUM FULL
Next
From: Jeff Davis
Date:
Subject: Re: Feedback on getting rid of VACUUM FULL