Re: remove flatfiles.c - Mailing list pgsql-hackers

From Greg Stark
Subject Re: remove flatfiles.c
Msg-id 407d949e0909021230q40d520far51de9e63162f5861@mail.gmail.com
In response to Re: remove flatfiles.c  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On Wed, Sep 2, 2009 at 8:10 PM, Robert Haas<robertmhaas@gmail.com> wrote:
> I confess to being a little fuzzy on the details of how this
> implementation (seq-scanning the source table for live tuples) is
> different/better from the current VACUUM FULL implementation.  Can
> someone fill me in?


VACUUM FULL is a *lot* more complex.

It scans pages *backwards* from the end (which does wonderful things
on rotating media). It marks each live tuple it finds as "moved off",
finds a new place for it (using the free space map, I think?), inserts
the tuple on the new page, marks it "moved in", and updates the
indexes.

Then it commits the transaction but keeps the lock. Then it has to
vacuum all the indexes of the references to the old tuples at the end
of the table. I think it has to commit that too before it can finally
truncate the table.
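
To make the shape of that concrete, here is a toy sketch of the backwards
scan. It is only a model under stated assumptions, not the actual vacuum
code: the heap is a Python list of pages, each page is a list of slots, and
None means a free slot. The names find_free_slot and compact_backwards are
made up for illustration. It leaves out the index updates, the intermediate
commits, and the "moved off"/"moved in" flags entirely.

```python
def find_free_slot(pages, before):
    """Return (page, slot) of the first free slot on a page numbered below `before`."""
    for p in range(before):
        for s, tup in enumerate(pages[p]):
            if tup is None:
                return p, s
    return None


def compact_backwards(pages):
    """Toy model: walk pages from the end, move live tuples toward the front,
    then truncate trailing empty pages. Returns (tuple, from_page, to_page) moves."""
    moves = []
    out_of_space = False
    for src in range(len(pages) - 1, -1, -1):       # scan backwards from the end
        for slot in range(len(pages[src])):
            tup = pages[src][slot]
            if tup is None:
                continue
            dest = find_free_slot(pages, src)
            if dest is None:                        # nothing free earlier in the table
                out_of_space = True
                break
            dest_page, dest_slot = dest
            pages[dest_page][dest_slot] = tup       # new copy (the "moved in" side)
            pages[src][slot] = None                 # old copy (the "moved off" side)
            moves.append((tup, src, dest_page))
        if out_of_space:
            break
    while pages and all(t is None for t in pages[-1]):
        pages.pop()                                 # truncate empty pages at the end
    return moves
```

Even in this stripped-down form you can see the access pattern: reads start
at the last page and work toward the front, while writes land on the
earliest pages.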

The backwards scan is awful for rotating media. The reading from the
end and writing to the beginning is bad too, though hopefully the
cache can help that.

A lot of the complexity comes in from other parts of the system that
have to be aware of tuples that have been "moved off" or "moved in".
They have to be able to check whether the vacuum committed or not.

That reminds me there was another proposal to do an "online" vacuum
full, similar to our concurrent index builds: do no-op updates to tuples
at the end of the table, hopefully finding space for them earlier in
the table, wait until those transactions are no longer visible to
anyone else, and then truncate. (Actually, I think you could just not do
anything and let regular lazy vacuum do the truncate.) That might be a
good practical alternative for sites where copying their entire table
isn't practical.
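
A rough sketch of that idea, again as a toy model and not anything from the
PostgreSQL source: pages are lists of slots, each slot is either None or a
(value, state) pair, and the function names noop_update_tail and lazy_vacuum
are hypothetical. Calling lazy_vacuum later stands in for "wait until the
old versions are no longer visible to anyone".

```python
def first_free(pages, before):
    """Return (page, slot) of the first free slot on a page numbered below `before`."""
    for p in range(before):
        for s, tup in enumerate(pages[p]):
            if tup is None:
                return p, s
    return None


def noop_update_tail(pages, tail_start):
    """No-op update every live tuple on pages >= tail_start. The new version
    goes into free space earlier in the table; the old version stays behind
    as a dead tuple until a later vacuum removes it."""
    for src in range(tail_start, len(pages)):
        for slot, tup in enumerate(pages[src]):
            if tup is None or tup[1] == "dead":
                continue
            dest = first_free(pages, tail_start)
            if dest is None:
                return                              # no room earlier; give up
            dest_page, dest_slot = dest
            pages[dest_page][dest_slot] = (tup[0], "live")
            pages[src][slot] = (tup[0], "dead")


def lazy_vacuum(pages):
    """Remove dead tuples, then truncate trailing pages that are now empty."""
    for page in pages:
        for i, tup in enumerate(page):
            if tup is not None and tup[1] == "dead":
                page[i] = None
    while pages and all(t is None for t in pages[-1]):
        pages.pop()
```

The point of the split is that nothing special has to know about half-moved
tuples: the update is an ordinary update, and the table only shrinks once an
ordinary vacuum comes along afterwards.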


--
greg
http://mit.edu/~gsstark/resume.pdf

