VACUUM FULL FREEZE is unsafe - Mailing list pgsql-hackers

From Tom Lane
Subject VACUUM FULL FREEZE is unsafe
Date
Msg-id 12179.1101591711@sss.pgh.pa.us
Whole thread Raw
Responses Re: VACUUM FULL FREEZE is unsafe  (Thomas F.O'Connell <tfo@sitening.com>)
List pgsql-hackers
The point of VACUUM FREEZE is to ensure that there are no tuples
present in the database whose commit status depends on "normal" XIDs.
Without this guarantee, cloning template0 might stop working once
the relevant part of pg_clog has been pruned.

If one combines freezing with moving tuples across pages (ie,
VACUUM FULL FREEZE), then the commit status of moved tuples may
depend on the vacuum's own XID (stored in XVAC).  To maintain the
freeze safety guarantee, we'd want to be sure that upon successful
completion of the VACUUM, there are no moved tuples that haven't had
their status hint bits updated to XMIN_COMMITTED or XMIN_INVALID.

After some digging through vacuum.c, I have convinced myself that
this does occur for all tuples moved down from the end of the table.
update_hint_bits() takes care of all MOVED_IN rows; MOVED_OFF rows
in the page that becomes the physically last page of the table are
fixed near the bottom of repair_frag(); and MOVED_OFF rows in
pages after that don't matter because we'll truncate those pages
away entirely.

Unfortunately this still leaves one case uncovered, which is a tuple
that is moved because it is part of an update chain.  If an original
tuple in an update chain is in a page that is below the new end of
the table, and was not a move target page (eg because it had no free
space), then that tuple will never be visited to change its state from
MOVED_OFF to XMIN_INVALID.

This doesn't break initdb, because there will be no update-chain cases
since no other transactions can be running.  But it poses a nasty hazard
for anyone who is updating and re-freezing a template database during
normal operations (as for example in following the manual bug fix
procedures we had to recommend for some of the 7.4 dot releases).

Also, even though I don't see any failure cases for initdb, it seems
awfully risky to assume that this is all going to work 100%; and if
initdb did leave any improperly frozen tuples behind, it's quite likely
we'd not notice the error until the code got into the field.

ISTM that the safer way to handle this is VACUUM FULL (to compact)
and then VACUUM FREEZE (to freeze).  It's much clearer that lazy VACUUM
can handle freezing reliably, because it never tries to move tuples
around.

Just doing this in initdb is a one-liner change, but I'm wondering if we
ought to enforce that FULL and FREEZE not be specified at the same time,
so that people couldn't risk such a problem in manual freezing of
template databases.
        regards, tom lane


pgsql-hackers by date:

Previous
From: Patrick B Kelly
Date:
Subject: Re: Bitmap index
Next
From: "Dave Page"
Date:
Subject: Re: [pgsql-hackers-win32] [BUGS] pg_autovacuum in 8beta-dev3 small bug