The vacuum-ignore-vacuum patch - Mailing list pgsql-patches

From Alvaro Herrera
Subject The vacuum-ignore-vacuum patch
Date
Msg-id 20060711210127.GA8463@surnet.cl
List pgsql-patches
Hi,

Hannu Krosing asked me about his patch that makes vacuum ignore other
transactions that are running VACUUM LAZY.  I attach a version of the
patch updated to the current sources.

Just as a reminder of what this is about: the point of the patch is to
be able to run more than one VACUUM LAZY simultaneously and not have them
interfere with each other.  For example, assume you have a database
with two tables, one very big and another very small but with a high
update rate.  One usually wants to vacuum the small one very frequently
in order to keep the number of dead tuples low.  But if one starts to
vacuum the big table, it will take a long time, during which the vacuums
applied to the smaller table won't be able to reclaim any tuples, because
they will think the big vacuum's transaction may still want to read some
of the tuples that they are trying to remove.

We know this is not so -- a VACUUM can only be run in a standalone
transaction, and it only checks the one table it's vacuuming.  Thus we
can optimize the vacuuming so that if the only thing that's holding the
tuples undeletable is another big vacuum operation, we ignore it and
delete the tuples anyway.
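
To make the idea concrete, here is a toy model in Python (not the patch
itself, which is C code in the backend).  It assumes each running
transaction advertises an xmin and that a vacuum may only remove tuples
that died before the oldest xmin it has to respect.  The names Xact,
is_lazy_vacuum and oldest_xmin are invented for this sketch, and XID
wraparound is ignored.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Xact:
    """Hypothetical stand-in for one running transaction."""
    xid: int
    xmin: int
    is_lazy_vacuum: bool = False


def oldest_xmin(running: Iterable[Xact], my_xid: int,
                ignore_lazy_vacuum: bool) -> int:
    """Return the horizon below which dead tuples may be removed.

    With ignore_lazy_vacuum set, transactions that are themselves
    running a lazy vacuum do not hold the horizon back.
    """
    horizon = my_xid
    for xact in running:
        if ignore_lazy_vacuum and xact.is_lazy_vacuum:
            continue  # a lazy vacuum only looks at its own table
        horizon = min(horizon, xact.xmin)
    return horizon


running = [
    Xact(xid=100, xmin=100, is_lazy_vacuum=True),  # long vacuum of the big table
    Xact(xid=500, xmin=480),                       # ordinary transaction
]
old_horizon = oldest_xmin(running, my_xid=510, ignore_lazy_vacuum=False)
new_horizon = oldest_xmin(running, my_xid=510, ignore_lazy_vacuum=True)
```

In this made-up scenario the vacuum of the small table is held back to
the big vacuum's xmin without the optimization, and only to the ordinary
transaction's xmin with it.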

One exception is that we can't do that with full vacuums.  The reason is
that full vacuum may want to run user-defined functions to be able to
index the tuples it moves.  This isn't a problem normally, except in the
case where the function tries to scan some other table: if we ignored
that transaction, then another lazy vacuum might delete tuples from that
table that we need to see.
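
The exception can be sketched the same way.  This is again a toy Python
model with invented names (Kind, can_ignore, removal_horizon), not code
from the patch: only transactions known to be running a lazy vacuum are
skipped, while a full vacuum is treated like any other transaction
because it may scan other tables through user-defined index functions.

```python
from enum import Enum
from typing import List, Tuple


class Kind(Enum):
    NORMAL = "normal"
    LAZY_VACUUM = "lazy vacuum"
    FULL_VACUUM = "full vacuum"


def can_ignore(kind: Kind) -> bool:
    """Only a lazy vacuum is safe to ignore; a full vacuum is not."""
    return kind is Kind.LAZY_VACUUM


def removal_horizon(my_xid: int, others: List[Tuple[Kind, int]]) -> int:
    """Oldest xmin among transactions we must respect (toy model)."""
    respected = [xmin for kind, xmin in others if not can_ignore(kind)]
    return min([my_xid] + respected)


# A long-running lazy vacuum does not hold the horizon back...
lazy_case = removal_horizon(510, [(Kind.LAZY_VACUUM, 100)])
# ...but a long-running full vacuum still does.
full_case = removal_horizon(510, [(Kind.FULL_VACUUM, 100)])
```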

In a previous version of the patch, there was a note somewhere that made
the code not ignore lazy vacuums in the case where we were running
database-wide vacuums.  The reason was that the value we computed was
also used as the truncate point for pg_clog; thus if we ignored that
transaction, the truncate point could be further ahead than the vacuum's
own transaction ID, so the clog page for the vacuum transaction could be
gone and it wouldn't be able to commit.  This is no longer the case,
because with
the patch I committed yesterday, the clog truncation point is calculated
differently and thus we don't need to take special care about this.



--
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"Uno combate cuando es necesario... ¡no cuando está de humor!
El humor es para el ganado, o para hacer el amor, o para tocar el
baliset.  No para combatir."  (Gurney Halleck)

