Re: Very long deletion time on a 200 GB database - Mailing list pgsql-performance

From Tom Lane
Subject Re: Very long deletion time on a 200 GB database
Date
Msg-id 3023.1330021832@sss.pgh.pa.us
Whole thread Raw
In response to Re: Very long deletion time on a 200 GB database  (Greg Spiegelberg <gspiegelberg@gmail.com>)
List pgsql-performance
Greg Spiegelberg <gspiegelberg@gmail.com> writes:
> I used LOCK simply because if a VACUUM FULL x; slipped in between the
> SELECT and the DELETE the ctid's could conceivably change.

VACUUM FULL can't "slip in" there, because you'd have AccessShareLock
just from the SELECT.  The real problem goes like this:

    1. You SELECT some ctid and save it in the other table.
    2. Somebody else updates or deletes that row.
    3. Plain VACUUM comes along and frees the dead TID.
    4. Somebody else (maybe not same somebody as #2) inserts a new
       row at that TID position.
    5. You DELETE that TID.  Ooops.

So you might say "okay, the point of the lock is to block plain vacuum,
not vacuum full".  I'm still a bit worried about whether the technique
is entirely safe, though, because of page pruning which can happen
anyway.  What this really boils down to is: how sure are you that no
other userland activity is going to update or delete any of the targeted
rows between the SELECT INTO and the DELETE?  If you're sure, then this
is safe without the extra lock.  Otherwise, I wouldn't trust it.

It might be worth having the SELECT that creates the temp table be a
SELECT FOR UPDATE on the target table, so as to ensure you've locked
down the targeted rows against anybody else.  This is not free though,
as it'll mean extra writes of all the modified tuples.

            regards, tom lane

pgsql-performance by date:

Previous
From: Greg Spiegelberg
Date:
Subject: Re: Very long deletion time on a 200 GB database
Next
From: Alessandro Gagliardi
Date:
Subject: Re: set autovacuum=off