Re: Very long deletion time on a 200 GB database - Mailing list pgsql-performance

From Greg Spiegelberg
Subject Re: Very long deletion time on a 200 GB database
Date
Msg-id CAEtnbpUZkdg-gH_AcbTMG41yN+Y0ECXdtnGTdf232B6T3v_1SA@mail.gmail.com
Whole thread Raw
In response to Re: Very long deletion time on a 200 GB database  ("Reuven M. Lerner" <reuven@lerner.co.il>)
Responses Re: Very long deletion time on a 200 GB database  (Shaun Thomas <sthomas@peak6.com>)
List pgsql-performance
On Thu, Feb 23, 2012 at 8:25 AM, Reuven M. Lerner <reuven@lerner.co.il> wrote:

I've suggested something similar, but was told that we have limited time to execute the DELETE, and that doing it in stages might not be possible.


Just so happens I had this exact problem last week on a rather large table.
 * DELETE FROM x WHERE id IN (SELECT id FROM y ...) was horrible at best.
 * DELETE FROM x USING y WHERE ... was nearly as bad
Both of the above were taking hours and looking more like it would stretch into days.

What I discovered however was this little technique that I'm sure all  the Pg gods will smote me for using however it worked for me.

BEGIN;
LOCK x IN SHARE UPDATE EXCLUSIVE;  -- to prevent VACUUM's
SELECT x.ctid INTO TEMPORARY TABLE recs_to_delete FROM x,y WHERE x.id=y.id;
DELETE FROM x USING recs_to_delete r WHERE r.ctid=x.ctid;
COMMIT;

I know there are perils in using ctid but with the LOCK it should be safe.  This transaction took perhaps 30 minutes and removed 100k rows and once the table was VACUUM'd afterward it freed up close to 20 GB on the file system.

HTH
-Greg

pgsql-performance by date:

Previous
From: Thom Brown
Date:
Subject: Re: set autovacuum=off
Next
From: Alessandro Gagliardi
Date:
Subject: Re: set autovacuum=off