Thread: Most effective settings for deleting lots of data?

Most effective settings for deleting lots of data?

From
Cory Tucker
Date:
Hi, I am using postgres 9.3 and am preparing to migrate to 9.4.  As part of the migration, I was hoping to also delete a bunch of data that is no longer needed (100M+ rows across several tables).

I can fairly trivially delete the data by doing a simple statement like this:

DELETE FROM account WHERE id = 1234;

All the foreign keys are setup to have "ON DELETE CASCADE", so this will effectively drop the entire graph, which is what I want.  My desire is to do this as quickly as possible.  The first thing I have done is to disable autovacuuming on all of the largest tables so that the entire system doesn't lock up, like so:

ALTER TABLE my_data SET (autovacuum_enabled = false, toast.autovacuum_enabled = false);

I was wondering what other settings might be best be applied?  For instance, I was thinking that "work_mem" could be bumped up to some ridiculously large value if needed.  I have the benefit of being able to have exclusive access to the database at the time that this operation needs to happen so most options are on the table.

What settings would you recommend?  Also, it just occurred to me that I should try to disable/drop all indexes (especially since they will be recreated) later so that those are not updated in the process.

thanks in advance,
--Cory

Re: Most effective settings for deleting lots of data?

From
Alan Hodgson
Date:
On Wednesday, August 26, 2015 08:25:02 PM Cory Tucker wrote:
> What settings would you recommend?  Also, it just occurred to me that I
> should try to disable/drop all indexes (especially since they will be
> recreated) later so that those are not updated in the process.

Don't drop the indexes your foreign keys use to find cascading deletes.



Re: Most effective settings for deleting lots of data?

From
Rob Sargent
Date:
On 08/26/2015 02:34 PM, Alan Hodgson wrote:
On Wednesday, August 26, 2015 08:25:02 PM Cory Tucker wrote:
What settings would you recommend?  Also, it just occurred to me that I
should try to disable/drop all indexes (especially since they will be
recreated) later so that those are not updated in the process.
Don't drop the indexes your foreign keys use to find cascading deletes.



Or do drop indexes and walk up your dependency graph manually.