Re: Help with large delete - Mailing list pgsql-general

From: Rob Sargent
Subject: Re: Help with large delete
Date:
Msg-id: 70cfeae1-bf2b-09c5-3f31-b708e4bc39e9@gmail.com
In response to: Help with large delete  (Perry Smith <pedz@easesoftware.com>)
List: pgsql-general
On 4/16/22 07:25, Perry Smith wrote:
Currently I have one table that mimics a file system.  Each entry has a parent_id and a base name, where parent_id references an id that must exist in the same table or be null, with cascade on delete.
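For reference, a minimal sketch of the kind of self-referencing schema described above (the table name "files" and the column names are assumptions, not taken from the original message):

    -- Hypothetical table mimicking a file system: each row points at its
    -- parent row in the same table; deleting a parent cascades to all
    -- of its descendants.
    CREATE TABLE files (
        id        bigserial PRIMARY KEY,
        parent_id bigint REFERENCES files (id) ON DELETE CASCADE,  -- null for a root entry
        basename  text NOT NULL
    );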

I’ve started a delete of a root entry with about 300,000 descendants.  The table currently has about 22M entries and I’m still adding about 1600 entries per minute.  Eventually entries will no longer be added in large volumes and the table will be mostly static.
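The delete in question is essentially of this form, with ON DELETE CASCADE fanning out to the roughly 300,000 descendants inside a single transaction (table and placeholder names are illustrative, not from the original message):

    -- Deleting one root entry; the cascade removes every descendant
    -- as part of the same statement.
    DELETE FROM files WHERE id = :root_id;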

I started the delete earlier from a terminal that got detached.  So I killed that process and started it again from a terminal less likely to get detached.

My question is basically: how can I make life easier for Postgres?  I believe (hope) the deletes will be few and far between, but they will happen from time to time.  In this case the cause was Dropbox; it’s a long story that isn’t really pertinent.  The point is that @#$% happens.

“What can I do” includes starting completely over if necessary.  I’ve only got about a week invested in this, and it’s just machine time at zero cost.  I could stop the other processes that are adding entries and let the delete finish if that would help, etc.

Thank you for your time,
Perry

I would try: 1) Find any nodes with disproportionately many descendants and deal with them separately.  You may have a gut feel for where these nodes are?  2) Start at least one level down: run a separate transaction for each entry directly under the root node.  Maybe go two levels down.  A rough sketch of both ideas is below.
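Both ideas, sketched against the hypothetical "files" table above (the recursive CTE and the per-child loop are one possible way to do it, not something spelled out in the reply):

    -- 1) Size up each immediate child of the doomed root, to spot subtrees
    --    with disproportionately many descendants.
    WITH RECURSIVE subtree AS (
        SELECT id, id AS top FROM files WHERE parent_id = :root_id
        UNION ALL
        SELECT f.id, s.top
        FROM files f
        JOIN subtree s ON f.parent_id = s.id
    )
    SELECT top, count(*) AS entries
    FROM subtree
    GROUP BY top
    ORDER BY entries DESC;

    -- 2) Instead of one huge delete of the root, delete one child subtree
    --    per transaction, driving the loop from a script or psql.
    BEGIN;
    DELETE FROM files WHERE id = :child_id;  -- one child of the root per iteration
    COMMIT;

Each iteration still cascades, but the transactions stay smaller, and an interruption only loses the subtree currently in flight.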
