Re: Deleting more efficiently from large partitions - Mailing list pgsql-admin

From Greg Spiegelberg
Subject Re: Deleting more efficiently from large partitions
Date
Msg-id CAEtnbpWYqwSSmbZHPe5OKfx2zqEmTypv_4Y=QrFHdPKPfMKiOw@mail.gmail.com
Whole thread Raw
In response to Deleting more efficiently from large partitions  (Wells Oliver <wells.oliver@gmail.com>)
Responses Re: Deleting more efficiently from large partitions
List pgsql-admin
On Mon, Jun 15, 2020 at 7:39 PM Wells Oliver <wells.oliver@gmail.com> wrote:
Hi all. I have a partitioned table (by month from a date column), where each partition contains something like 400m rows.

Each partition is defined by a PK with a uuid and date field (the parent table is partitioned by range on the date), and two other columns.

In doing a delete for a specific date, e.g. DELETE FROM t WHERE date = '2019-09-01' AND uuid IN (SELECT uuid FROM temptable), it runs very efficiently.

I am trying to write a processing script that deletes for potentially multiple dates & uuid values, and it just takes hours, trying:

DELETE FROM t WHERE date = (SELECT DISTINCT date from temp) AND uuid IN (select uuid from tempuuds) -- no go, hours.

Tried USING, e.g. DELETE FROM t USING temp WHERE t.date = temp.date AND t.uuid = temp.uuid -- no go, hours.

I just can't delete from this table without an explicit date and a set of uuids using a WHERE IN approach, but I need to.I was thinking of making a plpgsql function or something that loops through dates and makes a more explicit DELETE statement, but I'm thinking there must be some better way using indexing or something.

Appreciate any tips.

Have you considered partitioning by day instead of month?  Could eliminate an index you may have on the date column.
How many days are in the many-days DELETE?  Could you simply wrap it in a transaction and do one DELETE per day?

You could potentially get better performance removing the JOIN/sub-SELECT using
DELETE FROM mytable WHERE date_col = ANY( ARRAY['2020-01-01', '2020-01-13']::date[] );

HTH
-Greg

pgsql-admin by date:

Previous
From: Greg Spiegelberg
Date:
Subject: Re: Upgrade streaming replication and log-shipping standby servers
Next
From: Victor Sudakov
Date:
Subject: Re: Upgrade streaming replication and log-shipping standby servers