Re: Deleting more efficiently from large partitions - Mailing list pgsql-admin

From Michaeldba@sqlexec.com
Subject Re: Deleting more efficiently from large partitions
Date
Msg-id 318DA482-E5EB-4368-946C-C27F3493E57A@sqlexec.com
Whole thread Raw
In response to Deleting more efficiently from large partitions  (Wells Oliver <wells.oliver@gmail.com>)
List pgsql-admin
What version of PG?
What does EXPLAIN show?

Sent from my iPad

On Jun 15, 2020, at 9:39 PM, Wells Oliver <wells.oliver@gmail.com> wrote:

Hi all. I have a partitioned table (by month from a date column), where each partition contains something like 400m rows.

Each partition is defined by a PK with a uuid and date field (the parent table is partitioned by range on the date), and two other columns.

In doing a delete for a specific date, e.g. DELETE FROM t WHERE date = '2019-09-01' AND uuid IN (SELECT uuid FROM temptable), it runs very efficiently.

I am trying to write a processing script that deletes for potentially multiple dates & uuid values, and it just takes hours, trying:

DELETE FROM t WHERE date = (SELECT DISTINCT date from temp) AND uuid IN (select uuid from tempuuds) -- no go, hours.

Tried USING, e.g. DELETE FROM t USING temp WHERE t.date = temp.date AND t.uuid = temp.uuid -- no go, hours.

I just can't delete from this table without an explicit date and a set of uuids using a WHERE IN approach, but I need to.I was thinking of making a plpgsql function or something that loops through dates and makes a more explicit DELETE statement, but I'm thinking there must be some better way using indexing or something.

Appreciate any tips.


--

pgsql-admin by date:

Previous
From: Keith
Date:
Subject: Re: block corruption on slave db.
Next
From: Greg Spiegelberg
Date:
Subject: Re: Upgrade streaming replication and log-shipping standby servers