Re: Attempting to delete excess rows from table with BATCH DELETE - Mailing list pgsql-general

From Ron Johnson
Subject Re: Attempting to delete excess rows from table with BATCH DELETE
Date
Msg-id CANzqJaBcHnScWQjsVLoNg3OiQqWh3RcNEyNxwFV65Ps6PErQNw@mail.gmail.com
In response to Attempting to delete excess rows from table with BATCH DELETE  (Gus Spier <gus.spier@gmail.com>)
List pgsql-general
On Tue, Jan 27, 2026 at 10:22 PM Gus Spier <gus.spier@gmail.com> wrote:
Environment: AWS RDS Aurora for PostgreSQL 15 hosting tables that
support scientific research. The development environment predominantly
uses JPA with Hibernate.

Years of neglect have allowed mission tables to accumulate hundreds of
millions of rows of excess data. The developers and the customer
decided we must delete all rows older than 75 days. Table partitioning
was briefly considered but discarded because of the effort needed to
refactor the codebase.

I proposed the straightforward course of action: delete in batches
from the victim tables.

The solution seemed obvious:

For candidate tables:
- Determine the timestamp column with the most relevant value and call it
the Discriminator.
- Delete any row whose Discriminator value is more than 60 days old,
with a LIMIT of 50000.
- Get the number of rows deleted in the batch by querying GET
DIAGNOSTICS, and set that value aside for later use.
- COMMIT the delete transaction
- Loop to the top and continue deleting batches until there are no
more rows older than 60 days.

- Before running a test, I ensure that the Discriminator column is indexed
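
The steps above can be sketched as a PL/pgSQL procedure. (Table and column names here are placeholders, and note that PostgreSQL's DELETE has no LIMIT clause, so the batch size has to be bounded through a ctid subquery.)

```sql
CREATE PROCEDURE purge_old_rows()
LANGUAGE plpgsql
AS $$
DECLARE
    batch integer;
    total bigint := 0;
BEGIN
    LOOP
        -- DELETE itself takes no LIMIT; bound the batch via ctid
        DELETE FROM victim_table
        WHERE ctid IN (
            SELECT ctid FROM victim_table
            WHERE discriminator < now() - INTERVAL '60 days'
            LIMIT 50000);
        GET DIAGNOSTICS batch = ROW_COUNT;
        total := total + batch;
        EXIT WHEN batch = 0;
        -- only valid when the procedure is CALLed outside an
        -- explicit transaction block (see the error discussed below)
        COMMIT;
    END LOOP;
    RAISE NOTICE 'deleted % rows', total;
END;
$$;
```

Running this as `CALL purge_old_rows();` from plain psql (autocommit) works; invoking it inside BEGIN ... END, or from a framework that wraps every call in a transaction, raises exactly the error described next.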

I wrote a procedure to accomplish all this work, but it persists in
returning an error to the effect that a COMMIT is not valid in a block
that tries to DELETE data.

Has anybody seen this before?

Yup, when putting the COMMIT in a DO block or procedure that was invoked inside an outer transaction block, which isn't allowed.
 
Is there a solution?

Loop using bash.  In a similar case, I first looked for the oldest day's data in the tables, then did something like this bash pseudo-code:
StopDate=$(date -d '60 days ago' +%Y-%m-%d)
DeleteDay=$1
export PGHOST=foo.example.com
export PGDATABASE=bar
while (( $(date -d "$DeleteDay" +%s) <= $(date -d "$StopDate" +%s) )); do
    psql -Xc "DELETE FROM blarge WHERE txn_date >= '$DeleteDay' AND txn_date < '$DeleteDay'::date + 1;"
    DeleteDay=$(date -d "$DeleteDay + 1 day" +%Y-%m-%d)
done
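
A loop like that can be dry-run without a live database by stubbing psql out with a shell function, so only the date arithmetic and loop boundaries are exercised (table and column names are placeholders; the epoch-seconds comparison is one way to get an inclusive date test in bash):

```shell
#!/usr/bin/env bash
# Dry-run of the day-by-day delete loop: psql is stubbed so no
# database is needed; the loop logic itself is unchanged.
psql() { echo "stub: psql $*"; }

StopDate=$(date -d '60 days ago' +%Y-%m-%d)
DeleteDay=$(date -d '63 days ago' +%Y-%m-%d)  # pretend the oldest data is 63 days old

count=0
while (( $(date -d "$DeleteDay" +%s) <= $(date -d "$StopDate" +%s) )); do
    psql -Xc "DELETE FROM blarge WHERE txn_date >= '$DeleteDay' AND txn_date < '$DeleteDay'::date + 1;"
    DeleteDay=$(date -d "$DeleteDay + 1 day" +%Y-%m-%d)
    count=$((count + 1))
done
echo "batches run: $count"   # one batch per day, 63..60 days ago inclusive
```

With those dates the loop runs four times (days 63, 62, 61, and 60 ago), then stops, which confirms the boundary condition before pointing the real script at production.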

Using that method, I developed a fast and automated monthly archive process which exported and then deleted from 120 tables.

--
Death to <Redacted>, and butter sauce.
Don't boil me, I'm still alive.
<Redacted> lobster!
