Slow to delete TOAST table data - Mailing list pgsql-general

From Glen Brown
Subject Slow to delete TOAST table data
Date
Msg-id fb9f1c030810211054nd5cfe8fy61767ff076d3a4fb@mail.gmail.com
Whole thread Raw
List pgsql-general
Hello, I am in need of some help figuring out what is going on with this query. I am cleaning up some data. This data set is about 3.5 million records and I am trying to purge out 400k of them. I was able to remove this data quickly from some of the related tables that had no TOAST data associated with them but when it came to remove the data from the main table that has a lot of binary data stored and that has been TOASTed is is taking a long time.

This is what I have done. I selected into a temporary table the id's that I wanted to remove. Then:

delete from raw as t1
 using temp_raw_ids t2
where t1.raw_id = t2.raw_id;

to remove the data.  There is no index on this temp table because it is a one column wide table and that didn't seem to impact the delete from the first 5 tables but the last delete is still running after 24 hours and has only processed about 100k of the records. (The first 5 were done in less then 1 hour, total.)

Here's the explain:
"Hash Join  (cost=7445.63..300049.01 rows=276517 width=6)"
"  Hash Cond: (t1.raw_id = t2.raw_id)"
"  ->  Seq Scan on raw t1  (cost=0.00..246983.09 rows=3428409 width=14)"
"  ->  Hash  (cost=3989.17..3989.17 rows=276517 width=8)"
"        ->  Seq Scan on temp_raw_ids t2  (cost=0.00..3989.17 rows=276517 width=8)"

What can I look at to determine; first that this process is not just thrashing but actually accomplishing something, and second what is keeping this from being more efficient? The data set has been vacuumed and analyzed before starting this process.

thanks
-glen

--
Glen Brown

pgsql-general by date:

Previous
From: Ivan Sergio Borgonovo
Date:
Subject: tsearch2: setting weights on tsquery plainto_tsquery with | or and weight
Next
From: "Rainer Zaiss"
Date:
Subject: text array accumulate to multidimensional text array