Re: [PERFORM] DELETE taking too much memory - Mailing list pgsql-general

From Jose Ildefonso Camargo Tolosa
Subject Re: [PERFORM] DELETE taking too much memory
Date
Msg-id CAETJ_S_e9qbMpK+vQtsDfS7NRJ_jap8-xTcm51Sxs+Hs-9ZYQQ@mail.gmail.com
Whole thread Raw
In response to Re: DELETE taking too much memory  (Dean Rasheed <dean.a.rasheed@gmail.com>)
List pgsql-general


On Fri, Jul 8, 2011 at 4:35 AM, Dean Rasheed <dean.a.rasheed@gmail.com> wrote:
> On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
>> Hi,
>>
>> I have a delete query taking 7.2G of ram (and counting) but I do not
>> understant why so much memory is necessary. The server has 12G, and
>> I'm afraid it'll go into swap. Using postgres 8.3.14.
>>
>> I'm purging some old data from table t1, which should cascade-delete
>> referencing rows in t2. Here's an anonymized rundown :
>>
>> # explain delete from t1 where t1id in (select t1id from t2 where
>> foo=0 and bar < '20101101');

It looks as though you're hitting one of the known issues with
PostgreSQL and FKs. The FK constraint checks and CASCADE actions are
implemented using AFTER triggers, which are queued up during the query
to be executed at the end. For very large queries, this queue of
pending triggers can become very large, using up all available memory.

There's a TODO item to try to fix this for a future version of
PostgreSQL (maybe I'll have another go at it for 9.2), but at the
moment all versions of PostgreSQL suffer from this problem.

The simplest work-around for you might be to break your deletes up
into smaller chunks, say 100k or 1M rows at a time, eg:

delete from t1 where t1id in (select t1id from t2 where foo=0 and bar
< '20101101' limit 100000);

I'd like to comment here.... I had serious performance issues with a similar query (planner did horrible things), not sure if planner will do the same dumb thing it did for me, my query was against the same table (ie, t1=t2).  I had this query:

delete from t1 where ctid in (select ctid from t1 where created_at<'20101231' limit 10000);   <--- this was slooooow.  Changed to:

delete from t1 where ctid = any(array(select ctid from t1 where created_at<'20101231' limit 10000));   <--- a lot faster.

So... will the same principle work here?, doing this?:

delete from t1 where t1id = any(array(select t1id from t2 where foo=0 and bar
< '20101101' limit 100000));  <-- would this query be faster then original one?

 

Regards,
Dean

--
Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-performance

pgsql-general by date:

Previous
From: Claudio Freire
Date:
Subject: Re: [PERFORM] DELETE taking too much memory
Next
From: Jeff Davis
Date:
Subject: Re: [HACKERS] Creating temp tables inside read only transactions