Re: DELETE taking too much memory - Mailing list pgsql-general

From Guillaume Lelarge
Subject Re: DELETE taking too much memory
Date
Msg-id 1310070406.2046.7.camel@laptop
Whole thread Raw
In response to DELETE taking too much memory  (vincent dephily <vincent.dephily@mobile-devices.fr>)
Responses Re: DELETE taking too much memory  (Dean Rasheed <dean.a.rasheed@gmail.com>)
Re: DELETE taking too much memory  (Vincent de Phily <vincent.dephily@mobile-devices.fr>)
List pgsql-general
On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
> Hi,
>
> I have a delete query taking 7.2G of ram (and counting) but I do not
> understant why so much memory is necessary. The server has 12G, and
> I'm afraid it'll go into swap. Using postgres 8.3.14.
>
> I'm purging some old data from table t1, which should cascade-delete
> referencing rows in t2. Here's an anonymized rundown :
>
>
> # \d t1
>                              Table "public.t1"
>   Column   |            Type             |             Modifiers
> -----------+-----------------------------+---------------------------------
>  t1id      | integer                     | not null default
> nextval('t1_t1id_seq'::regclass)
> (...snip...)
> Indexes:
>     "message_pkey" PRIMARY KEY, btree (id)
> (...snip...)
>
> # \d t2
>                                Table "public.t2"
>      Column      |            Type             |        Modifiers
> -----------------+-----------------------------+-----------------------------
>  t2id            | integer                     | not null default
> nextval('t2_t2id_seq'::regclass)
>  t1id            | integer                     | not null
>  foo             | integer                     | not null
>  bar             | timestamp without time zone | not null default now()
> Indexes:
>     "t2_pkey" PRIMARY KEY, btree (t2id)
>     "t2_bar_key" btree (bar)
>     "t2_t1id_key" btree (t1id)
> Foreign-key constraints:
>     "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE
> RESTRICT ON DELETE CASCADE
>
> # explain delete from t1 where t1id in (select t1id from t2 where
> foo=0 and bar < '20101101');
>                                QUERY PLAN
> -----------------------------------------------------------------------------
>  Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
>    ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849 width=4)
>          ->  Index Scan using t2_bar_key on t2  (cost=0.00..5035501.50
> rows=21296354 width=4)
>                Index Cond: (bar < '2010-11-01 00:00:00'::timestamp
> without time zone)
>                Filter: (foo = 0)
>    ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1 width=10)
>          Index Cond: (t1.t1id = t2.t1id)
> (7 rows)
>
>
> Note that the estimate of 30849 rows is way off : there should be
> around 55M rows deleted from t1, and 2-3 times as much from t2.
>
> When looking at the plan, I can easily imagine that data gets
> accumulated below the nestedloop (thus using all that memory), but why
> isn't each entry freed once one row has been deleted from t1 ? That
> entry isn't going to be found again in t1 or in t2, so why keep it
> around ?
>
> Is there a better way to write this query ? Would postgres 8.4/9.0
> handle things better ?
>

Do you have any DELETE triggers in t1 and/or t2?


--
Guillaume
  http://blog.guillaume.lelarge.info
  http://www.dalibo.com


pgsql-general by date:

Previous
From: Guillaume Lelarge
Date:
Subject: Re: Creating temp tables inside read only transactions
Next
From: "David Johnston"
Date:
Subject: Re: Add Foreign Keys To Table