Re: [GENERAL] DELETE taking too much memory

From: Vincent de Phily
Subject: Re: [GENERAL] DELETE taking too much memory
Date: ,
Msg-id: 1827997.DOOMnYSbox@moltowork
(view: Whole thread, Raw)
In response to: Re: [GENERAL] DELETE taking too much memory  (Guillaume Lelarge)
List: pgsql-performance

Tree view

DELETE taking too much memory  (vincent dephily, )
 Re: [GENERAL] DELETE taking too much memory  (Guillaume Lelarge, )
  Re: [GENERAL] DELETE taking too much memory  (Dean Rasheed, )
   Re: [GENERAL] DELETE taking too much memory  (Jose Ildefonso Camargo Tolosa, )
   Re: [GENERAL] DELETE taking too much memory  (Vincent de Phily, )
  Re: [GENERAL] DELETE taking too much memory  (Vincent de Phily, )
 Re: [GENERAL] DELETE taking too much memory  (Dean Rasheed, )
  Re: [GENERAL] DELETE taking too much memory  (Claudio Freire, )
 Re: DELETE taking too much memory  ("French, Martin", )
  Re: DELETE taking too much memory  (Vincent de Phily, )
 Re: DELETE taking too much memory  ("French, Martin", )
  Re: DELETE taking too much memory  (Vincent de Phily, )

On Thursday 07 July 2011 22:26:45 Guillaume Lelarge wrote:
> On Thu, 2011-07-07 at 15:34 +0200, vincent dephily wrote:
> > Hi,
> >
> > I have a delete query taking 7.2G of ram (and counting) but I do not
> > understant why so much memory is necessary. The server has 12G, and
> > I'm afraid it'll go into swap. Using postgres 8.3.14.
> >
> > I'm purging some old data from table t1, which should cascade-delete
> > referencing rows in t2. Here's an anonymized rundown :
> >
> >
> > # \d t1
> >
> >                              Table
> >                              "public.t1"
> >
> >   Column   |            Type             |             Modifiers
> >
> > -----------+-----------------------------+------------------------------
> > ---
> >
> >  t1id      | integer                     | not null default
> >
> > nextval('t1_t1id_seq'::regclass)
> > (...snip...)
> >
> > Indexes:
> >     "message_pkey" PRIMARY KEY, btree (id)
> >
> > (...snip...)
> >
> > # \d t2
> >
> >                                Table
> >                                "public.t
> >                                2"
> >
> >      Column      |            Type             |        Modifiers
> >
> > -----------------+-----------------------------+------------------------
> > -----
> >
> >  t2id            | integer                     | not null default
> >
> > nextval('t2_t2id_seq'::regclass)
> >
> >  t1id            | integer                     | not null
> >  foo             | integer                     | not null
> >  bar             | timestamp without time zone | not null default now()
> >
> > Indexes:
> >     "t2_pkey" PRIMARY KEY, btree (t2id)
> >     "t2_bar_key" btree (bar)
> >     "t2_t1id_key" btree (t1id)
> >
> > Foreign-key constraints:
> >     "t2_t1id_fkey" FOREIGN KEY (t1id) REFERENCES t1(t1id) ON UPDATE
> >
> > RESTRICT ON DELETE CASCADE
> >
> > # explain delete from t1 where t1id in (select t1id from t2 where
> > foo=0 and bar < '20101101');
> >
> >                                QUERY PLAN
> >
> > ------------------------------------------------------------------------
> > -----
> >
> >  Nested Loop  (cost=5088742.39..6705282.32 rows=30849 width=6)
> >
> >    ->  HashAggregate  (cost=5088742.39..5089050.88 rows=30849
> >    width=4)
> >
> >          ->  Index Scan using t2_bar_key on t2
> >          (cost=0.00..5035501.50
> >
> > rows=21296354 width=4)
> >
> >                Index Cond: (bar < '2010-11-01
> >                00:00:00'::timestamp
> >
> > without time zone)
> >
> >                Filter: (foo = 0)
> >
> >    ->  Index Scan using t1_pkey on t1  (cost=0.00..52.38 rows=1
> >    width=10)
> >
> >          Index Cond: (t1.t1id = t2.t1id)
> >
> > (7 rows)
> >
> >
> > Note that the estimate of 30849 rows is way off : there should be
> > around 55M rows deleted from t1, and 2-3 times as much from t2.
> >
> > When looking at the plan, I can easily imagine that data gets
> > accumulated below the nestedloop (thus using all that memory), but why
> > isn't each entry freed once one row has been deleted from t1 ? That
> > entry isn't going to be found again in t1 or in t2, so why keep it
> > around ?
> >
> > Is there a better way to write this query ? Would postgres 8.4/9.0
> > handle things better ?
>
> Do you have any DELETE triggers in t1 and/or t2?

No, there are triggers on insert/update to t1 which both insert into t2, but
no delete trigger. Deletions do cascade from t1 to t2 because of the foreign
key.
--
Vincent de Phily
Mobile Devices
+33 (0) 142 119 325
+353 (0) 85 710 6320

Warning
This message (and any associated files) is intended only for the use of its
intended recipient and may contain information that is confidential, subject
to copyright or constitutes a trade secret. If you are not the intended
recipient you are hereby notified that any dissemination, copying or
distribution of this message, or files associated with this message, is
strictly prohibited. If you have received this message in error, please
notify us immediately by replying to the message and deleting it from your
computer. Any views or opinions presented are solely those of the author
 and do not necessarily represent those of
the
company. Although the company has taken reasonable precautions to ensure no
viruses are present in this email, the company cannot accept responsibility
for any loss or damage arising from the use of this email or attachments.


pgsql-performance by date:

From: Pavel Stehule
Date:
Subject: Re: INSERT query times
From: Anish Kejariwal
Date:
Subject: Re: issue with query optimizer when joining two partitioned tables