Re: Deleting millions of rows - Mailing list pgsql-performance

From Robert Haas
Subject Re: Deleting millions of rows
Msg-id 603c8f070902040559ld4661b0p3e76a51ae1c3b618@mail.gmail.com
In response to Re: Deleting millions of rows  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-performance
On Wed, Feb 4, 2009 at 7:35 AM, Gregory Stark <stark@enterprisedb.com> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>
>> That's good if you're deleting most or all of the parent table, but
>> what if you're deleting 100,000 values from a 10,000,000 row table?
>> In that case maybe I'm better off inserting all of the deleted keys
>> into a side table and doing a merge or hash join between the side
>> table and the child table...
>
> It would be neat if we could feed the queued trigger tests into a plan node
> like a Materialize and use the planner to determine which type of plan to
> generate.

Yes, definitely.  If it could be built as a general facility it would
be good for a lot of other things too.  Imagine that from within a
statement-level trigger you had magical tables called OLD_TUPLES and
NEW_TUPLES, analogous to OLD and NEW, but holding the whole set of
affected rows.  I
can't tell you how many problems I could solve with this type of
facility...
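
For concreteness, here's roughly what I'm imagining, borrowing the
transition-table syntax from the SQL standard (none of this exists in
PostgreSQL today, and all the names are made up):

CREATE FUNCTION log_deleted() RETURNS trigger AS $$
BEGIN
    -- Hypothetical: old_tuples would be the full set of rows
    -- removed by the triggering statement, exposed as a table.
    RAISE NOTICE 'deleted % rows', (SELECT count(*) FROM old_tuples);
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER parent_delete_stats
    AFTER DELETE ON parent
    REFERENCING OLD TABLE AS old_tuples   -- SQL-standard, not implemented
    FOR EACH STATEMENT
    EXECUTE PROCEDURE log_deleted();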

What I think makes it a little extra-difficult is that even if you had
this, you still can't express what you want to plan as a single query.
You can either join the foreign key relation against OLD_TUPLES and
delete everything that matches, or you can join the foreign key
relation against the remaining table contents and throw away
everything that doesn't match.
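
To make those two shapes concrete (hypothetical schema:
child.parent_id references parent.id, and old_tuples holds the
deleted parent rows):

-- Option 1: join the deleted keys against the child table and
-- delete everything that matches.
DELETE FROM child
WHERE parent_id IN (SELECT id FROM old_tuples);

-- Option 2: join the child table against what's left of the parent
-- and throw away everything that no longer matches.
DELETE FROM child
WHERE NOT EXISTS
    (SELECT 1 FROM parent p WHERE p.id = child.parent_id);

One is a semi-join against the deleted set, the other an anti-join
against the survivors; which one wins depends on the relative sizes,
which is exactly why you'd want the planner making the call.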

...Robert
