Re: make bulk deletes faster? - Mailing list pgsql-performance

From Ang Chin Han
Subject Re: make bulk deletes faster?
Msg-id 8c8854360512190647q349c1ff2s63b05b218495bd04@mail.gmail.com
In response to make bulk deletes faster?  (James Klo <jklo@arkitec.com>)
List pgsql-performance
On 12/18/05, James Klo <jklo@arkitec.com> wrote:
> explain analyze delete from timeblock where timeblockid = 666666
>
> Index Scan using timeblockid_idx on timeblock  (cost=0.00..5.28 rows=1
> width=6) (actual time=0.022..0.022 rows=0 loops=1)
>   Index Cond: (timeblockid = 666666)
> Total runtime: 0.069 ms
... snip ...
> Here's what I've tried:
>
> Attempt 1:
> ----------
> delete from timeblock where timeblockid in (select timeblockid from
> timeblock_tmp)

The DELETE in Attempt 1 is effectively a join, so if this is the way
you're mainly specifying which rows to delete, you'll have to take
into account how efficiently timeblock is joined to timeblock_tmp.
What does

EXPLAIN ANALYZE select * from timeblock where timeblockid in (select
timeblockid from timeblock_tmp)

or

EXPLAIN ANALYZE delete from timeblock where timeblockid in (select
timeblockid from timeblock_tmp)

say?
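
For reference, a healthy plan should have roughly this shape
(illustrative only -- costs elided, not output from your tables):

Hash IN Join  (cost=... rows=... width=...)
  Hash Cond: ("outer".timeblockid = "inner".timeblockid)
  ->  Seq Scan on timeblock  (cost=...)
  ->  Hash  (cost=...)
        ->  Seq Scan on timeblock_tmp  (cost=...)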

You *should* at least get a "Hash IN Join" at the outer level, with
just one Seq Scan on timeblock_tmp. If you don't, consider increasing
your sort_mem (PostgreSQL 7.x) or work_mem (PostgreSQL 8.x) setting.
Another alternative is to reduce the number of rows archived in one
go, so that each batch fits within the sort_mem or work_mem needed for
the "Hash IN Join" plan. See
http://www.postgresql.org/docs/8.1/static/runtime-config-resource.html#GUC-WORK-MEM
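
If you'd rather not change postgresql.conf globally, work_mem can be
raised for just the archiving session. A sketch (65536 is an arbitrary
example; on 8.x the setting is an integer in kilobytes, so 65536 =
64MB):

SET work_mem = 65536;

-- Careful: EXPLAIN ANALYZE actually executes the DELETE, so wrap it
-- in a transaction and roll back if you only want to see the plan.
BEGIN;
EXPLAIN ANALYZE delete from timeblock
  where timeblockid in (select timeblockid from timeblock_tmp);
ROLLBACK;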

On the other hand, PostgreSQL 8.1's partitioning sounds like a better
long-term solution that you might want to look into.
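
For what it's worth, here's a minimal sketch of how that could look,
assuming your timeblock rows can be partitioned by some timestamp
column (period_start below is hypothetical). Archiving a whole period
then becomes dropping a child table instead of a bulk DELETE:

-- Child tables carry the data; the CHECK constraints tell the planner
-- which child covers which range (period_start is hypothetical here).
CREATE TABLE timeblock_2005q4 (
    CHECK (period_start >= DATE '2005-10-01'
       AND period_start <  DATE '2006-01-01')
) INHERITS (timeblock);

CREATE TABLE timeblock_2006q1 (
    CHECK (period_start >= DATE '2006-01-01'
       AND period_start <  DATE '2006-04-01')
) INHERITS (timeblock);

-- Let the 8.1 planner skip children whose CHECK excludes the query:
SET constraint_exclusion = on;

-- Archiving Q4 2005 is then a near-instant catalog operation:
DROP TABLE timeblock_2005q4;

You'd still need a rule or application-side logic to route inserts
into the right child table, but the bulk deletes disappear entirely.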
