Re: Massive delete performance - Mailing list pgsql-performance

From Andy
Subject Re: Massive delete performance
Date
Msg-id 01e401c5ce67$93d65de0$0b00a8c0@forge
Whole thread Raw
In response to Massive delete performance  ("Andy" <frum@ar-sd.net>)
Responses Re: Massive delete performance
List pgsql-performance
We run the DB on a linux system. The client has a windows system. The
application is almost the same (so the database structure is 80% the same).
The difference is that the client does not need all the tables.

So, in the remaining tables there are a lot of extra data's that don't
belong to this client. We have to send every night a updated "info" to the
client database. Our (have to admin) "fast and not the best" solution was so
replicated the needed tables, and delete from these the info that is not
needed.

So, I send to this client a "dump" from the database.

I also find the ideea "not the best", but couldn't find in two days another
fast solution. And it works this way for 4 months.

Out database is not THAT big (500MB), the replication about (300MB)...
everything works fast enough except this delete....


How can I evidence the cascade deletes also on explain analyze?

The answer for Sean Davis <sdavis2@mail.nih.gov>:

EXPLAIN ANALYZE
DELETE FROM report WHERE id_order IN
(SELECT o.id FROM orders o WHERE o.id_ag NOT IN (SELECT cp.id_ag FROM users
u INNER JOIN
contactpartner cp ON cp.id_user=u.id WHERE u.name in ('dc') ORDER BY
cp.id_ag))

Hash IN Join  (cost=3532.83..8182.33 rows=32042 width=6) (actual
time=923.456..2457.323 rows=59557 loops=1)
  Hash Cond: ("outer".id_order = "inner".id)
  ->  Seq Scan on report  (cost=0.00..2613.83 rows=64083 width=10) (actual
time=33.269..1159.024 rows=64083 loops=1)
  ->  Hash  (cost=3323.31..3323.31 rows=32608 width=4) (actual
time=890.021..890.021 rows=0 loops=1)
        ->  Seq Scan on orders o  (cost=21.12..3323.31 rows=32608 width=4)
(actual time=58.428..825.306 rows=60596 loops=1)
              Filter: (NOT (hashed subplan))
              SubPlan
                ->  Sort  (cost=21.11..21.12 rows=3 width=4) (actual
time=47.612..47.612 rows=1 loops=1)
                      Sort Key: cp.id_ag
                      ->  Nested Loop  (cost=0.00..21.08 rows=3 width=4)
(actual time=47.506..47.516 rows=1 loops=1)
                            ->  Index Scan using users_name_idx on users u
(cost=0.00..5.65 rows=1 width=4) (actual time=20.145..20.148 rows=1 loops=1)
                                  Index Cond: ((name)::text = 'dc'::text)
                            ->  Index Scan using contactpartner_id_user_idx
on contactpartner cp  (cost=0.00..15.38 rows=4 width=8) (actual
time=27.348..27.352 rows=1 loops=1)
                                  Index Cond: (cp.id_user = "outer".id)
Total runtime: 456718.658 ms




----- Original Message -----
From: "Steinar H. Gunderson" <sgunderson@bigfoot.com>
To: <pgsql-performance@postgresql.org>
Sent: Tuesday, October 11, 2005 3:19 PM
Subject: Re: [PERFORM] Massive delete performance


> On Tue, Oct 11, 2005 at 10:47:03AM +0300, Andy wrote:
>> So, I have a replication only with the tables that I need to send, then I
>> make a copy of this replication, and from this copy I delete all the
>> data's
>> that are not needed.
>>
>> How can I increase this DELETE procedure because it is really slow???
>> There are of corse a lot of data's to be deleted.
>
> Instead of copying and then deleting, could you try just selecting out
> what
> you wanted in the first place?
>
> /* Steinar */
> --
> Homepage: http://www.sesse.net/
>
> ---------------------------(end of broadcast)---------------------------
> TIP 5: don't forget to increase your free space map settings
>
>


pgsql-performance by date:

Previous
From: "Steinar H. Gunderson"
Date:
Subject: Re: Massive delete performance
Next
From: Sven Willenberger
Date:
Subject: Re: Performance on SUSE w/ reiserfs