Re: Bad performing DELETE - Mailing list pgsql-sql

From Tom Lane
Subject Re: Bad performing DELETE
Date
Msg-id 10914.975339872@sss.pgh.pa.us
In response to Bad performing DELETE  (Hans-Jürgen Schönig <hs@cybertec.at>)
List pgsql-sql
Hans-Jürgen Schönig <hs@cybertec.at> writes:
> I have two tables: t_haus has about 1400 rows and t_host has 364000 entries.
> Both tables are indexed on edvnr. I did a vacuum on my db and all indices
> were rebuilt.
> I want to delete all entries in t_haus where a matching row can be found in
> t_host. When using "delete from t_haus where t_haus.edvnr=t_host.edvnr;" the
> database performs extremely badly.

>  explain delete from t_haus where t_haus.edvnr=t_host.edvnr;
> NOTICE:  QUERY PLAN:
> Merge Join  (cost=52178.53..56754.63 rows=6299767 width=14)
>   -> Sort  (cost=52038.25..52038.25 rows=364359 width=4)
>       -> Seq Scan on t_host  (cost=0.00..11700.59 rows=364359 width=4)
>   -> Sort  (cost=140.27..140.27 rows=1729 width=10)
>       -> Seq Scan on t_haus  (cost=0.00..47.29 rows=1729 width=10)

I wonder if a hash join would be faster.  What does EXPLAIN show if
you first do "set enable_mergejoin to off"?  What's the actual
performance in both cases?
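
A minimal sketch of the comparison being suggested, using the table and
column names from the original report (run in a psql session against the
reporter's database):

    -- Baseline: let the planner choose freely (it picks a merge join here)
    EXPLAIN DELETE FROM t_haus WHERE t_haus.edvnr = t_host.edvnr;

    -- Discourage merge joins for this session and re-check the plan;
    -- the planner should fall back to a hash join if one is viable
    SET enable_mergejoin TO off;
    EXPLAIN DELETE FROM t_haus WHERE t_haus.edvnr = t_host.edvnr;

    -- Restore the default afterwards
    SET enable_mergejoin TO on;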

Also, it's possible that the performance problem isn't the fault of the
plan at all.  Are there multiple rows in t_host matching the deletable
rows of t_haus?  I'm wondering if there's some speed penalty associated
with trying to delete the same row multiple times in one command...
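
(If multiple matches do turn out to be the problem, one common idiom --
not from this thread, just a sketch -- is an EXISTS subquery, which
qualifies each t_haus row at most once regardless of how many t_host
rows match it:

    DELETE FROM t_haus
    WHERE EXISTS (SELECT 1 FROM t_host
                  WHERE t_host.edvnr = t_haus.edvnr);

)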
        regards, tom lane

