Re: Help speeding up delete - Mailing list pgsql-performance

From Steve Wampler
Subject Re: Help speeding up delete
Date
Msg-id 4379EE24.8000504@noao.edu
Whole thread Raw
In response to Re: Help speeding up delete  ("Magnus Hagander" <mha@sollentuna.net>)
Responses Re: Help speeding up delete
List pgsql-performance
Magnus Hagander wrote:
>>Because I think we need to.  The above would only delete rows
>>that have name = 'obsid' and value = 'oid080505'.  We need to
>>delete all rows that have the same ids as those rows.
>>However, from what you note, I bet we could do:
>>
>>   DELETE FROM "tmp_table2" WHERE id IN
>>      (SELECT id FROM "temp_table2" WHERE name = 'obsid' and
>>value= 'oid080505');
>>
>>However, even that seems to have a much higher cost than I'd expect:
>>
>>   lab.devel.configdb=# explain delete from "tmp_table2" where id in
>>        (select id from tmp_table2 where name='obsid' and
>>value = 'oid080505');
>>   NOTICE:  QUERY PLAN:
>>
>>   Seq Scan on tmp_table2  (cost=0.00..65705177237.26
>>rows=769844 width=6)
>>     SubPlan
>>       ->  Materialize  (cost=42674.32..42674.32 rows=38 width=50)
>>             ->  Seq Scan on tmp_table2  (cost=0.00..42674.32
>>rows=38 width=50)
>>
>>   EXPLAIN
...
>
> Earlier pg versions have always been bad at dealing with IN subqueries.
> Try rewriting it as (with fixing any broken syntax, I'm not actually
> testing this :P)
>
> DELETE FROM tmp_table2 WHERE EXISTS
>  (SELECT * FROM tmp_table2 t2 WHERE t2.id=tmp_table2.id AND
> t2.name='obsid' AND t2.value='oid080505')

Thanks - that looks *significantly* better:

   lab.devel.configdb=# explain delete from tmp_table2 where exists
                           (select 1 from tmp_table2 t2 where
                           t2.id=tmp_table2.id and
                           t2.name='obsid' and t2.value='oid080505');
   NOTICE:  QUERY PLAN:

   Seq Scan on tmp_table2  (cost=0.00..9297614.80 rows=769844 width=6)
     SubPlan
       ->  Index Scan using inv_index_2 on tmp_table2 t2  (cost=0.00..6.02 rows=1 width=0)

   EXPLAIN

(This is after putting an index on the (id,name,value) tuple.)  That outer seq scan
is still annoying, but maybe this will be fast enough.

I've passed this on, along with the (strong) recommendation that they
upgrade PG.

Thanks!!

--
Steve Wampler -- swampler@noao.edu
The gods that smiled on your birth are now laughing out loud.

pgsql-performance by date:

Previous
From: "Merlin Moncure"
Date:
Subject: Re: Hardware/OS recommendations for large databases (5TB)
Next
From: Dave Cramer
Date:
Subject: Re: Hardware/OS recommendations for large databases (