Thread: Indexes not used in DELETE

Indexes not used in DELETE

From
Viktor Rosenfeld
Date:
Hi everybody,

I'm wondering why a DELETE statement of mine does not make use of
defined indexes on the tables.

I have the following tables which are linked as such: component ->
rank -> node -> corpus;

Now I want to delete all entries in component by giving a list of
corpus ids.

The query is as such:

DELETE FROM component
USING corpus toplevel, corpus child, node, rank
WHERE toplevel.id IN (25) AND toplevel.top_level = 'y'
AND toplevel.pre <= child.pre AND toplevel.post >= child.pre
AND node.corpus_ref = child.id AND rank.node_ref = node.id AND
rank.component_ref = component.id;

The table corpus is defined as such:

              Table "public.corpus"
   Column   |          Type          | Modifiers
-----------+------------------------+-----------
  id        | numeric(38,0)          | not null
  name      | character varying(100) | not null
  type      | character varying(100) | not null
  version   | character varying(100) |
  pre       | numeric(38,0)          | not null
  post      | numeric(38,0)          | not null
  top_level | boolean                | not null
Indexes:
     "corpus_pkey" PRIMARY KEY, btree (id)
     "corpus_post_key" UNIQUE, btree (post)
     "corpus_pre_key" UNIQUE, btree (pre)
     "idx_corpus__id_pre_post" btree (id, pre, post)
     "idx_corpus__pre_post" btree (pre, post)
     "idx_corpus__toplevel" btree (id) WHERE top_level = true


The query plan of the above statement looks like this:

                                               QUERY PLAN
-------------------------------------------------------------------------------------------------------
  Hash Join  (cost=708.81..4141.14 rows=9614 width=6)
    Hash Cond: (rank.component_ref = component.id)
    ->  Nested Loop  (cost=3.20..3268.07 rows=8373 width=8)
          ->  Hash Join  (cost=3.20..1306.99 rows=4680 width=8)
                Hash Cond: (node.corpus_ref = child.id)
                ->  Seq Scan on node  (cost=0.00..1075.63 rows=48363
width=14)
                ->  Hash  (cost=3.16..3.16 rows=3 width=27)
                      ->  Nested Loop  (cost=0.00..3.16 rows=3 width=27)
                            Join Filter: ((toplevel.pre <= child.pre)
AND (toplevel.post >= child.pre))
                            ->  Seq Scan on corpus toplevel
(cost=0.00..1.39 rows=1 width=54)
                                  Filter: (top_level AND (id =
25::numeric))
                            ->  Seq Scan on corpus child
(cost=0.00..1.31 rows=31 width=54)
          ->  Index Scan using fk_rank_2_struct on rank
(cost=0.00..0.39 rows=2 width=16)
                Index Cond: (rank.node_ref = node.id)
    ->  Hash  (cost=390.27..390.27 rows=25227 width=14)
          ->  Seq Scan on component  (cost=0.00..390.27 rows=25227
width=14)
(16 rows)

Specifically, I'm wondering why the innermost scan on corpus
(toplevel) does not use the index idx_corpus__toplevel and why the
join between corpus (toplevel) and corpus (child) is not a merge join
using the index corpus_pre_key to access the child table.

FYI, corpus.pre and corpus.post encode a corpus tree (or rather a
forest) using a combined pre and post order.  This scheme guarantees
that parent.post > child.post > child.pre for all edges parent ->
child in the corpus tree. I'm using the same scheme elsewhere in
SELECT statements and they work fine there.

Thanks,
Viktor

Re: Indexes not used in DELETE

From
Tom Lane
Date:
Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de> writes:
>                             ->  Seq Scan on corpus toplevel (cost=0.00..1.39 rows=1 width=54)
>                                   Filter: (top_level AND (id = 25::numeric))

> Specifically, I'm wondering why the innermost scan on corpus
> (toplevel) does not use the index idx_corpus__toplevel

The cost estimate indicates that there are so few rows in corpus
that an indexscan would be a waste of time.

> and why the
> join between corpus (toplevel) and corpus (child) is not a merge join
> using the index corpus_pre_key to access the child table.

Same answer.  Populate the table and the plan will change.

            regards, tom lane

Re: Indexes not used in DELETE

From
Viktor Rosenfeld
Date:
Hi Tom,

I should have looked at the analyzed plan first. The culprit for the
slow query were trigger function calls on foreign keys.

Ciao,
Viktor

Am 08.05.2009 um 01:06 schrieb Tom Lane:

> Viktor Rosenfeld <rosenfel@informatik.hu-berlin.de> writes:
>>                            ->  Seq Scan on corpus toplevel
>> (cost=0.00..1.39 rows=1 width=54)
>>                                  Filter: (top_level AND (id =
>> 25::numeric))
>
>> Specifically, I'm wondering why the innermost scan on corpus
>> (toplevel) does not use the index idx_corpus__toplevel
>
> The cost estimate indicates that there are so few rows in corpus
> that an indexscan would be a waste of time.
>
>> and why the
>> join between corpus (toplevel) and corpus (child) is not a merge join
>> using the index corpus_pre_key to access the child table.
>
> Same answer.  Populate the table and the plan will change.
>
>             regards, tom lane
>
> --
> Sent via pgsql-performance mailing list (pgsql-performance@postgresql.org
> )
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-performance