Indexes not used in DELETE - Mailing list pgsql-performance
From | Viktor Rosenfeld |
---|---|
Subject | Indexes not used in DELETE |
Date | |
Msg-id | E4AC852E-C3F7-4DE8-8C78-7B345CFF5DE8@informatik.hu-berlin.de Whole thread Raw |
Responses |
Re: Indexes not used in DELETE
|
List | pgsql-performance |
Hi everybody, I'm wondering why a DELETE statement of mine does not make use of defined indexes on the tables. I have the following tables which are linked as such: component -> rank -> node -> corpus; Now I want to delete all entries in component by giving a list of corpus ids. The query is as such: DELETE FROM component USING corpus toplevel, corpus child, node, rank WHERE toplevel.id IN (25) AND toplevel.top_level = 'y' AND toplevel.pre <= child.pre AND toplevel.post >= child.pre AND node.corpus_ref = child.id AND rank.node_ref = node.id AND rank.component_ref = component.id; The table corpus is defined as such: Table "public.corpus" Column | Type | Modifiers -----------+------------------------+----------- id | numeric(38,0) | not null name | character varying(100) | not null type | character varying(100) | not null version | character varying(100) | pre | numeric(38,0) | not null post | numeric(38,0) | not null top_level | boolean | not null Indexes: "corpus_pkey" PRIMARY KEY, btree (id) "corpus_post_key" UNIQUE, btree (post) "corpus_pre_key" UNIQUE, btree (pre) "idx_corpus__id_pre_post" btree (id, pre, post) "idx_corpus__pre_post" btree (pre, post) "idx_corpus__toplevel" btree (id) WHERE top_level = true The query plan of the above statement looks like this: QUERY PLAN ------------------------------------------------------------------------------------------------------- Hash Join (cost=708.81..4141.14 rows=9614 width=6) Hash Cond: (rank.component_ref = component.id) -> Nested Loop (cost=3.20..3268.07 rows=8373 width=8) -> Hash Join (cost=3.20..1306.99 rows=4680 width=8) Hash Cond: (node.corpus_ref = child.id) -> Seq Scan on node (cost=0.00..1075.63 rows=48363 width=14) -> Hash (cost=3.16..3.16 rows=3 width=27) -> Nested Loop (cost=0.00..3.16 rows=3 width=27) Join Filter: ((toplevel.pre <= child.pre) AND (toplevel.post >= child.pre)) -> Seq Scan on corpus toplevel (cost=0.00..1.39 rows=1 width=54) Filter: (top_level AND (id = 25::numeric)) -> Seq Scan on corpus child (cost=0.00..1.31 rows=31 width=54) -> Index Scan using fk_rank_2_struct on rank (cost=0.00..0.39 rows=2 width=16) Index Cond: (rank.node_ref = node.id) -> Hash (cost=390.27..390.27 rows=25227 width=14) -> Seq Scan on component (cost=0.00..390.27 rows=25227 width=14) (16 rows) Specifically, I'm wondering why the innermost scan on corpus (toplevel) does not use the index idx_corpus__toplevel and why the join between corpus (toplevel) and corpus (child) is not a merge join using the index corpus_pre_key to access the child table. FYI, corpus.pre and corpus.post encode a corpus tree (or rather a forest) using a combined pre and post order. This scheme guarantees that parent.post > child.post > child.pre for all edges parent -> child in the corpus tree. I'm using the same scheme elsewhere in SELECT statements and they work fine there. Thanks, Viktor
pgsql-performance by date: