Hi,
I have 2 queries, which are quite similar:
a) select distinct p1.domain from dom_prot p1, identical p2, classification p3
where p2.reference = p1.protein
and p2.protein ~ 'EPA3_HUMAN'
and p3.domain = p1.domain;
b) select distinct p1.domain from dom_prot p1, identical p2, classification p3
where p2.reference = p1.protein
and p2.protein ~ 'EPA3_HUMAN'
and p3.domain = p1.domain
and p3.class = 'extra';
As you see, they differ only in the last line. The problem is, that the time
they need differes tremendously. Query a) takes about 5 sec, query b) 2 min !!
I don't understand why this last constraint over an indexed field leads to such
a difference.
Here is what explain says:
a)
NOTICE: QUERY PLAN:
Unique (cost=5193.21 size=0 width=0)
-> Sort (cost=5193.21 size=0 width=0)
-> Nested Loop (cost=5193.21 size=1 width=48)
-> Nested Loop (cost=5191.16 size=1 width=36)
-> Seq Scan on p2 (cost=5189.11 size=1 width=12)
-> Index Scan on p1 (cost=2.05 size=62250 width=24)
-> Index Scan on p3 (cost=2.05 size=189 width=12)
EXPLAIN
b)
NOTICE: QUERY PLAN:
Unique (cost=680.60 size=0 width=0)
-> Sort (cost=680.60 size=0 width=0)
-> Nested Loop (cost=680.60 size=1 width=48)
-> Nested Loop (cost=4.10 size=330 width=36)
-> Index Scan on p3 (cost=2.05 size=1 width=12)
-> Index Scan on p1 (cost=2.05 size=62250 width=24)
-> Index Scan on p2 (cost=2.05 size=1 width=12)
EXPLAIN
Dou you have any idea, how to improve query b) ??
Thanx,
Joerg
--
Internet Joerg.Schultz@embl-heidelberg.de
Address EMBL, Meyerhofstr 1, 69012 Heidelberg, Germany
Tel +49 (0)6221 387 534, Fax +49 (0)6221 387517
URL http://www.embl-heidelberg.de/~jschultz/