Thread: VERY strange query plan (LONG)

VERY strange query plan (LONG)

From
Oleg Bartunov
Date:
Hi,

I tried to implement fulltext search using linguistic approach,
for example, using ispell like udmsearch does. We also save position
information of each lexem in document to calculate relevancy
(it's C-function using SPI-interface). We're still testing different
strategies but found several problems with optimizer, just look at plan - 
very strange numbers and no indices used) (I did run vacuume analyze)

explain
select       txt.tid
from       txt, txt_lexem1 tl1_0, txt_lexem11 tl11_0
where
tl1_0.lid =17700
OR
tl11_0.lid =172751
;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=0.00..16275952180.00 rows=512819420786 width=12) ->  Nested Loop  (cost=0.00..891369556.42
rows=512819421width=8)       ->  Seq Scan on txt_lexem11 tl11_0  (cost=0.00..2596.92 rows=132292 width=4)       ->  Seq
Scanon txt_lexem1 tl1_0  (cost=0.00..3815.95 rows=194795 width=4) ->  Seq Scan on txt  (cost=0.00..20.00 rows=1000
width=4)

EXPLAIN

fulltext=# \d txt         Table "txt"Attribute |  Type   | Modifier 
-----------+---------+----------tid       | integer | not null
Index: txt_pkey

tables  txt_lexemX look like:

fulltext=# \d txt_lexem1       Table "txt_lexem1"Attribute |   Type    | Modifier 
-----------+-----------+----------tid       | integer   | not nulllid       | integer   | not nulldid       | integer
|not nullcount     | integer   | not nullpos       | integer[] | not null
 
Index: txt_lexem1_key

We have rewrite using EXISTS and plan looks better !

select       txt.tid
from       txt
where
EXISTS ( select tid from txt_lexem1 tl1_0 where tl1_0.lid=17700 and tl1_0.did=0
and txt.tid=tl1_0.tid )
OR
EXISTS ( select tid from txt_lexem11 tl11_0 where tl11_0.lid=172751 and 
tl11_0.did=0 and txt.tid=tl11_0.tid )
;

NOTICE:  QUERY PLAN:

Seq Scan on txt  (cost=0.00..7416.48 rows=1000 width=4) SubPlan   ->  Index Scan using txt_lexem1_key on txt_lexem1
tl1_0 (cost=0.00..3.95 rows=1 width=4)   ->  Index Scan using txt_lexem11_key on txt_lexem11 tl11_0  (cost=0.00..3.45
rows=1width=4)
 

EXPLAIN

I've tested on plain 7.0.2 and CVS version.
I remind there was old problem with OR. Does optimizer still has
such problem ?
Regards,
    Oleg

_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83