Strange query plan - Mailing list pgsql-hackers

From Oleg Bartunov
Subject Strange query plan
Date
Msg-id Pine.GSO.4.33.0106051644530.26250-100000@ra.sai.msu.su
Whole thread Raw
Responses Re: Strange query plan  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tom,

I have a problem with slow query execution (postgresql 7.1.2):

There are 2 tables - idx, msg_prt:
bug=# \dt   List of relations Name   | Type  | Owner
---------+-------+--------idx     | table | megeramsg_prt | table | megera
(2 rows)

bug=#  \d idx         Table "idx"Attribute |  Type   | Modifier
-----------+---------+----------tid       | integer |lid       | integer |did       | integer |
Index: idxidx

bug=#  \d msg_prt       Table "msg_prt"Attribute |  Type   | Modifier
-----------+---------+----------tid       | integer |
Index: mprt_tid


Also there are 2 indexes - idxidx, mprt_tid

bug=# \d idxidx  Index "idxidx"Attribute |  Type
-----------+---------lid       | integerdid       | integertid       | integer
unique btree

bug=# \d mprt_tid Index "mprt_tid"Attribute |  Type
-----------+---------tid       | integer
unique btree


Query is:
select msg_prt.tid as mid from msg_prt
where exists (select idx.tid from idx where msg_prt.tid=idx.tid  and idx.did=1 and idx.lid in (1207,59587) )

Plan for this query looks very ineffective and query is very slow:

select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid               and
idx.did=1and idx.lid in (1207,59587) )
 
NOTICE:  QUERY PLAN:

Seq Scan on msg_prt  (cost=0.00..119090807.13 rows=69505 width=4) SubPlan   ->  Index Scan using idxidx, idxidx on idx
(cost=0.00..1713.40rows=1 width=4)
 

total: 6.80 sec; number: 1; for one: 6.796 sec;

Statistics on tables:
idx     - 103651 rows
msg_prt - 69505  rows

There are only 16 rows in 'idx' table satisfied subselect condition.
I did vacuum analyze.

Adding another index 'create index tididx on idx (tid);' helps:
select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid              and
idx.did=1and idx.lid in (1207,59587) )
 
NOTICE:  QUERY PLAN:

Seq Scan on msg_prt  (cost=0.00..1134474.94 rows=69505 width=4) SubPlan   ->  Index Scan using tididx on idx
(cost=0.00..16.31rows=1 width=4)
 

total: 1.71 sec; number: 1; for one: 1.711 sec;

but still plan looks ineffective.

The best plan I've got eliminating  IN predicate:
select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid              and
idx.did=1and idx.lid = 1207 and idx.lid=59587 )
 
NOTICE:  QUERY PLAN:

Seq Scan on msg_prt  (cost=0.00..167368.47 rows=69505 width=4) SubPlan   ->  Index Scan using idxidx on idx
(cost=0.00..2.39rows=1 width=4)
 

total: 0.54 sec; number: 1; for one: 0.541 sec;

Unfortunately I can't use this way in general case.

Does it's a known problem ?

data+schema is available from
http://www.sai.msu.su/~megera/postgres/data/bug.dump.gz
It's about 500Kb !
Regards,    Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Strange query plan
Next
From: Tom Lane
Date:
Subject: Re: Multiprocessor performance