Thread: Strange query plan
Tom, I have a problem with slow query execution (postgresql 7.1.2): There are 2 tables - idx, msg_prt: bug=# \dt List of relations Name | Type | Owner ---------+-------+--------idx | table | megeramsg_prt | table | megera (2 rows) bug=# \d idx Table "idx"Attribute | Type | Modifier -----------+---------+----------tid | integer |lid | integer |did | integer | Index: idxidx bug=# \d msg_prt Table "msg_prt"Attribute | Type | Modifier -----------+---------+----------tid | integer | Index: mprt_tid Also there are 2 indexes - idxidx, mprt_tid bug=# \d idxidx Index "idxidx"Attribute | Type -----------+---------lid | integerdid | integertid | integer unique btree bug=# \d mprt_tid Index "mprt_tid"Attribute | Type -----------+---------tid | integer unique btree Query is: select msg_prt.tid as mid from msg_prt where exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1 and idx.lid in (1207,59587) ) Plan for this query looks very ineffective and query is very slow: select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1and idx.lid in (1207,59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..119090807.13 rows=69505 width=4) SubPlan -> Index Scan using idxidx, idxidx on idx (cost=0.00..1713.40rows=1 width=4) total: 6.80 sec; number: 1; for one: 6.796 sec; Statistics on tables: idx - 103651 rows msg_prt - 69505 rows There are only 16 rows in 'idx' table satisfied subselect condition. I did vacuum analyze. Adding another index 'create index tididx on idx (tid);' helps: select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1and idx.lid in (1207,59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..1134474.94 rows=69505 width=4) SubPlan -> Index Scan using tididx on idx (cost=0.00..16.31rows=1 width=4) total: 1.71 sec; number: 1; for one: 1.711 sec; but still plan looks ineffective. The best plan I've got eliminating IN predicate: select msg_prt.tid as mid from msg_prtwhere exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1and idx.lid = 1207 and idx.lid=59587 ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..167368.47 rows=69505 width=4) SubPlan -> Index Scan using idxidx on idx (cost=0.00..2.39rows=1 width=4) total: 0.54 sec; number: 1; for one: 0.541 sec; Unfortunately I can't use this way in general case. Does it's a known problem ? data+schema is available from http://www.sai.msu.su/~megera/postgres/data/bug.dump.gz It's about 500Kb ! Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: > should be > select msg_prt.tid as mid from msg_prt > where exists (select idx.tid from idx where msg_prt.tid=idx.tid > and idx.did=1 and ( idx.lid = 1207 or idx.lid=59587 )); > but this is not a big win. Shouldn't be any win at all: the IN expression-list notation will get translated to exactly that form. > Anyway, what's about original query ? IN/EXISTS subqueries suck. This has been true for a long time and is going to be true for a while longer, unless someone else fixes it before I have a chance to look at it. See if you can't rewrite your query as a plain join. regards, tom lane
On Tue, 5 Jun 2001, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: > > should be > > select msg_prt.tid as mid from msg_prt > > where exists (select idx.tid from idx where msg_prt.tid=idx.tid > > and idx.did=1 and ( idx.lid = 1207 or idx.lid=59587 )); > > but this is not a big win. > > Shouldn't be any win at all: the IN expression-list notation will get > translated to exactly that form. Sure > > > Anyway, what's about original query ? > > IN/EXISTS subqueries suck. This has been true for a long time and is > going to be true for a while longer, unless someone else fixes it before > I have a chance to look at it. See if you can't rewrite your query as > a plain join. > That's why we've moved to GiST and feel fine :-) That query we used in our old project, now swithing to GiST. > regards, tom lane > Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83
Oleg Bartunov <oleg@sai.msu.su> writes: > select msg_prt.tid as mid from msg_prt > where exists (select idx.tid from idx where msg_prt.tid=idx.tid > and idx.did=1 and idx.lid in (1207,59587) ) > NOTICE: QUERY PLAN: > Seq Scan on msg_prt (cost=0.00..119090807.13 rows=69505 width=4) > SubPlan > -> Index Scan using idxidx, idxidx on idx (cost=0.00..1713.40 rows=1 width=4) Actually, this example does reveal an unnecessary inefficiency: the planner is only using the "idx.lid in (1207,59587)" clause for the indexscan, ignoring the fact that the did and tid clauses match the additional columns of your three-column index. The attached patch should improve matters. regards, tom lane *** src/backend/optimizer/path/indxpath.c.orig Sun May 20 16:28:18 2001 --- src/backend/optimizer/path/indxpath.c Tue Jun 5 12:38:21 2001 *************** *** 397,403 **** clause, false); } ! /* * Given an OR subclause that has previously been determined to match * the specified index, extract a list of specificopclauses that can be * used as indexquals. --- 397,403 ---- clause, false); } ! /*---------- * Given an OR subclause that has previously been determined to match * the specified index, extract a listof specific opclauses that can be * used as indexquals. *************** *** 406,415 **** * given opclause. However, if the OR subclause is an AND, we have to * scan it to find the opclause(s)that match the index. (There should * be at least one, if match_or_subclause_to_indexkey succeeded, but there ! * could be more.) Also, we apply expand_indexqual_conditions() to convert ! * any special matching opclauses to indexable operators. * * The passed-in clause is not changed. */ List * extract_or_indexqual_conditions(RelOptInfo*rel, --- 406,430 ---- * given opclause. However, if the OR subclause is an AND, we have to * scan it to find the opclause(s)that match the index. (There should * be at least one, if match_or_subclause_to_indexkey succeeded, but there ! * could be more.) ! * ! * Also, we can look at other restriction clauses of the rel to discover ! * additional candidate indexquals: for example, consider ! * ... where (a = 11 or a = 12) and b = 42; ! * If we are dealing with an index on (a,b) then we can include the clause ! * b = 42 in the indexqual list generated for each of the OR subclauses. ! * Essentially, we are making an index-specific transformation from CNF to ! * DNF. (NOTE: when we do this, we end up with a slightly inefficient plan ! * because create_indexscan_plan is not very bright about figuring out which ! * restriction clauses are implied by the generated indexqual condition. ! * Currently we'll end up rechecking both the OR clause and the transferred ! * restriction clause as qpquals. FIXME someday.) ! * ! * Also, we apply expand_indexqual_conditions() to convert any special ! * matching opclauses to indexable operators. * * The passed-in clause is not changed. + *---------- */ List * extract_or_indexqual_conditions(RelOptInfo *rel, *************** *** 417,470 **** Expr *orsubclause) { List *quals = NIL; ! if (and_clause((Node *) orsubclause)) { ! /* ! * Extract relevant sub-subclauses in indexkey order. This is ! * just like group_clauses_by_indexkey() except that the input and ! * output are lists of bare clauses, not of RestrictInfo nodes. ! */ ! int *indexkeys = index->indexkeys; ! Oid *classes = index->classlist; ! do { ! int curIndxKey = indexkeys[0]; ! Oid curClass = classes[0]; ! List *clausegroup = NIL; ! List *item; ! foreach(item, orsubclause->args) { if (match_clause_to_indexkey(rel, index, curIndxKey, curClass, ! lfirst(item), false)) ! clausegroup = lappend(clausegroup, lfirst(item)); } ! /* ! * If no clauses match this key, we're done; we don't want to ! * look at keys to its right. ! */ ! if (clausegroup == NIL) ! break; ! ! quals = nconc(quals, clausegroup); ! ! indexkeys++; ! classes++; ! } while (!DoneMatchingIndexKeys(indexkeys, index)); ! ! if (quals == NIL) ! elog(ERROR, "extract_or_indexqual_conditions: no matching clause"); ! } ! else ! { ! /* we assume the caller passed a valid indexable qual */ ! quals = makeList1(orsubclause); ! } return expand_indexqual_conditions(quals); } --- 432,503 ---- Expr *orsubclause) { List *quals = NIL; + int *indexkeys = index->indexkeys; + Oid *classes = index->classlist; ! /* ! * Extract relevant indexclauses in indexkey order. This is essentially ! * just like group_clauses_by_indexkey() except that the input and ! * output are lists of bare clauses, not of RestrictInfo nodes. ! */ ! do { + int curIndxKey = indexkeys[0]; + Oid curClass = classes[0]; + List *clausegroup = NIL; + List *item; ! if (and_clause((Node *) orsubclause)) ! { ! foreach(item, orsubclause->args) ! { ! Expr *subsubclause = (Expr *) lfirst(item); ! if (match_clause_to_indexkey(rel, index, ! curIndxKey, curClass, ! subsubclause, false)) ! clausegroup = lappend(clausegroup, subsubclause); ! } ! } ! else if (match_clause_to_indexkey(rel, index, ! curIndxKey, curClass, ! orsubclause, false)) { ! clausegroup = makeList1(orsubclause); ! } ! /* ! * If we found no clauses for this indexkey in the OR subclause ! * itself, try looking in the rel's top-level restriction list. ! */ ! if (clausegroup == NIL) ! { ! foreach(item, rel->baserestrictinfo) { + RestrictInfo *rinfo = (RestrictInfo *) lfirst(item); + if (match_clause_to_indexkey(rel, index, curIndxKey, curClass, ! rinfo->clause, false)) ! clausegroup = lappend(clausegroup, rinfo->clause); } + } ! /* ! * If still no clauses match this key, we're done; we don't want to ! * look at keys to its right. ! */ ! if (clausegroup == NIL) ! break; ! ! quals = nconc(quals, clausegroup); ! ! indexkeys++; ! classes++; ! } while (!DoneMatchingIndexKeys(indexkeys, index)); ! ! if (quals == NIL) ! elog(ERROR, "extract_or_indexqual_conditions: no matching clause"); return expand_indexqual_conditions(quals);}
On Tue, 5 Jun 2001, Tom Lane wrote: > Oleg Bartunov <oleg@sai.msu.su> writes: > > select msg_prt.tid as mid from msg_prt > > where exists (select idx.tid from idx where msg_prt.tid=idx.tid > > and idx.did=1 and idx.lid in (1207,59587) ) > > NOTICE: QUERY PLAN: > > > Seq Scan on msg_prt (cost=0.00..119090807.13 rows=69505 width=4) > > SubPlan > > -> Index Scan using idxidx, idxidx on idx (cost=0.00..1713.40 rows=1 width=4) > > Actually, this example does reveal an unnecessary inefficiency: the > planner is only using the "idx.lid in (1207,59587)" clause for the > indexscan, ignoring the fact that the did and tid clauses match the > additional columns of your three-column index. The attached patch > should improve matters. > > regards, tom lane Cool. Looks better select msg_prt.tid as mid from msg_prt where exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1 andidx.lid in ( 1207, 59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..333700.88 rows=69505 width=4) SubPlan -> Index Scan using idxidx, idxidx on idx (cost=0.00..4.79rows=1 width=4) total: 3.15 sec; number: 1; for one: 3.153 sec; interesting that droping index 'idxidx' and creating simple create index tididx on idx (tid); behaves better, while plan looks worse. Notice, index on tididx estimates cost better (16). select msg_prt.tid as mid from msg_prt where exists (select idx.tid from idx where msg_prt.tid=idx.tid and idx.did=1 andidx.lid in ( 1207, 59587) ) NOTICE: QUERY PLAN: Seq Scan on msg_prt (cost=0.00..1134474.94 rows=69505 width=4) SubPlan -> Index Scan using tididx on idx (cost=0.00..16.31rows=1 width=4) total: 1.70 sec; number: 1; for one: 1.703 sec; Interesting that earlier if I have 2 indexes - idxidx and tididx optimizer choose tididx, while now (after patching) optimizer always choose idxidx. Regards, Oleg _____________________________________________________________ Oleg Bartunov, sci.researcher, hostmaster of AstroNet, Sternberg Astronomical Institute, Moscow University (Russia) Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(095)939-16-83, +007(095)939-23-83