Thread: qurey plan and indices

qurey plan and indices

From
MESZAROS Attila
Date:
Hi,

I've experienced a brutal speedup (order of 2) separateing the following
subquery and making it manually:

explain select name,description 
from descriptions 
where in (select name     from descriptions     where description like '%Bankverbindung%');

Seq Scan on descriptions  (cost=163.98 rows=3575 width=24) SubPlan   ->  Seq Scan on descriptions  (cost=163.98 rows=2
width=12)
[I had no patient to wait the resuls...]


If I give manually the subquery it gives an immediate response (5 records 
at all). If I copy manually that 5 record as constants after 'in', then I
receive an immediate response from the main query too!

explain select name,description from descriptions where name in
('accounts', 'no_account', 'no_account', 'no_acc', 'account');

Index Scan using desc_ind3, desc_ind3, desc_ind3, desc_ind3 on
descriptions  (cost=9.01 rows=36 width=24)


I saw in the archive that in, and not in can cause performance
degradation, but this is not the case now. booth queries are using 'in'
Why can't be the subquery evaluated first, and then run the main query
with the results?

==============  the stucture: =============================== 

CREATE TABLE "descriptions" (       "objectid" int4 DEFAULT nextval ( 'DescSeq' ) NOT NULL,       "ts" int4 NOT NULL,
   "classid" int4 NOT NULL,       "languageid" int4 NOT NULL,       "priority" int4 NOT NULL,       "name" character
varying(50)NOT NULL,       "description" character varying(200));
 
CREATE  INDEX "desc_ind3" on "descriptions" using btree ( "name"
"varchar_ops" );
CREATE  INDEX "desc_ind2" on "descriptions" using btree ( "description"
"varchar_ops" );
CREATE  INDEX "desc_ind1" on "descriptions" using btree ( "classid"
"int4_ops", "name" "varchar_ops" );
CREATE UNIQUE INDEX "desc_idx" on "descriptions" using btree ( "classid"
"int4_ops", "languageid" "int4_ops", "name" "varchar_ops" );
CREATE UNIQUE INDEX "descriptions_pkey" on "descriptions" using btree (
"objectid" "int4_ops" );


version: postgresql 6.5, debian potato.



Re: [SQL] qurey plan and indices

From
Patrick Giagnocavo
Date:
MESZAROS Attila wrote:
> 
> Hi,
> 
> I've experienced a brutal speedup (order of 2) separateing the following
> subquery and making it manually:
> 
> explain select name,description
> from descriptions
> where in (select name
>                 from descriptions
>                 where description like '%Bankverbindung%');
> 
> Seq Scan on descriptions  (cost=163.98 rows=3575 width=24)
>   SubPlan
>     ->  Seq Scan on descriptions  (cost=163.98 rows=2 width=12)
> [I had no patient to wait the resuls...]
>

Correct me if I am wrong, however if using 

LIKE '%something%'
(which means, the field contains 'something' somewhere in the field)

there is never a chance to use the index you have created - a
sequential table scan MUST be made, thus you have to read all 3575
rows to return the set.

However, if you change your query so that you are looking for 

LIKE 'Bankverbindung%'

then an index can be used becase you know what the first few
characters are known.

Hope this helps,

Cordially

Patrick Giagnocavo
a222@redrose.net


Re: [SQL] qurey plan and indices

From
Tom Lane
Date:
Patrick Giagnocavo <a222@redrose.net> writes:
> MESZAROS Attila wrote:
>>
>> I've experienced a brutal speedup (order of 2) separateing the following
>> subquery and making it manually:
>> 
>> explain select name,description
>> from descriptions
>> where in (select name
>> from descriptions
>> where description like '%Bankverbindung%');

> Correct me if I am wrong, however if using 
> LIKE '%something%'
> (which means, the field contains 'something' somewhere in the field)
> there is never a chance to use the index you have created - a
> sequential table scan MUST be made, thus you have to read all 3575
> rows to return the set.

Yes.  The inner query must happen by sequential scan.  But the outer
query can use an index if it is in the form... where (name = 'a') or (name = 'b') or (name = 'c');
which is what "where name in ('a','b','c')" gets translated to.

However, the *real* problem here is that the inner query is treated
as a subplan, which means it is re-evaluated for each tuple scanned
by the outer query.  This is not very bright; the system ought to
notice that the inner query does not depend on the state of the
outer query, and do it only once.  There is already a notion of
InitPlan vs. SubPlan.  I'm not sure why this case is being classified
as a SubPlan, but it sure looks like it ought to be done as an
InitPlan...
        regards, tom lane


Re: [SQL] qurey plan and indices

From
MESZAROS Attila
Date:
Hi,

> LIKE '%something%'
> (which means, the field contains 'something' somewhere in the field)I should have to remove that % sign...this is not
thereal problem, bceause the subquery alone withbooth the % signs responds faster than a second.
 
I think the real problem is that, the subquery is evaluated 3575times :((((
If you see the plan of the direct manul query in my previous mailyou will see that there is no sequential search at
all....
what makes the difference????

Attila



Re: [SQL] qurey plan and indices

From
MESZAROS Attila
Date:
> However, the *real* problem here is that the inner query is treated
> as a subplan, which means it is re-evaluated for each tuple scanned
> by the outer query.  This is not very bright; the system ought to
> notice that the inner query does not depend on the state of the
> outer query, and do it only once.  There is already a notion of
> InitPlan vs. SubPlan.  I'm not sure why this case is being classified
> as a SubPlan, but it sure looks like it ought to be done as an
> InitPlan...So this means, it is a bug somewhere in the planner?who should we report this??

Attila

any workarounds?
[better than temptables?]



Re: [SQL] qurey plan and indices

From
Vadim Mikheev
Date:
Tom Lane wrote:
> 
> Yes.  The inner query must happen by sequential scan.  But the outer
> query can use an index if it is in the form
>         ... where (name = 'a') or (name = 'b') or (name = 'c');
> which is what "where name in ('a','b','c')" gets translated to.
> 
> However, the *real* problem here is that the inner query is treated
> as a subplan, which means it is re-evaluated for each tuple scanned
> by the outer query.  This is not very bright; the system ought to
> notice that the inner query does not depend on the state of the
> outer query, and do it only once.  There is already a notion of
> InitPlan vs. SubPlan.  I'm not sure why this case is being classified
> as a SubPlan, but it sure looks like it ought to be done as an
> InitPlan...

Not in all cases. First, to use InitPlan you would have to add 
DISTINCT to subquery to avoid duplicates. Second, there is another
way of optimization - cache/hash subquery results to avoid 
re-execution of subquery plan. Planner should make decision what
way to follow. 

Vadim


Re: [SQL] qurey plan and indices

From
Bruce Momjian
Date:
> Not in all cases. First, to use InitPlan you would have to add 
> DISTINCT to subquery to avoid duplicates. Second, there is another
> way of optimization - cache/hash subquery results to avoid 
> re-execution of subquery plan. Planner should make decision what
> way to follow. 

Yep, that's what I was looking for.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026