Thread: index usage ... strange !?

index usage ... strange !?

From
Marten Feldtmann
Date:
He's a small example and I'm trying to speed up this query and I can manage it :-(

The table:
P3AT:
AO - char(15), with indexAT - Integer, with index AV - VARCHAR(80), with index valindCI - Integer, with index

We've about 23000 rows in this table. I would like to execute:
SELECT AO,AT,AV FROM P3ATWHERE EXISTS( SELECT AO FROM P3AT WHERE AV='12' AND AT=12 AND CI=17)

Explain gives me:
Result(946.19,0,0)  InitPlan    -> INdex Scan using valind on p3at (222,265,12)    (fine !)  ->Seq Scan on p3at
(946,22235,26)                   (urgghhh ?)
 
This statement takes about 2s to return the results.
The select statement within exists just needs 23ms to find the AO value !
SELECT AO FROM P3AT WHERE AO='12'
EXPLAIN GIVES ME:
 INDEX SCAN USING ATROWIND ON P3AT (2,2,12)
In general I expect for each unique AO about 10-12 result rows. I've done 
several vacuum analyze (the first one crashed my database by the way :-(, this
tool is really making me crazy - claiming that it can't remove the lock :-().
The reason seems to be the seq scan ... therefore how can I get rid of it !
I use PSQL 6.5.3 under SuSE 6.1.
Marten




Re: [SQL] index usage ... strange !?

From
Tom Lane
Date:
Marten Feldtmann <marten@feki.toppoint.de> writes:
>  SELECT AO,AT,AV FROM P3AT
>  WHERE EXISTS( SELECT AO FROM P3AT WHERE AV='12' AND AT=12 AND CI=17)

Um ... I dunno what you are trying to accomplish, but this query
almost certainly doesn't do what you want.  Since the inner query
is independent of the outer, you will get back either all the rows
of P3AT (if the inner query yields rows) or none (if it doesn't).

The plan you quote is perfectly reasonable for this query...
the machine is even bright enough to figure out that it only
needs to evaluate the subquery once.
        regards, tom lane


Re: [SQL] index usage ... strange !?

From
Marten Feldtmann
Date:
> Marten Feldtmann <marten@feki.toppoint.de> writes:
> >  SELECT AO,AT,AV FROM P3AT
> >  WHERE EXISTS( SELECT AO FROM P3AT WHERE AV='12' AND AT=12 AND CI=17)
> 
> Um ... I dunno what you are trying to accomplish, but this query
> almost certainly doesn't do what you want.  Since the inner query
> is independent of the outer, you will get back either all the rows
> of P3AT (if the inner query yields rows) or none (if it doesn't).
> 
> The plan you quote is perfectly reasonable for this query...
> the machine is even bright enough to figure out that it only
> needs to evaluate the subquery once.
> 
>             regards, tom lane
> Yes, this seems to be a wrong statement. Actually  I would write it like:
select AO,AT,AV FROM P3AT WHERE AO IN (SELECT AO FROM P3AT WHERE ...)
but as PostgreSQL does not like "IN" very much I rewrote it to the wrongstatement above.
select P3AT.AO,P3AT.AT,P3AT.AV FROM P3AT,P3AT as B WHERE  (B.AO=P3AT.AO) AND (B.AV='12') AND (B.AT=12) AND (B.CI=17)


Therefore I have to rewrite the sql statement above to:
SELECT P3AT.AV,P3AT.AT,P3AT.AV FROM P3AT  WHEREEXISTS( SELECT B.AO FROM P3AT as B WHERE B.AO=P3AT.AO AND ..... )
Right ?

Marten Feldtmann



Re: [SQL] index usage ... strange !?

From
Marten Feldtmann
Date:
> 
>  Therefore I have to rewrite the sql statement above to:
> 
>  SELECT P3AT.AV,P3AT.AT,P3AT.AV FROM P3AT  WHERE
>  EXISTS( SELECT B.AO FROM P3AT as B WHERE B.AO=P3AT.AO AND ..... )
> 
Ok, indeed it was the wrong statement, therefore I rewrote it to:
SELECT P3AT.AO,P3AT.AT,P3AT.AV FROM P3ATWHERE  EXISTS    (SELECT B.AO FROM P3AT AS B WHERE B.AO=P3AT.AO AND ....)
There're indices on AO,AT,AV. Size of table about 23000 rows.
He uses indices only for the statement within EXISTS, which seems
to be pretty fast but for the outer SELECT statement he uses a
sequential scan ... which brings the the statement down from 23 ms to
2000ms.
Actually I don not understand it. The "B.AO=P3AT.AO" should create
internally a join (?) and therefore he could use the index on AO in
the outer select to create the result - but the seq scan seems to
be wrong.
Marten





Re: [SQL] index usage ... strange !?

From
Tom Lane
Date:
Marten Feldtmann <marten@feki.toppoint.de> writes:
>  Ok, indeed it was the wrong statement, therefore I rewrote it to:

>  SELECT P3AT.AO,P3AT.AT,P3AT.AV FROM P3AT
>  WHERE
>    EXISTS
>      (SELECT B.AO FROM P3AT AS B WHERE B.AO=P3AT.AO AND ....)

>  There're indices on AO,AT,AV. Size of table about 23000 rows.

>  He uses indices only for the statement within EXISTS, which seems
> to be pretty fast but for the outer SELECT statement he uses a
> sequential scan ... which brings the the statement down from 23 ms to
> 2000ms.

>  Actually I don not understand it. The "B.AO=P3AT.AO" should create
> internally a join (?) and therefore he could use the index on AO in
> the outer select to create the result - but the seq scan seems to
> be wrong.

You are right, this would be better done as a join, but the system is
not currently smart enough to rewrite a sub-SELECT into a join.  You
have to do it yourself :-(.

I think preserving the semantics of this exactly would require outer
joins, which we don't yet have, but you could get close with DISTINCT:

SELECT DISTINCT A.AO,A.AT,A.AV FROM P3AT A, P3AT B
WHERE A.AO = B.AO AND ...;

Both of these issues are on the TODO list, and probably will get
addressed in a release or three...
        regards, tom lane