Thread: index usage ... strange !?
He's a small example and I'm trying to speed up this query and I can manage it :-( The table: P3AT: AO - char(15), with indexAT - Integer, with index AV - VARCHAR(80), with index valindCI - Integer, with index We've about 23000 rows in this table. I would like to execute: SELECT AO,AT,AV FROM P3ATWHERE EXISTS( SELECT AO FROM P3AT WHERE AV='12' AND AT=12 AND CI=17) Explain gives me: Result(946.19,0,0) InitPlan -> INdex Scan using valind on p3at (222,265,12) (fine !) ->Seq Scan on p3at (946,22235,26) (urgghhh ?) This statement takes about 2s to return the results. The select statement within exists just needs 23ms to find the AO value ! SELECT AO FROM P3AT WHERE AO='12' EXPLAIN GIVES ME: INDEX SCAN USING ATROWIND ON P3AT (2,2,12) In general I expect for each unique AO about 10-12 result rows. I've done several vacuum analyze (the first one crashed my database by the way :-(, this tool is really making me crazy - claiming that it can't remove the lock :-(). The reason seems to be the seq scan ... therefore how can I get rid of it ! I use PSQL 6.5.3 under SuSE 6.1. Marten
Marten Feldtmann <marten@feki.toppoint.de> writes: > SELECT AO,AT,AV FROM P3AT > WHERE EXISTS( SELECT AO FROM P3AT WHERE AV='12' AND AT=12 AND CI=17) Um ... I dunno what you are trying to accomplish, but this query almost certainly doesn't do what you want. Since the inner query is independent of the outer, you will get back either all the rows of P3AT (if the inner query yields rows) or none (if it doesn't). The plan you quote is perfectly reasonable for this query... the machine is even bright enough to figure out that it only needs to evaluate the subquery once. regards, tom lane
> Marten Feldtmann <marten@feki.toppoint.de> writes: > > SELECT AO,AT,AV FROM P3AT > > WHERE EXISTS( SELECT AO FROM P3AT WHERE AV='12' AND AT=12 AND CI=17) > > Um ... I dunno what you are trying to accomplish, but this query > almost certainly doesn't do what you want. Since the inner query > is independent of the outer, you will get back either all the rows > of P3AT (if the inner query yields rows) or none (if it doesn't). > > The plan you quote is perfectly reasonable for this query... > the machine is even bright enough to figure out that it only > needs to evaluate the subquery once. > > regards, tom lane > Yes, this seems to be a wrong statement. Actually I would write it like: select AO,AT,AV FROM P3AT WHERE AO IN (SELECT AO FROM P3AT WHERE ...) but as PostgreSQL does not like "IN" very much I rewrote it to the wrongstatement above. select P3AT.AO,P3AT.AT,P3AT.AV FROM P3AT,P3AT as B WHERE (B.AO=P3AT.AO) AND (B.AV='12') AND (B.AT=12) AND (B.CI=17) Therefore I have to rewrite the sql statement above to: SELECT P3AT.AV,P3AT.AT,P3AT.AV FROM P3AT WHEREEXISTS( SELECT B.AO FROM P3AT as B WHERE B.AO=P3AT.AO AND ..... ) Right ? Marten Feldtmann
> > Therefore I have to rewrite the sql statement above to: > > SELECT P3AT.AV,P3AT.AT,P3AT.AV FROM P3AT WHERE > EXISTS( SELECT B.AO FROM P3AT as B WHERE B.AO=P3AT.AO AND ..... ) > Ok, indeed it was the wrong statement, therefore I rewrote it to: SELECT P3AT.AO,P3AT.AT,P3AT.AV FROM P3ATWHERE EXISTS (SELECT B.AO FROM P3AT AS B WHERE B.AO=P3AT.AO AND ....) There're indices on AO,AT,AV. Size of table about 23000 rows. He uses indices only for the statement within EXISTS, which seems to be pretty fast but for the outer SELECT statement he uses a sequential scan ... which brings the the statement down from 23 ms to 2000ms. Actually I don not understand it. The "B.AO=P3AT.AO" should create internally a join (?) and therefore he could use the index on AO in the outer select to create the result - but the seq scan seems to be wrong. Marten
Marten Feldtmann <marten@feki.toppoint.de> writes: > Ok, indeed it was the wrong statement, therefore I rewrote it to: > SELECT P3AT.AO,P3AT.AT,P3AT.AV FROM P3AT > WHERE > EXISTS > (SELECT B.AO FROM P3AT AS B WHERE B.AO=P3AT.AO AND ....) > There're indices on AO,AT,AV. Size of table about 23000 rows. > He uses indices only for the statement within EXISTS, which seems > to be pretty fast but for the outer SELECT statement he uses a > sequential scan ... which brings the the statement down from 23 ms to > 2000ms. > Actually I don not understand it. The "B.AO=P3AT.AO" should create > internally a join (?) and therefore he could use the index on AO in > the outer select to create the result - but the seq scan seems to > be wrong. You are right, this would be better done as a join, but the system is not currently smart enough to rewrite a sub-SELECT into a join. You have to do it yourself :-(. I think preserving the semantics of this exactly would require outer joins, which we don't yet have, but you could get close with DISTINCT: SELECT DISTINCT A.AO,A.AT,A.AV FROM P3AT A, P3AT B WHERE A.AO = B.AO AND ...; Both of these issues are on the TODO list, and probably will get addressed in a release or three... regards, tom lane