Thread: Query optimization and indexes

Query optimization and indexes

From
felix@crowfix.com
Date:
Suppose I have an index on 5 columns (A, B, C, D, E).

If my WHERE clause is not in that order, will the optimizer reorder
them as necessary and possible?

    WHERE A=1 AND C=3 AND B=2 AND E=5 AND D=4

Obviously it can't reorder them in all cases:

    WHERE A=1 AND (C=3 OR B=2) AND (E=5 OR D=4)

If I don't specify columns in the WHERE clause, how much can it use
the index?  I think it is smart enough to use beginning columns:

    WHERE A=1 AND B=2

How about skipping leading columns?

    WHERE B=2

How about skipping intermediate columns?

    WHERE A=1 AND C=3

Or both, which is probably the same?

    WHERE B=2 AND D=4?

--
            ... _._. ._ ._. . _._. ._. ___ .__ ._. . .__. ._ .. ._.
     Felix Finch: scarecrow repairman & rocket surgeon / felix@crowfix.com
  GPG = E987 4493 C860 246C 3B1E  6477 7838 76E9 182E 8151 ITAR license #4933
I've found a solution to Fermat's Last Theorem but I see I've run out of room o

Re: Query optimization and indexes

From
Tom Lane
Date:
felix@crowfix.com writes:
> Suppose I have an index on 5 columns (A, B, C, D, E).
> If my WHERE clause is not in that order, will the optimizer reorder
> them as necessary and possible?

Yes, the optimizer understands about commutativity/associativity of
AND and OR ;-)

> If I don't specify columns in the WHERE clause, how much can it use
> the index?

Before (if memory serves) 8.1, the planner would only consider leading
index columns as potential indexscan qualifiers.  So given

    where a = 5 and c = 4;

only the a = 5 clause would be used with the index.  As of 8.1 it will
consider using nonconsecutive index columns, but if you think for a bit
about the storage order of a btree, you'll realize that you really need
leading columns to keep down the amount of the index that gets scanned.
A lot of the time, such a plan will be rejected as apparently more
expensive than a seqscan.

(This is for btrees, I don't recall the state of play for GIST indexes
exactly.)

            regards, tom lane

Re: Query optimization and indexes

From
Gregory Stark
Date:
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Before (if memory serves) 8.1, the planner would only consider leading
> index columns as potential indexscan qualifiers.  So given
>
>     where a = 5 and c = 4;
>
> only the a = 5 clause would be used with the index.  As of 8.1 it will
> consider using nonconsecutive index columns

Really? Is this the "skip scan" plan people were pining for? I missed when
that happened.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com

Re: Query optimization and indexes

From
Tom Lane
Date:
Gregory Stark <gsstark@mit.edu> writes:
> Tom Lane <tgl@sss.pgh.pa.us> writes:
>> only the a = 5 clause would be used with the index.  As of 8.1 it will
>> consider using nonconsecutive index columns

> Really? Is this the "skip scan" plan people were pining for?

No, there's no skip scan, it just applies all the indexable-column
checks within the index instead of making a trip to the heap.
For instance consider
    WHERE a >= 4 AND a < 7 AND c > 5
with index entries

    A    B    C

    3    9    8
    3    9    9
    4    0    0    <- search starts here
    4    0    1    reject
    ...
    4    0    5    reject
    4    0    6    accept (visit heap)
    4    0    9    accept
    4    1    0    reject
    ...
    6    9    8    accept
    6    9    9    accept
    7    0    0    <- search ends when we reach here
    7    0    1

If the condition on C is very selective then we might find ourselves
scanning over a lot of rejected entries within the possible bounds
for A.  The problem is to guess whether re-descending the search tree
will win compared to just slogging forward, and if so to generate a
suitable search key for each intermediate descent.

            regards, tom lane