Thread: optimizing "between" queries

optimizing "between" queries

From
Kyle
Date:
I have a table where I need to select elements that are between
certain values.  The typical query would be like:

  select foo from bar where
       element1 between ? and ? and  -- using some val_1a and val_1b
       element2 between ? and ? and  -- using some val_2a and val_2b
       ...
       element30 between ? and ?;

ie- we've got 30 elements here.  The runtime is acceptable now with
10k rows in the table (about 0.04 seconds).  However, the table size
will grow to the milions in the near future.

The data for element1..30 are int2's and columns element1..element30
have been ordered by how well they discriminate.  Can I use "cluster"
on more than one column with indexes per column to improve the search
time?  How else might I be able to tweak this?

thanks,
Kyle



Re: optimizing "between" queries

From
Tom Lane
Date:
Kyle <kaf@nwlink.com> writes:
> I have a table where I need to select elements that are between
> certain values.  The typical query would be like:

>   select foo from bar where
>        element1 between ? and ? and  -- using some val_1a and val_1b
>        element2 between ? and ? and  -- using some val_2a and val_2b
>        ...
>        element30 between ? and ?;

> ie- we've got 30 elements here.  The runtime is acceptable now with
> 10k rows in the table (about 0.04 seconds).  However, the table size
> will grow to the milions in the near future.

> The data for element1..30 are int2's and columns element1..element30
> have been ordered by how well they discriminate.  Can I use "cluster"
> on more than one column with indexes per column to improve the search
> time?  How else might I be able to tweak this?

An indexscan can only use one index.  You could use a multicolumn index
effectively with such a query:

    create index fooi on foo(element1, element2, ...);

which will make use of the clauses
       element1 between ? and ? and
       element2 between ? and ?
as indexscan limits, with the rest checked on-the-fly.

Note that such an index is completely ineffective if you don't specify a
WHERE constraint for element1 --- in general, the system knows how to
use the first K columns of an N-column index if there are WHERE
constraints for all K columns.

I doubt it'd be worth your while to set up an index with more than, say,
half a dozen columns ... maybe not even that many.  The more columns,
the more specialized the use of the index is --- and the larger and
slower to update/search it is.  It's good advice in general not to go
overboard with creating lots of specialized indexes.  Think hard about
what fraction of your queries can really exploit a particular index.

            regards, tom lane