Thread: partial indexes

partial indexes

From

Unregistered

Date:

01 February 2004, 22:18:42

I have following situation:
one "message" table and 3 other "message_{1,2,3}" tables which inherit
from "message".
Every day +-50 000 "messages" are being inserted.

I needed something  extra to speed up the queries instead of only using
indexes, so I thought of doing something like you can do in Oracle,
table partitioning.
In postgresql you can simulate this using the partial indexes.

So what I did was:
made an "today_idx", "2_weeks_idx", "1 month_idx" on the attribute
"sent_date" to speed up the queries using the "sent_date"
attribute,which 90% of the times is queried on. These indexes are
rebuild every night.

When I was testing this setup, the indexes I used where like:
today_idx: 1/02
2_weeks_idx: 18/01  -->  1/02
month_idx: 2/01 --> 1/02

When quering something like:
select * from messages where sent_date = '2004-02-01'

the optimiser chooses to search in the month_idx index instead of using
the optimal today_idx. This is because today (1/02) is also included in
the month_idx (and also the 2weeks_idx)
So I had to recreate the indexes excluding the previous ones and
borders. Is this normal?

My question:
Is this a good way to solve the problem? are there any other ways I can
do this?
Is the inheritance a good feature / stable in postgresql (not a lot of
documentation about)

Tnx in advance


Unregistered -
------------------------------------------------------------------------
Posted via http://www.webservertalk.com
------------------------------------------------------------------------
View this thread: http://www.webservertalk.com/message106661.html

Re: partial indexes

From

Tom Lane

Date:

01 February 2004, 23:53:34

Unregistered <Guest.10y6ky@mail.webservertalk.com> writes:
> So what I did was:
> made an "today_idx", "2_weeks_idx", "1 month_idx" on the attribute
> "sent_date" to speed up the queries using the "sent_date"
> attribute,which 90% of the times is queried on. These indexes are
> rebuild every night.

> When I was testing this setup, the indexes I used where like:
> today_idx: 1/02
> 2_weeks_idx: 18/01  -->  1/02
> month_idx: 2/01 --> 1/02

> When quering something like:
> select * from messages where sent_date = '2004-02-01'

> the optimiser chooses to search in the month_idx index instead of using
> the optimal today_idx. This is because today (1/02) is also included in
> the month_idx (and also the 2weeks_idx)

There's really hardly any difference in practice.  The exact same number
of index entries will be scanned in both cases.  If you're lucky there
might be one less level of btree to descend through at the start of the
scan, but probably not more than that considering the index fanout will
be in the hundreds.  The planner's cost model for this is too crude to
tell the difference, and so it estimates the costs the same and might
choose either index.

The above really isn't a very profitable way to use partial indexes,
anyway.  You're triply indexing the most active part of the table,
thus tripling your index-updating costs, for what certainly won't
be a 3x payoff in access time.  See the Postgres user manual for some
more plausible use-cases for partial indexes.

            regards, tom lane