Home > mailing lists

Re: Multiple indexes or multi-column index? - Mailing list pgsql-admin

From	Marc Mitchell
Subject	Re: Multiple indexes or multi-column index?
Date	August 13, 2002 20:00:08
Msg-id	016001c2430c$6dba7540$6e01050a@eisolution.com Whole thread Raw
In response to	Multiple indexes or multi-column index? (Jeff Boes <jboes@nexcerpt.com>)
List	pgsql-admin

Tree view

As has already been said, in order to use the index, the left-most index
columns must be supplied as part of your qualification.  But, beyond the
basics of when an index can or cannot be used, I would say you should
consider three additional issues: the distribution of values for any of
these columns, the effects of joining from index to base table, and the
volatility of the columns being indexed.

For example, if foo.a has potentially 100,000 different values then there
would seem to be little value in indexing the remaining three columns since
with a value for foo.a alone, you'll quickly narrow your request to a
handful of rows.  Extra columns just make the index entries themselves
wider.  The one benefit is that, by including columns within an index, even
if not used as part of the index access strategy, can eliminate the need to
join back to the base table itself if all the columns requested are within
the index itself.  HOWEVER: while I know this is the case with other RDBMS
products, I can't verify this within Postgres.  If I do a

"SELECT foo.index_column FROM foo WHERE foo.indexed_column LIKE 'ABC*';"

it runs the exact EXPLAIN results (aside from result with) as:

"SELECT foo.index_column FROM foo WHERE foo.indexed_column LIKE 'ABC*';"

so I'm not sure if the join from index to base table is being avoided or
not in the former case.

Furthermore, balance the ordering of columns between what is required for
index applicability based on the queries you run AND the distribution of
values for the columns you are indexing.  If, for example, all queries will
supply at least foo.a and foo.b, whichever of these two columns has the
fewer rows per given value should be listed first.

You should also consider the volatility of the columns you are including
within an index.  While the effect on inserts is basically the same (aside
from the width issue), adding extra columns to an index that are updated a
lot is not as desirable as extra columns that don't change.

Hope this is useful.

Marc Mitchell - Senior Application Architect
Enterprise Information Solutions, Inc.
Downers Grove, IL 60515
marcm@eisolution.com


----- Original Message -----
From: "Jeff Boes" <jboes@nexcerpt.com>
To: <pgsql-admin@postgresql.org>
Sent: Tuesday, August 13, 2002 3:14 PM
Subject: [ADMIN] Multiple indexes or multi-column index?


> Given a large (100,000+ rows) table such as:
>
> CREATE TABLE foo (
>   a INTEGER,
>   b INTEGER,
>   c TIMESTAMP,
>   d TEXT
>   /* possibly several more columns in here */
> );
>
> What are the pros and cons of these two different approaches to indexes?
>
> CREATE huge_index ON foo (a,b,c,d);
>
> vs.
>
> CREATE a_index ON foo (a);
> CREATE b_index ON foo (b);
> CREATE c_index ON foo (c);
> CREATE d_index ON foo (d);
>
> --
> Jeff Boes                                      vox 269.226.9550 ext 24
> Database Engineer                                     fax 269.349.9076
> Nexcerpt, Inc.                                 http://www.nexcerpt.com
>            ...Nexcerpt... Extend your Expertise
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
> http://archives.postgresql.org

pgsql-admin by date:

From: Oleg Bartunov
Date: 13 August 2002, 19:21:27
Subject: Re: Multiple indexes or multi-column index?

From: Tom Lane
Date: 13 August 2002, 20:03:00
Subject: Re: performance tuning: shared_buffers, sort_mem; swap

Re: Multiple indexes or multi-column index? - Mailing list pgsql-admin

Previous

Next