Re: Multicolum index and primary key - Mailing list pgsql-sql

From Rod Taylor
Subject Re: Multicolum index and primary key
Date
Msg-id 1069086197.77562.12.camel@jester
Whole thread Raw
In response to Multicolum index and primary key  (Michele Bendazzoli <mickymouse@mickymouse.it>)
Responses Re: Multicolum index and primary key
List pgsql-sql
> Suppose by example that one have a table1 with a primary key over three
> field (a, b, c): 
..
> are the indexes over (a) and (a, b) redundant (and so useless)?

Yes, they are redundant not not necessarily useless.

In short, an index with 3 keys will be larger than an index with 1 key,
as such PostgreSQL may choose to use the single key index to reduce the
number of pages it needs to pull off the disk.

That said, if the 3 key index is hit regularly, it is likely to be in
memory where the rarely hit single key index is not.  This would make
going through the 3 key data faster (although there is more of it) than
retrieving the single key data from disk, then processing.

To top it all off, managing 3 indexes takes significantly longer during
INSERT and UPDATE than manging a single larger index does.


So... Are they useless?

The primary key is required, so it's index is required.

Do a majority of the queries against that table only supply one or two
pieces of information? If so, you may benefit, as these indexes will
tend to be in memory.

Is access on the table mostly read? Is the write penalty worth the
increased speed of write?

Is the additional storage space worth it? Indexes on thousands or
million of tuples are not free. 3 indexes will probably consume as much
diskspace as the original table did thus doubling your storage
requirements.


Finally, if everything is useful, I suggest you re-order some of the
indexes. a, ab, abc all require a to be a part of the query.  There is 0
benefit if b or c are supplied without a.

If you have determined 3 indexes will be useful, you might try a, ba,
cba. This way if b or c are supplied without a, they will receive some
benefit of the index with negligible impact to the queries that do use
a.

NOTE: I have made an assumption that the distribution of a, b and c are
equivalent. You will want the more selective field first in your index
to reduce the number of disk accesses -- so couple that with the odds
that b or c will be supplied without a.





pgsql-sql by date:

Previous
From: Tomasz Myrta
Date:
Subject: Re: Multicolum index and primary key
Next
From: Michele Bendazzoli
Date:
Subject: Re: Multicolum index and primary key