Home > mailing lists

Re: Multicolum index and primary key - Mailing list pgsql-sql

From	Michele Bendazzoli
Subject	Re: Multicolum index and primary key
Date	November 17, 2003 15:53:53
Msg-id	1069092026.8338.52.camel@mickymouse.sintel Whole thread Raw
In response to	Re: Multicolum index and primary key (Rod Taylor <pg@rbt.ca>)
List	pgsql-sql

Tree view

On Mon, 2003-11-17 at 17:23, Rod Taylor wrote:
> > Suppose by example that one have a table1 with a primary key over three
> > field (a, b, c): 
> ..
> > are the indexes over (a) and (a, b) redundant (and so useless)?
> 
> Yes, they are redundant not not necessarily useless.
> 
> In short, an index with 3 keys will be larger than an index with 1 key,
> as such PostgreSQL may choose to use the single key index to reduce the
> number of pages it needs to pull off the disk.
> 
> That said, if the 3 key index is hit regularly, it is likely to be in
> memory where the rarely hit single key index is not.  This would make
> going through the 3 key data faster (although there is more of it) than
> retrieving the single key data from disk, then processing.
> 
> To top it all off, managing 3 indexes takes significantly longer during
> INSERT and UPDATE than manging a single larger index does.

> So... Are they useless?
> 
> The primary key is required, so it's index is required.

In fact ... 

> Do a majority of the queries against that table only supply one or two
> pieces of information? If so, you may benefit, as these indexes will
> tend to be in memory.
> Is access on the table mostly read? Is the write penalty worth the
> increased speed of write?
> Is the additional storage space worth it? Indexes on thousands or
> million of tuples are not free. 3 indexes will probably consume as much
> diskspace as the original table did thus doubling your storage
> requirements.
> 
> 
> Finally, if everything is useful, I suggest you re-order some of the
> indexes. a, ab, abc all require a to be a part of the query.  There is 0
> benefit if b or c are supplied without a.

a is always present in the queries ... and other that (a, ab, abc) i
have only to query (ac): so I think I have to index separately only
(ac).

> If you have determined 3 indexes will be useful, you might try a, ba,
> cba. This way if b or c are supplied without a, they will receive some
> benefit of the index with negligible impact to the queries that do use
> a.

Uhm, good point ... I'll have to think carefully.

> NOTE: I have made an assumption that the distribution of a, b and c are
> equivalent. You will want the more selective field first in your index
> to reduce the number of disk accesses -- so couple that with the odds
> that b or c will be supplied without a.

For now a is not selective at all because I have sell the service at
only one client (a may represent a function of the client). Obviously I
hope that in the future the number of distinct values on field 'a' grows
exponentially :-))

Thank you very much for the in depth advice.

ciao, Michele

pgsql-sql by date:

From: Michele Bendazzoli
Date: 17 November 2003, 15:28:49
Subject: Re: Multicolum index and primary key

From: ritchie turner
Date: 17 November 2003, 17:33:42
Subject: COPY

Re: Multicolum index and primary key - Mailing list pgsql-sql

Previous

Next