Thread: Multicolum index and primary key

Multicolum index and primary key

From

Michele Bendazzoli

Date:

17 November 2003, 11:53:55

I wonder if is useless to set some indexes for columns contained in a
multifield primary key.

Suppose by example that one have a table1 with a primary key over three
field (a, b, c): 

ALTER TABLE public.table1 ADD CONSTRAINT table1_pkey PRIMARY KEY(a, b, c);

are the indexes over (a) and (a, b) redundant (and so useless)?

I.e. with other words the question is, if I have a primary key set for
the a, b, c fields the queries 

SELECT 1 from table1 where a='a value'

and

SELECT 1 from table1 where a='a value' AND b='another value'

use the implicit index created by the primary key or not?

Thank you in advance for any advice.

ciao, Michele 

p.s. I know, I'll have to begin to use the explain command ...
I promise I'll do it ;-)

Re: Multicolum index and primary key

From

Peter Eisentraut

Date:

17 November 2003, 12:15:10

Michele Bendazzoli writes:

> ALTER TABLE public.table1
>   ADD CONSTRAINT table1_pkey PRIMARY KEY(a, b, c);
>
> are the indexes over (a) and (a, b) redundant (and so useless)?

Exactly.

> p.s. I know, I'll have to begin to use the explain command ...
> I promise I'll do it ;-)

You could also have read the documentation about multicolumn indexes,
because it contains exactly this example.

-- 
Peter Eisentraut   peter_e@gmx.net

Re: Multicolum index and primary key

From

Tomasz Myrta

Date:

17 November 2003, 12:16:55

Dnia 2003-11-17 18:00, Użytkownik Michele Bendazzoli napisał:
> p.s. I know, I'll have to begin to use the explain command ...
> I promise I'll do it ;-)

Use the explain analyze command and then answer yourself ;-)

Regards,
Tomasz Myrta

Re: Multicolum index and primary key

From

Rod Taylor

Date:

17 November 2003, 12:23:53

> Suppose by example that one have a table1 with a primary key over three
> field (a, b, c): 
..
> are the indexes over (a) and (a, b) redundant (and so useless)?

Yes, they are redundant not not necessarily useless.

In short, an index with 3 keys will be larger than an index with 1 key,
as such PostgreSQL may choose to use the single key index to reduce the
number of pages it needs to pull off the disk.

That said, if the 3 key index is hit regularly, it is likely to be in
memory where the rarely hit single key index is not.  This would make
going through the 3 key data faster (although there is more of it) than
retrieving the single key data from disk, then processing.

To top it all off, managing 3 indexes takes significantly longer during
INSERT and UPDATE than manging a single larger index does.


So... Are they useless?

The primary key is required, so it's index is required.

Do a majority of the queries against that table only supply one or two
pieces of information? If so, you may benefit, as these indexes will
tend to be in memory.

Is access on the table mostly read? Is the write penalty worth the
increased speed of write?

Is the additional storage space worth it? Indexes on thousands or
million of tuples are not free. 3 indexes will probably consume as much
diskspace as the original table did thus doubling your storage
requirements.


Finally, if everything is useful, I suggest you re-order some of the
indexes. a, ab, abc all require a to be a part of the query.  There is 0
benefit if b or c are supplied without a.

If you have determined 3 indexes will be useful, you might try a, ba,
cba. This way if b or c are supplied without a, they will receive some
benefit of the index with negligible impact to the queries that do use
a.

NOTE: I have made an assumption that the distribution of a, b and c are
equivalent. You will want the more selective field first in your index
to reduce the number of disk accesses -- so couple that with the odds
that b or c will be supplied without a.

Re: Multicolum index and primary key

From

Michele Bendazzoli

Date:

17 November 2003, 12:27:10

On Mon, 2003-11-17 at 17:14, Peter Eisentraut wrote:
> Michele Bendazzoli writes:
> 
> > ALTER TABLE public.table1
> >   ADD CONSTRAINT table1_pkey PRIMARY KEY(a, b, c);
> >
> > are the indexes over (a) and (a, b) redundant (and so useless)?
> 
> Exactly.

> > p.s. I know, I'll have to begin to use the explain command ...
> > I promise I'll do it ;-)
> 
> You could also have read the documentation about multicolumn indexes,
> because it contains exactly this example.

Thank you. I have read the chapter but I wonder if the implicit index
created by the primary key behaves as a normal indexes at all.

ciao, Michele

Re: Multicolum index and primary key

From

Michele Bendazzoli

Date:

17 November 2003, 12:28:49

On Mon, 2003-11-17 at 17:15, Tomasz Myrta wrote:
> Dnia 2003-11-17 18:00, Użytkownik Michele Bendazzoli napisał:
> > p.s. I know, I'll have to begin to use the explain command ...
> > I promise I'll do it ;-)
> 
> Use the explain analyze command and then answer yourself ;-)

I'm not in still in production and I have too few values ...

Thank you anyway

ciao, Michele

Re: Multicolum index and primary key

From

Michele Bendazzoli

Date:

17 November 2003, 12:53:53

On Mon, 2003-11-17 at 17:23, Rod Taylor wrote:
> > Suppose by example that one have a table1 with a primary key over three
> > field (a, b, c): 
> ..
> > are the indexes over (a) and (a, b) redundant (and so useless)?
> 
> Yes, they are redundant not not necessarily useless.
> 
> In short, an index with 3 keys will be larger than an index with 1 key,
> as such PostgreSQL may choose to use the single key index to reduce the
> number of pages it needs to pull off the disk.
> 
> That said, if the 3 key index is hit regularly, it is likely to be in
> memory where the rarely hit single key index is not.  This would make
> going through the 3 key data faster (although there is more of it) than
> retrieving the single key data from disk, then processing.
> 
> To top it all off, managing 3 indexes takes significantly longer during
> INSERT and UPDATE than manging a single larger index does.

> So... Are they useless?
> 
> The primary key is required, so it's index is required.

In fact ... 

> Do a majority of the queries against that table only supply one or two
> pieces of information? If so, you may benefit, as these indexes will
> tend to be in memory.
> Is access on the table mostly read? Is the write penalty worth the
> increased speed of write?
> Is the additional storage space worth it? Indexes on thousands or
> million of tuples are not free. 3 indexes will probably consume as much
> diskspace as the original table did thus doubling your storage
> requirements.
> 
> 
> Finally, if everything is useful, I suggest you re-order some of the
> indexes. a, ab, abc all require a to be a part of the query.  There is 0
> benefit if b or c are supplied without a.

a is always present in the queries ... and other that (a, ab, abc) i
have only to query (ac): so I think I have to index separately only
(ac).

> If you have determined 3 indexes will be useful, you might try a, ba,
> cba. This way if b or c are supplied without a, they will receive some
> benefit of the index with negligible impact to the queries that do use
> a.

Uhm, good point ... I'll have to think carefully.

> NOTE: I have made an assumption that the distribution of a, b and c are
> equivalent. You will want the more selective field first in your index
> to reduce the number of disk accesses -- so couple that with the odds
> that b or c will be supplied without a.

For now a is not selective at all because I have sell the service at
only one client (a may represent a function of the client). Obviously I
hope that in the future the number of distinct values on field 'a' grows
exponentially :-))

Thank you very much for the in depth advice.

ciao, Michele

Re: Multicolum index and primary key

From

Michele Bendazzoli

Date:

18 November 2003, 06:17:45

On Mon, 2003-11-17 at 18:13, Tomasz Myrta wrote:
> Dnia 2003-11-17 19:00, Użytkownik Michele Bendazzoli napisał:
> > a is always present in the queries ... and other that (a, ab, abc) i
> > have only to query (ac): so I think I have to index separately only
> > (ac).
> 
> For such cases consider changing primary key from (a,b,c) into (a,c,b)

b,c are exactly simmetric, so if I change the primary key to (a,c,b) I
have to index separately (a,b) ... rigth? 

ciao, Michele