Thread: Multicolum index and primary key
I wonder if is useless to set some indexes for columns contained in a multifield primary key. Suppose by example that one have a table1 with a primary key over three field (a, b, c): ALTER TABLE public.table1 ADD CONSTRAINT table1_pkey PRIMARY KEY(a, b, c); are the indexes over (a) and (a, b) redundant (and so useless)? I.e. with other words the question is, if I have a primary key set for the a, b, c fields the queries SELECT 1 from table1 where a='a value' and SELECT 1 from table1 where a='a value' AND b='another value' use the implicit index created by the primary key or not? Thank you in advance for any advice. ciao, Michele p.s. I know, I'll have to begin to use the explain command ... I promise I'll do it ;-)
Michele Bendazzoli writes: > ALTER TABLE public.table1 > ADD CONSTRAINT table1_pkey PRIMARY KEY(a, b, c); > > are the indexes over (a) and (a, b) redundant (and so useless)? Exactly. > p.s. I know, I'll have to begin to use the explain command ... > I promise I'll do it ;-) You could also have read the documentation about multicolumn indexes, because it contains exactly this example. -- Peter Eisentraut peter_e@gmx.net
Dnia 2003-11-17 18:00, Użytkownik Michele Bendazzoli napisał: > p.s. I know, I'll have to begin to use the explain command ... > I promise I'll do it ;-) Use the explain analyze command and then answer yourself ;-) Regards, Tomasz Myrta
> Suppose by example that one have a table1 with a primary key over three > field (a, b, c): .. > are the indexes over (a) and (a, b) redundant (and so useless)? Yes, they are redundant not not necessarily useless. In short, an index with 3 keys will be larger than an index with 1 key, as such PostgreSQL may choose to use the single key index to reduce the number of pages it needs to pull off the disk. That said, if the 3 key index is hit regularly, it is likely to be in memory where the rarely hit single key index is not. This would make going through the 3 key data faster (although there is more of it) than retrieving the single key data from disk, then processing. To top it all off, managing 3 indexes takes significantly longer during INSERT and UPDATE than manging a single larger index does. So... Are they useless? The primary key is required, so it's index is required. Do a majority of the queries against that table only supply one or two pieces of information? If so, you may benefit, as these indexes will tend to be in memory. Is access on the table mostly read? Is the write penalty worth the increased speed of write? Is the additional storage space worth it? Indexes on thousands or million of tuples are not free. 3 indexes will probably consume as much diskspace as the original table did thus doubling your storage requirements. Finally, if everything is useful, I suggest you re-order some of the indexes. a, ab, abc all require a to be a part of the query. There is 0 benefit if b or c are supplied without a. If you have determined 3 indexes will be useful, you might try a, ba, cba. This way if b or c are supplied without a, they will receive some benefit of the index with negligible impact to the queries that do use a. NOTE: I have made an assumption that the distribution of a, b and c are equivalent. You will want the more selective field first in your index to reduce the number of disk accesses -- so couple that with the odds that b or c will be supplied without a.
On Mon, 2003-11-17 at 17:14, Peter Eisentraut wrote: > Michele Bendazzoli writes: > > > ALTER TABLE public.table1 > > ADD CONSTRAINT table1_pkey PRIMARY KEY(a, b, c); > > > > are the indexes over (a) and (a, b) redundant (and so useless)? > > Exactly. > > p.s. I know, I'll have to begin to use the explain command ... > > I promise I'll do it ;-) > > You could also have read the documentation about multicolumn indexes, > because it contains exactly this example. Thank you. I have read the chapter but I wonder if the implicit index created by the primary key behaves as a normal indexes at all. ciao, Michele
On Mon, 2003-11-17 at 17:15, Tomasz Myrta wrote: > Dnia 2003-11-17 18:00, Użytkownik Michele Bendazzoli napisał: > > p.s. I know, I'll have to begin to use the explain command ... > > I promise I'll do it ;-) > > Use the explain analyze command and then answer yourself ;-) I'm not in still in production and I have too few values ... Thank you anyway ciao, Michele
On Mon, 2003-11-17 at 17:23, Rod Taylor wrote: > > Suppose by example that one have a table1 with a primary key over three > > field (a, b, c): > .. > > are the indexes over (a) and (a, b) redundant (and so useless)? > > Yes, they are redundant not not necessarily useless. > > In short, an index with 3 keys will be larger than an index with 1 key, > as such PostgreSQL may choose to use the single key index to reduce the > number of pages it needs to pull off the disk. > > That said, if the 3 key index is hit regularly, it is likely to be in > memory where the rarely hit single key index is not. This would make > going through the 3 key data faster (although there is more of it) than > retrieving the single key data from disk, then processing. > > To top it all off, managing 3 indexes takes significantly longer during > INSERT and UPDATE than manging a single larger index does. > So... Are they useless? > > The primary key is required, so it's index is required. In fact ... > Do a majority of the queries against that table only supply one or two > pieces of information? If so, you may benefit, as these indexes will > tend to be in memory. > Is access on the table mostly read? Is the write penalty worth the > increased speed of write? > Is the additional storage space worth it? Indexes on thousands or > million of tuples are not free. 3 indexes will probably consume as much > diskspace as the original table did thus doubling your storage > requirements. > > > Finally, if everything is useful, I suggest you re-order some of the > indexes. a, ab, abc all require a to be a part of the query. There is 0 > benefit if b or c are supplied without a. a is always present in the queries ... and other that (a, ab, abc) i have only to query (ac): so I think I have to index separately only (ac). > If you have determined 3 indexes will be useful, you might try a, ba, > cba. This way if b or c are supplied without a, they will receive some > benefit of the index with negligible impact to the queries that do use > a. Uhm, good point ... I'll have to think carefully. > NOTE: I have made an assumption that the distribution of a, b and c are > equivalent. You will want the more selective field first in your index > to reduce the number of disk accesses -- so couple that with the odds > that b or c will be supplied without a. For now a is not selective at all because I have sell the service at only one client (a may represent a function of the client). Obviously I hope that in the future the number of distinct values on field 'a' grows exponentially :-)) Thank you very much for the in depth advice. ciao, Michele
On Mon, 2003-11-17 at 18:13, Tomasz Myrta wrote: > Dnia 2003-11-17 19:00, Użytkownik Michele Bendazzoli napisał: > > a is always present in the queries ... and other that (a, ab, abc) i > > have only to query (ac): so I think I have to index separately only > > (ac). > > For such cases consider changing primary key from (a,b,c) into (a,c,b) b,c are exactly simmetric, so if I change the primary key to (a,c,b) I have to index separately (a,b) ... rigth? ciao, Michele