Thread: Multi-column index: Which column order
Hello!
When creating an index on multiple columns, does the order of the columns matter?
(I guess so)
It's mostly for SELECT statements using a condition that include ALL columns of the index (pkey):
SELECT * FROM art WHERE etb='L1' and code='ART345'
I would naturally put the columns with the most various values first, and
For example, if the "code" column contains thousands of various item ids like 'SXZ874', 'ERF345', ... while the "etb" column contains a dozen of values like "L1", "LT" and "BX".
Which one is best?
CREATE UNIQUE INDEX ix1 ON art (code, etb)
or
CREATE UNIQUE INDEX ix1 ON art (etb, code)
(or its PRIMARY KEY equivalent)
Does it depend on the type of index (Btree, GiST, etc) ?
I could not find that information in the doc.
Seb
> On 14/02/2023 18:53 CET Sebastien Flaesch <sebastien.flaesch@4js.com> wrote: > > Hello! > > When creating an index on multiple columns, does the order of the columns > matter? (I guess so) Maybe, depending on the queries. > It's mostly for SELECT statements using a condition that include ALL > columns of the index (pkey): > > SELECT * FROM art WHERE etb='L1' and code='ART345' > > I would naturally put the columns with the most various values first, and > > For example, if the "code" column contains thousands of various item ids > like 'SXZ874', 'ERF345', ... while the "etb" column contains a dozen of > values like "L1", "LT" and "BX". > > Which one is best? > > CREATE UNIQUE INDEX ix1 ON art (code, etb) > or > CREATE UNIQUE INDEX ix1 ON art (etb, code) > > (or its PRIMARY KEY equivalent) It should not make any difference for the query above. It can make a difference for queries that only filter by the second index column or use inequality constraints on those columns. > Does it depend on the type of index (Btree, GiST, etc) ? > > I could not find that information in the doc. Yes, see the documentation on multicolumn indexes with details on how they are used: https://www.postgresql.org/docs/current/indexes-multicolumn.html But you're limited to btree anyway if you're only interested in unique indexes. -- Erik
On Tue, 2023-02-14 at 17:53 +0000, Sebastien Flaesch wrote: > When creating an index on multiple columns, does the order of the columns matter? > (I guess so) It does, but not in your case. > It's mostly for SELECT statements using a condition that include ALL columns of the index (pkey): > > SELECT * FROM art WHERE etb='L1' and code='ART345' > > I would naturally put the columns with the most various values first, and > > For example, if the "code" column contains thousands of various item ids like 'SXZ874', > 'ERF345', ... while the "etb" column contains a dozen of values like "L1", "LT" and "BX". > > Which one is best? > > CREATE UNIQUE INDEX ix1 ON art (code, etb) > or > CREATE UNIQUE INDEX ix1 ON art (etb, code) > > (or its PRIMARY KEY equivalent) Both are the same. There is an old myth that says that you should use the moew selective column first (which would be "code"), but that is just a myth. Order makes a difference for queries like "WHERE etb = 'L1' and code LIKE 'ART345'" or "WHERE code = 'ART345' ORDER BY etb", but not for two equality comparisons. > Does it depend on the type of index (Btree, GiST, etc) ? Yes, it does. For B-tree and GiST indexes order matters, for hash, GIN and BRIN indexes it doesn't (for various reasons). Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
On 2/15/23 02:46, Laurenz Albe wrote: [snip] > Which one is best? >> CREATE UNIQUE INDEX ix1 ON art (code, etb) >> or >> CREATE UNIQUE INDEX ix1 ON art (etb, code) >> >> (or its PRIMARY KEY equivalent) > Both are the same. > > There is an old myth that says that you should use the moew selective column first > (which would be "code"), but that is just a myth. Only on Postgresql? -- Born in Arizona, moved to Babylonia.
On Wed, 2023-02-15 at 10:20 -0600, Ron wrote: > On 2/15/23 02:46, Laurenz Albe wrote: > > Which one is best? > > > CREATE UNIQUE INDEX ix1 ON art (code, etb) > > > or > > > CREATE UNIQUE INDEX ix1 ON art (etb, code) > > > > > > (or its PRIMARY KEY equivalent) > > Both are the same. > > > > There is an old myth that says that you should use the more selective column first > > (which would be "code"), but that is just a myth. > > Only on Postgresql? No, on all relational databases that use B-tree indexes. Look at how many index entries have to be scanned in both cases, and you will see. Yours, Laurenz Albe
On 2/15/23 21:45, Laurenz Albe wrote: > On Wed, 2023-02-15 at 10:20 -0600, Ron wrote: >> On 2/15/23 02:46, Laurenz Albe wrote: [snip] >>> Both are the same. >>> There is an old myth that says that you should use the more selective column first >>> (which would be "code"), but that is just a myth. >> Only on Postgresql? > No, on all relational databases that use B-tree indexes. Not only is "all" is a very absolute word (I know of a counter-example), but querying on the second segment means that you have to scan the whole tree instead of isolating one sub-branch. -- Born in Arizona, moved to Babylonia.
On Wed, 2023-02-15 at 22:08 -0600, Ron wrote: > On 2/15/23 21:45, Laurenz Albe wrote: > > On Wed, 2023-02-15 at 10:20 -0600, Ron wrote: > > > On 2/15/23 02:46, Laurenz Albe wrote: > > > > Both are the same. > > > > There is an old myth that says that you should use the more selective column first > > > > (which would be "code"), but that is just a myth. > > > > > > Only on Postgresql? > > > > No, on all relational databases that use B-tree indexes. > > Not only is "all" is a very absolute word (I know of a counter-example), but > querying on the second segment means that you have to scan the whole tree > instead of isolating one sub-branch. The question was about a multi-column index where all columns are compared with "equal" in the WHERE condition. For other cases, order matters. I am aware of the danger of absolute claims, but as far as I can tell, this is in the nature of B-tree indexes. Yours, Laurenz Albe