Thread: Multi-column index: Which column order

Multi-column index: Which column order

From
Sebastien Flaesch
Date:
Hello!

When creating an index on multiple columns, does the order of the columns matter?
(I guess so)

It's mostly for SELECT statements using a condition that include ALL columns of the index (pkey):

     SELECT * FROM art WHERE etb='L1' and code='ART345'

I would naturally put the columns with the most various values first, and

For example, if the "code" column contains thousands of various item ids like 'SXZ874', 'ERF345', ... while the "etb" column contains a dozen of values like "L1", "LT" and "BX".

Which one is best?

CREATE UNIQUE INDEX ix1 ON art (code, etb)
or
CREATE UNIQUE INDEX ix1 ON art (etb, code)

(or its PRIMARY KEY equivalent)

Does it depend on the type of index (Btree, GiST, etc) ?

I could not find that information in the doc.

Seb

Re: Multi-column index: Which column order

From
Erik Wienhold
Date:
> On 14/02/2023 18:53 CET Sebastien Flaesch <sebastien.flaesch@4js.com> wrote:
>
> Hello!
>
> When creating an index on multiple columns, does the order of the columns
> matter? (I guess so)

Maybe, depending on the queries.

> It's mostly for SELECT statements using a condition that include ALL
> columns of the index (pkey):
>
> SELECT * FROM art WHERE etb='L1' and code='ART345'
>
> I would naturally put the columns with the most various values first, and
>
> For example, if the "code" column contains thousands of various item ids
> like 'SXZ874', 'ERF345', ... while the "etb" column contains a dozen of
> values like "L1", "LT" and "BX".
>
> Which one is best?
>
> CREATE UNIQUE INDEX ix1 ON art (code, etb)
> or
> CREATE UNIQUE INDEX ix1 ON art (etb, code)
>
> (or its PRIMARY KEY equivalent)

It should not make any difference for the query above.  It can make a
difference for queries that only filter by the second index column or use
inequality constraints on those columns.

> Does it depend on the type of index (Btree, GiST, etc) ?
>
> I could not find that information in the doc.

Yes, see the documentation on multicolumn indexes with details on how they
are used: https://www.postgresql.org/docs/current/indexes-multicolumn.html

But you're limited to btree anyway if you're only interested in unique
indexes.

--
Erik



Re: Multi-column index: Which column order

From
Laurenz Albe
Date:
On Tue, 2023-02-14 at 17:53 +0000, Sebastien Flaesch wrote:
> When creating an index on multiple columns, does the order of the columns matter?
> (I guess so)

It does, but not in your case.

> It's mostly for SELECT statements using a condition that include ALL columns of the index (pkey):
>
>      SELECT * FROM art WHERE etb='L1' and code='ART345'
>
> I would naturally put the columns with the most various values first, and
>
> For example, if the "code" column contains thousands of various item ids like 'SXZ874',
> 'ERF345', ... while the "etb" column contains a dozen of values like "L1", "LT" and "BX".
>
> Which one is best?
>
> CREATE UNIQUE INDEX ix1 ON art (code, etb)
> or
> CREATE UNIQUE INDEX ix1 ON art (etb, code)
>
> (or its PRIMARY KEY equivalent)

Both are the same.

There is an old myth that says that you should use the moew selective column first
(which would be "code"), but that is just a myth.

Order makes a difference for queries like "WHERE etb = 'L1' and code LIKE 'ART345'"
or "WHERE code = 'ART345' ORDER BY etb", but not for two equality comparisons.

> Does it depend on the type of index (Btree, GiST, etc) ?

Yes, it does.  For B-tree and GiST indexes order matters, for hash, GIN and BRIN
indexes it doesn't (for various reasons).

Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com



Re: Multi-column index: Which column order

From
Ron
Date:
On 2/15/23 02:46, Laurenz Albe wrote:
[snip]
> Which one is best?
>> CREATE UNIQUE INDEX ix1 ON art (code, etb)
>> or
>> CREATE UNIQUE INDEX ix1 ON art (etb, code)
>>
>> (or its PRIMARY KEY equivalent)
> Both are the same.
>
> There is an old myth that says that you should use the moew selective column first
> (which would be "code"), but that is just a myth.

Only on Postgresql?

-- 
Born in Arizona, moved to Babylonia.



Re: Multi-column index: Which column order

From
Laurenz Albe
Date:
On Wed, 2023-02-15 at 10:20 -0600, Ron wrote:
> On 2/15/23 02:46, Laurenz Albe wrote:
> > Which one is best?
> > > CREATE UNIQUE INDEX ix1 ON art (code, etb)
> > > or
> > > CREATE UNIQUE INDEX ix1 ON art (etb, code)
> > >
> > > (or its PRIMARY KEY equivalent)
> > Both are the same.
> >
> > There is an old myth that says that you should use the more selective column first
> > (which would be "code"), but that is just a myth.
>
> Only on Postgresql?

No, on all relational databases that use B-tree indexes.

Look at how many index entries have to be scanned in both cases, and you will see.

Yours,
Laurenz Albe



Re: Multi-column index: Which column order

From
Ron
Date:
On 2/15/23 21:45, Laurenz Albe wrote:
> On Wed, 2023-02-15 at 10:20 -0600, Ron wrote:
>> On 2/15/23 02:46, Laurenz Albe wrote:
[snip]
>>> Both are the same.
>>> There is an old myth that says that you should use the more selective column first
>>> (which would be "code"), but that is just a myth.
>> Only on Postgresql?
> No, on all relational databases that use B-tree indexes.

Not only is "all" is a very absolute word (I know of a counter-example), but 
querying on the second segment means that you have to scan the whole tree 
instead of isolating one sub-branch.

-- 
Born in Arizona, moved to Babylonia.



Re: Multi-column index: Which column order

From
Laurenz Albe
Date:
On Wed, 2023-02-15 at 22:08 -0600, Ron wrote:
> On 2/15/23 21:45, Laurenz Albe wrote:
> > On Wed, 2023-02-15 at 10:20 -0600, Ron wrote:
> > > On 2/15/23 02:46, Laurenz Albe wrote:
> > > > Both are the same.
> > > > There is an old myth that says that you should use the more selective column first
> > > > (which would be "code"), but that is just a myth.
> > >
> > > Only on Postgresql?
> >
> > No, on all relational databases that use B-tree indexes.
>
> Not only is "all" is a very absolute word (I know of a counter-example), but
> querying on the second segment means that you have to scan the whole tree
> instead of isolating one sub-branch.

The question was about a multi-column index where all columns are compared
with "equal" in the WHERE condition.  For other cases, order matters.

I am aware of the danger of absolute claims, but as far as I can tell, this
is in the nature of B-tree indexes.

Yours,
Laurenz Albe