Re: Clustered index to preserve data locality in a multitenant application? - Mailing list pgsql-general

From	Igor Neyman
Subject	Re: Clustered index to preserve data locality in a multitenant application?
Date	September 1, 2016 13:08:33
Msg-id	MWHPR07MB2877984B8E63B20E024FE031DAE20@MWHPR07MB2877.namprd07.prod.outlook.com
In response to	Re: Clustered index to preserve data locality in a multitenant application? (Nicolas Grilly <nicolas@gardentechno.com>)
Responses	Re: Clustered index to preserve data locality in a multitenant application?
List	pgsql-general

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Nicolas Grilly
Sent: Wednesday, August 31, 2016 6:32 PM
To: Kenneth Marshall <ktm@rice.edu>
Cc: Vick Khera <vivek@khera.org>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Clustered index to preserve data locality in a multitenant application?

On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <ktm@rice.edu> wrote:

We have been using the extension pg_repack to keep a table groomed into
cluster order. With an appropriate FILLFACTOR to keep updates on the same
page, it works well. The issue is that it needs space to rebuild the new
index/table. If you have that, it works well.

In DB2, it seems possible to define a "clustering index" that determines how rows are physically ordered in the "table space" (the heap).

The documentation says: "When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values."

It looks like a kind of "continuous CLUSTER/pg_repack". Is there something similar available or planned for PostgreSQL?
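To make the locality argument concrete, here is a minimal sketch that counts how many heap pages hold one tenant's rows, first with rows in interleaved arrival order and then in cluster order. It is a toy model, not PostgreSQL code: `ROWS_PER_PAGE`, `build_heap`, and `pages_touched` are hypothetical names, and the page size and tenant counts are assumptions chosen for illustration.

```python
import random

ROWS_PER_PAGE = 100  # assumption: number of rows that fit on one heap page


def build_heap(n_tenants=50, rows_per_tenant=200, seed=0):
    """Return a list of tenant ids, one per row, in interleaved arrival order."""
    rng = random.Random(seed)
    heap = [t for t in range(n_tenants) for _ in range(rows_per_tenant)]
    rng.shuffle(heap)  # rows from different tenants arrive mixed together
    return heap


def pages_touched(heap, tenant_id, rows_per_page=ROWS_PER_PAGE):
    """Count the distinct pages that hold at least one row of tenant_id."""
    return len({pos // rows_per_page
                for pos, t in enumerate(heap) if t == tenant_id})


unclustered = build_heap()
clustered = sorted(unclustered)  # the physical order a CLUSTER-like rewrite aims for

scattered = pages_touched(unclustered, tenant_id=7)
packed = pages_touched(clustered, tenant_id=7)
print(f"pages read for tenant 7: unclustered={scattered}, clustered={packed}")
```

In this model a tenant's 200 rows fit on 2 pages once the heap is in cluster order, while the interleaved heap spreads them over many more. New inserts gradually undo that order unless the table is regroomed, which is the gap a "continuous CLUSTER" would close.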

I don’t know of any plans to implement clustered indexes in PostgreSQL.

Not sure if this was mentioned, but MS SQL Server has clustered indexes, where the heap row is simply stored at the leaf level of the index.

Oracle also has a similar feature: the IOT (Index Organized Table).

It seems to me (maybe I’m wrong) that a clustered index, with the heap row stored in the index leaf, would be much harder to implement in PostgreSQL because of the way MVCC is implemented: multiple row versions are stored in the table itself. Oracle, by contrast, keeps the table “clean” and stores older row versions in the UNDO tablespace/segment.
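The difference between the two versioning schemes can be sketched with a toy model. This is only an illustration under simplifying assumptions, not how either database is implemented: the classes `InHeapVersions` and `UndoVersions` are hypothetical, and transactions, visibility rules, and cleanup are left out.

```python
class InHeapVersions:
    """Toy model: every UPDATE adds a new row version to the table itself."""

    def __init__(self):
        self.heap = []  # all row versions, live and dead, share the table

    def insert(self, key, value):
        self.heap.append({"key": key, "value": value, "live": True})

    def update(self, key, value):
        for row in self.heap:
            if row["key"] == key and row["live"]:
                row["live"] = False  # old version stays until it is cleaned up
        self.heap.append({"key": key, "value": value, "live": True})


class UndoVersions:
    """Toy model: UPDATE overwrites in place; old versions go to an undo area."""

    def __init__(self):
        self.table = {}  # key -> current value, one slot per key
        self.undo = []   # older versions kept outside the table

    def insert(self, key, value):
        self.table[key] = value

    def update(self, key, value):
        self.undo.append((key, self.table[key]))
        self.table[key] = value


in_heap, with_undo = InHeapVersions(), UndoVersions()
for store in (in_heap, with_undo):
    store.insert(1, "v0")
    for n in range(1, 4):
        store.update(1, f"v{n}")

print(len(in_heap.heap), len(with_undo.table), len(with_undo.undo))
```

After three updates of one key, the in-table model holds four versions of that row, while the undo model holds one current row plus three undo entries. If rows lived in index leaves, the first model would need the leaf for that key to hold every version, which is roughly the difficulty described above.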

Regards,

Igor Neyman
