Re: Clustered index to preserve data locality in a multitenant application? - Mailing list pgsql-general

From Igor Neyman
Subject Re: Clustered index to preserve data locality in a multitenant application?
Date
Msg-id MWHPR07MB2877984B8E63B20E024FE031DAE20@MWHPR07MB2877.namprd07.prod.outlook.com
Whole thread Raw
In response to Re: Clustered index to preserve data locality in a multitenant application?  (Nicolas Grilly <nicolas@gardentechno.com>)
Responses Re: Clustered index to preserve data locality in a multitenant application?  (Nicolas Grilly <nicolas@gardentechno.com>)
List pgsql-general

From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Nicolas Grilly
Sent: Wednesday, August 31, 2016 6:32 PM
To: Kenneth Marshall <ktm@rice.edu>
Cc: Vick Khera <vivek@khera.org>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Clustered index to preserve data locality in a multitenant application?

 

On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <ktm@rice.edu> wrote:

We have been using the extension pg_repack to keep a table groomed into
cluster order. With an appropriate FILLFACTOR to keep updates on the same
page, it works well. The issue is that it needs space to rebuild the new
index/table. If you have that, it works well.

 

In DB2, it seems possible to define a "clustering index" that determines how rows are physically ordered in the "table space" (the heap).

 

The documentation says: "When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values."


It looks like a kind of "continuous CLUSTER/pg_repack". Is there something similar available or planned for PostgreSQL?

 

 

Don’t know about plans to implement clustered indexes in PostgreSQL.

 

Not sure if this was mentioned, MS SQL Server has clustered indexes, where heap row is just stored on the leaf level of the index.

Oracle also has similar feature: IOT, Index Organized Table.

 

It seems to me (may be I’m wrong), that in PostgreSQL it should be much harder to implement clustered index (with the heap row stored in the index leaf) because of the way how MVCC implemented: multiple row versions are stored in the table itself (e.g. Oracle for that purpose keeps table “clean” and stores multiple row versions in UNDO tablespace/segment).

 

Regards,

Igor Neyman

 

 

pgsql-general by date:

Previous
From: "Mike Sofen"
Date:
Subject: Re: UPDATE OR REPLACE?
Next
From: rob stone
Date:
Subject: Re: COL unique (CustomerID) plus COL unique (COUNT) inside CustomerID