Home > mailing lists

Re: Clustered index to preserve data locality in a multitenant application? - Mailing list pgsql-general

From	Kenneth Marshall
Subject	Re: Clustered index to preserve data locality in a multitenant application?
Date	August 30, 2016 18:17:13
Msg-id	20160830181705.GY4950@aart.rice.edu Whole thread Raw
In response to	Re: Clustered index to preserve data locality in a multitenant application? (Nicolas Grilly <nicolas@gardentechno.com>)
Responses	Re: Clustered index to preserve data locality in a multitenant application? Re: Clustered index to preserve data locality in a multitenant application? Re: Clustered index to preserve data locality in a multitenant application?
List	pgsql-general

Tree view

On Tue, Aug 30, 2016 at 07:51:58PM +0200, Nicolas Grilly wrote:
> On Tue, Aug 30, 2016 at 7:26 PM, Vick Khera <vivek@khera.org> wrote:
>
> > I'll assume you have an index on the tenant ID. In that case, your
> > queries will be pretty fast.
> >
> > On some instances, we have multi-column indexes starting with the
> > tenant ID, and those are used very effectively as well.
> >
> > I never worry about data locality.
> >
>
> Yes, we have an index starting with the tenant ID, and the query uses the
> index.
>
> But I'm still worried about PostgreSQL having to fetch 10 times more pages
> from the disk than MySQL. If each 8K page contains approximately 10 rows,
> fetching one page in MySQL will return 10 "useful" rows belonging to the
> tenant. By comparison, fetching one page in PostgreSQL will probably return
> only 1 "useful" row belonging to the tenant. In terms of IO, it's a big
> difference.
>
>
> > Depending on your data distribution, you may want to consider table
> > partitions based on the tenant id. I personally never bother with
> > that, but split based on some other key in the data.
> >
>
> You're right. I don't really need a clustered index (like InnoDB). What I
> need is to store rows belonging to the same tenant close from each other.
> Partitioning can help with that. But the lack of declarative partitioning
> makes it cumbersome (I've read this is worked on).

Hi,

We have been using the extension pg_repack to keep a table groomed into
cluster order. With an appropriate FILLFACTOR to keep updates on the same
page, it works well. The issue is that it needs space to rebuild the new
index/table. If you have that, it works well.

Regards,
Ken

pgsql-general by date:

From: Nicolas Grilly
Date: 30 August 2016, 17:52:21
Subject: Re: Clustered index to preserve data locality in a multitenant application?

From: Daniel Caldeweyher
Date: 30 August 2016, 20:16:08
Subject: Do function calls the cached?

Re: Clustered index to preserve data locality in a multitenant application? - Mailing list pgsql-general

Previous

Next