Re: Statistics and Multi-Column indexes - Mailing list pgsql-performance

From Samuel Gendler
Subject Re: Statistics and Multi-Column indexes
Date
Msg-id CAEV0TzDMkCXGDwPMm8nrK+cXL-fTCe56HvfJqe=BLtFOSfGbJQ@mail.gmail.com
In response to Statistics and Multi-Column indexes  (lars <lhofhansl@yahoo.com>)
Responses Re: Statistics and Multi-Column indexes  (lars <lhofhansl@yahoo.com>)
List pgsql-performance
On Sun, Jul 10, 2011 at 2:16 PM, lars <lhofhansl@yahoo.com> wrote:
I know this has been discussed various times...

We are maintaining a large multi-tenant database where *all* tables have a tenant-id and all indexes and PKs lead with the tenant-id.
Statistics and counts for all the other columns are only really meaningful within the context of the tenant they belong to.

There appear to be five options for me:
1. Use single-column indexes on all interesting columns and rely on PostgreSQL's bitmap index scans to combine them (which are pretty cool).
2. Use multi-column indexes and accept that sometimes Postgres picks the wrong index (because a non-tenant-id
column might seem highly selective over the whole table, but it is not for a particular tenant - or vice versa).
3. Use a functional index that combines multiple columns and only query via that expression, which causes statistics
to be gathered for the expression.
I.e. create index i on t((tenantid||column1)) and SELECT ... FROM t WHERE tenantid||column1 = '...'
4. Play with n_distinct and/or set the statistics for the inner columns to some fixed values that lead to the plans that we want.
5. Have a completely different schema and maybe a database per tenant.
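
For reference, options 3 and 4 might look roughly like the sketch below. The names t, i, tenantid and column1 come from the example above; the column types, the n_distinct value and the statistics target are made-up assumptions, not recommendations.

```sql
-- Option 3: expression index. Assumes tenantid and column1 are both text,
-- so that the concatenation is immutable and can be indexed.
CREATE INDEX i ON t ((tenantid || column1));
ANALYZE t;  -- ANALYZE also gathers statistics for the indexed expression

-- Option 4: override the planner's distinct-value estimate for a column.
-- 500 is a placeholder; a negative value between -1 and 0 would instead
-- be read as a fraction of the row count.
ALTER TABLE t ALTER COLUMN column1 SET (n_distinct = 500);

-- Optionally raise the per-column statistics target (placeholder value).
ALTER TABLE t ALTER COLUMN column1 SET STATISTICS 1000;

ANALYZE t;  -- the overrides only take effect at the next ANALYZE
```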

 
What about partitioning tables by tenant id and then maintaining indexes on each partition that do not include the tenant id? Constraint exclusion should handle filtering by tenant id for you. That seems like a potentially more tolerable variant of #5. How many tenants are we talking about? I gather partitioning starts to become problematic when the number of partitions gets large.
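
A minimal sketch of what I have in mind, using inheritance-based partitioning. The table and column names t, tenantid and column1 are taken from your example; the column types, the child table names and the index names are assumptions for illustration only.

```sql
-- Parent table; column types are assumed.
CREATE TABLE t (
    tenantid integer NOT NULL,
    column1  text
);

-- One child table per tenant. The CHECK constraint is what lets the
-- planner exclude children that cannot match a given tenantid.
CREATE TABLE t_tenant_1 (CHECK (tenantid = 1)) INHERITS (t);
CREATE TABLE t_tenant_2 (CHECK (tenantid = 2)) INHERITS (t);

-- Indexes on each child no longer need to lead with tenantid.
CREATE INDEX t_tenant_1_column1_idx ON t_tenant_1 (column1);
CREATE INDEX t_tenant_2_column1_idx ON t_tenant_2 (column1);

-- 'partition' has been the default setting since 8.4.
SET constraint_exclusion = partition;

-- Statistics are gathered per child table, i.e. per tenant.
ANALYZE t_tenant_1;
ANALYZE t_tenant_2;

-- The plan should touch only the parent and t_tenant_1.
EXPLAIN SELECT * FROM t WHERE tenantid = 1 AND column1 = 'some value';
```

Rows still have to be routed into the right child, either by inserting into the child tables directly or with a trigger on the parent; that part is left out here.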

