On Tue, Aug 30, 2016 at 7:10 AM, Nicolas Grilly
<nicolas@vocationcity.com> wrote:
> Let's say we have a table containing data for 10,000 tenants and 10,000 rows
> per tenant, for a total of 100,000,000 rows. Let's say each 8 KB block
> contains ~10 rows. Let's say we want to compute the sum of an integer column
> for all rows belonging to a given tenant ID.
I'll assume you have an index on the tenant ID. In that case, the query
only has to visit that tenant's rows instead of scanning the whole
table, so it will be pretty fast.
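
To put rough numbers on the example in the question, here is a
back-of-the-envelope sketch. The function and constant names are made up
for illustration. It assumes an index lookup reads only the blocks that
hold the tenant's rows, and it ignores index pages and caching entirely.

```python
import math

# Figures taken from the question; names are illustrative only.
BLOCK_SIZE_KB = 8
ROWS_PER_BLOCK = 10
TENANTS = 10_000
ROWS_PER_TENANT = 10_000


def total_blocks(tenants, rows_per_tenant, rows_per_block):
    """Number of table blocks needed to hold every row."""
    return math.ceil(tenants * rows_per_tenant / rows_per_block)


def blocks_to_read(rows, rows_per_block, clustered):
    """Table blocks touched when fetching one tenant's rows via an index.

    clustered=True  -> the tenant's rows are packed together (best case).
    clustered=False -> every row sits in a different block (worst case).
    """
    if clustered:
        return math.ceil(rows / rows_per_block)
    return rows


if __name__ == "__main__":
    table = total_blocks(TENANTS, ROWS_PER_TENANT, ROWS_PER_BLOCK)
    best = blocks_to_read(ROWS_PER_TENANT, ROWS_PER_BLOCK, clustered=True)
    worst = blocks_to_read(ROWS_PER_TENANT, ROWS_PER_BLOCK, clustered=False)
    print("whole table:", table, "blocks")
    print("best case:  ", best, "blocks,", best * BLOCK_SIZE_KB, "KB")
    print("worst case: ", worst, "blocks,", worst * BLOCK_SIZE_KB, "KB")
```

Even in the fully scattered case, the lookup touches a small fraction of
the table's blocks.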
On some instances, we have multi-column indexes starting with the
tenant ID, and those are used very effectively as well.
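
As a small illustration of a multi-column index that starts with the
tenant ID, here is a minimal sketch. It uses Python's built-in sqlite3
module purely so the example is self-contained; SQLite is a stand-in
here, and its planner will not behave identically to your database.
The table, column, and index names are hypothetical.

```python
import sqlite3


def build_demo(tenants=50, rows_per_tenant=100):
    """Create an in-memory table with tenant rows interleaved on purpose."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items ("
        " id INTEGER PRIMARY KEY,"
        " tenant_id INTEGER NOT NULL,"
        " amount INTEGER NOT NULL)"
    )
    # Insert round-robin across tenants so one tenant's rows are scattered.
    rows = [(t, r) for r in range(rows_per_tenant) for t in range(tenants)]
    conn.executemany(
        "INSERT INTO items (tenant_id, amount) VALUES (?, ?)", rows
    )
    # Multi-column index with the tenant ID as the leading column.
    conn.execute(
        "CREATE INDEX idx_items_tenant_amount ON items (tenant_id, amount)"
    )
    conn.commit()
    return conn


def tenant_sum(conn, tenant_id):
    """Sum the integer column for a single tenant."""
    cur = conn.execute(
        "SELECT SUM(amount) FROM items WHERE tenant_id = ?", (tenant_id,)
    )
    return cur.fetchone()[0]


def query_plan(conn, tenant_id):
    """Return SQLite's plan description for the per-tenant sum."""
    cur = conn.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT SUM(amount) FROM items WHERE tenant_id = ?",
        (tenant_id,),
    )
    # The last column of each plan row is the human-readable detail.
    return " ".join(str(row[-1]) for row in cur.fetchall())


if __name__ == "__main__":
    conn = build_demo()
    print(tenant_sum(conn, 7))
    print(query_plan(conn, 7))
```

Checking the plan output is the useful habit here: it tells you whether
the index with the leading tenant column is actually being picked.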
I never worry about data locality.
Depending on your data distribution, you may want to consider table
partitions based on the tenant ID. I personally never bother
partitioning by tenant; when I do partition, I split on some other key
in the data.