Re: Multi tenancy : schema vs databases - Mailing list pgsql-general

From Colin Morelli
Subject Re: Multi tenancy : schema vs databases
Date
Msg-id CAPtU-Uqn8LGLXQnKaFXAZ-27MZZScUi20a+WoyQaZUBtJyvxBw@mail.gmail.com
Whole thread Raw
In response to Re: Multi tenancy : schema vs databases  (Vick Khera <vivek@khera.org>)
List pgsql-general
Rakesh,

As long as one application knows how to connect to more than 1 tenant, there will always be the possibility that a software bug in your application causes one tenant to access another tenant's data. I think this is why you're getting people asking you to refine your requirements. There quite simply is no way to "guarantee" one tenant can't see another tenant when they are sharing resources.

As Vick mentioned, whether you do this by database, schema, or a column in a table, your application has to decide which tenant to connect to and when. You could make the argument that using a column such as customer_id increases the surface area of the potential failures, but generally speaking - regardless of the solution you go with - your best investment would be in a testing infrastructure for your application that ensures data remains tenant-isolated.

I know none of this answers your most recent question, about the technical tradeoffs between schema per tenant or database per tenant, but I think it's still relevant to your original question. Both of those solutions will be considerably harder to manage than a shared-everything infrastructure. They'll certainly both require more resources, and they introduce the problem for other issues (inconsistent schema and/or table definitions across multiple tenants being one of the most problematic and difficult to prevent).

Best,
Colin

Worth noting: in the shared everything infrastructure, it's far more likely that a bug in your software results in one tenant seeing all data across all tenants, as opposed to data for one wrong tenant. The good news(?) here is that these kinds of bugs are generally very easy to spot before they ever make it out to customers. In any case, the answer is almost certainly going to be testing.

On Fri, Sep 30, 2016 at 10:34 AM Vick Khera <vivek@khera.org> wrote:
On Fri, Sep 30, 2016 at 5:11 AM, John R Pierce <pierce@hogranch.com> wrote:
> On 9/30/2016 2:06 AM, Rakesh Kumar wrote:
>>
>> We require complete data isolation. Absolutely nothing should be shared
>> between two tenants.
>>
>> WHy would multiple dbs be any worse than multiple schemas in performance?
>
>
> complete?  use 1000s of seperate VM instances, one per tennant.
>

Well, VM's don't always provide 100% isolation, so separate hardware,
with each on its own VLAN seems right to me. And then make sure your
switch doesn't leak across VLANs.

Your requirements need refinement, at the least :)


--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

pgsql-general by date:

Previous
From: Vick Khera
Date:
Subject: Re: Multi tenancy : schema vs databases
Next
From: Paul Jungwirth
Date:
Subject: Re: Multi tenancy : schema vs databases