Thread: Database Scalability

Database Scalability

From
Saurav Sarkar
Date:
Hi All,

We are using Amazon RDS PostgreSQL.
We have some multi-tenant solutions which are separating the tenant data in Postgresql mainly in the following manner. 

1. Using different schemas
2. Using different tables for different tenants.

I was just wondering what would be the best way to scale in case no. of schemas or no. of tables limit increase and probably cross the limits of PostgreSQL. 

I understand that no. of schemas are unlimited in DB as per

https://www.postgresql.org/message-id/24478.1109618520%40sss.pgh.pa.us

and some limits on tables are mentioned here

https://www.postgresql.org/docs/12/limits.html

In case of table separated tenancy some tables might have more data than other tables. In the case of the table itself growing in size I can look for partitioning the data.

In the case of limits being reached in terms of tables and schemas (if any ) I understand that database sharding or horizontal scaling over multiple database instances could be one solution.  It will also help in having  more database connections/resources for an ever increasing number of tenants . Obviously with the headache of defining a balanced sharding strategy and maintaining it.

Thanks and Best Regards,

Saurav




Re: Database Scalability

From
Ben Chobot
Date:
Saurav Sarkar wrote on 11/29/21 10:13 PM:
Hi All,

We have some multi-tenant solutions which are separating the tenant data in Postgresql mainly in the following manner. 

1. Using different schemas
2. Using different tables for different tenants.

Without more details it's impossible to give you a detailed answer, so, in general.... if you are breaking out your client data by schema, you will likely be fine. We have used this method with great success to scale our customer workload - each customer gets their own schema, which we can then rebalance between databases as those clients drive more load over time. For those tables that we want partitioned, we simply partition them in every schema. That's arguably inefficient but we find the schema consistency to be an overall win.

(For context, when I say we have used this method with great success, we have over 13k customers, almost a PB of data, peak around 1.5M iops, and it's all painless to operate.)

Re: Database Scalability

From
Saurav Sarkar
Date:
Hi Ben,

Thanks a lot for your reply.

So are all the schemas on one DB or are distributed/sharded across multiple DBs ?

Best Regards,
Saurav

On Tue, Nov 30, 2021 at 11:43 PM Ben Chobot <bench@silentmedia.com> wrote:
Saurav Sarkar wrote on 11/29/21 10:13 PM:
Hi All,

We have some multi-tenant solutions which are separating the tenant data in Postgresql mainly in the following manner. 

1. Using different schemas
2. Using different tables for different tenants.

Without more details it's impossible to give you a detailed answer, so, in general.... if you are breaking out your client data by schema, you will likely be fine. We have used this method with great success to scale our customer workload - each customer gets their own schema, which we can then rebalance between databases as those clients drive more load over time. For those tables that we want partitioned, we simply partition them in every schema. That's arguably inefficient but we find the schema consistency to be an overall win.

(For context, when I say we have used this method with great success, we have over 13k customers, almost a PB of data, peak around 1.5M iops, and it's all painless to operate.)

Re: Database Scalability

From
Mladen Gogala
Date:
On 11/30/21 22:08, Saurav Sarkar wrote:
> Hi Ben,
>
> Thanks a lot for your reply.
>
> So are all the schemas on one DB or are distributed/sharded across 
> multiple DBs ?
>
> Best Regards,
> Saurav

To my knowledge PostgreSQL doesn't support sharding, which is well and 
good because sharding is mostly useless, at least in my opinion.

-- 
Mladen Gogala
Database Consultant
Tel: (347) 321-1217
https://dbwhisperer.wordpress.com




Re: Database Scalability

From
Michael Stephenson
Date:
Store a connection string for each tenant or the metadata to build one on demand.  Then each tenant is in its own
schemaon whatever database instance the connection string points at.  Then it doesn’t really matter how you spread your
tenantsacross one database or many; just do whatever works best and adjust as needed. 

~ Mike



Re: Database Scalability

From
Ben Chobot
Date:
Saurav Sarkar wrote on 11/30/21 7:08 PM:
> So are all the schemas on one DB or are distributed/sharded across 
> multiple DBs ?
In our use case, every db entirely homes one or more schemas. Some dbs 
host many schemas for small customers, some dbs host a handful of 
schemas for medium customers, and some dbs are dedicated to a single 
large customer. If the customer is very large, we get a bigger db than 
normal for them. So theoretically we still have a problem of vertical 
scale but in practice it's rarely an issue and when it is, we can 
usually split that single customer into multiple, at the application layer.



Re: Database Scalability

From
Ben Chobot
Date:
Mladen Gogala wrote on 11/30/21 7:52 PM:
>
> To my knowledge PostgreSQL doesn't support sharding, which is well and 
> good because sharding is mostly useless, at least in my opinion.
>
OK I'll bite.

Not only does PostgreSQL natively support table partitioning (which is 
absolutely a form of sharding), there multiple well-regarded extensions 
that can help with sharding, all of which are orthogonal to how you can 
configure your application to use Postgres in the first place. So to say 
Postgres doesn't support sharding is.... misleading, at best.

Also, the general concept of sharding to move your scaling challenges 
from vertical ones to horizontal ones has multiple self-evident 
advantages. If your work history has all happened to fit on a single 
server, then bully for you, but not everybody has it so easy.



Re: Database Scalability

From
SQL Padawan
Date:

> > To my knowledge PostgreSQL doesn't support sharding, which is well and
> >
> > good because sharding is mostly useless, at least in my opinion.


> Not only does PostgreSQL natively support table partitioning (which is
>
> absolutely a form of sharding), there multiple well-regarded extensions
>
> that can help with sharding, all of which are orthogonal to how you can
>
> configure your application to use Postgres in the first place. So to say
>
> Postgres doesn't support sharding is.... misleading, at best.
>
> Also, the general concept of sharding to move your scaling challenges
>
> from vertical ones to horizontal ones has multiple self-evident
>
> advantages. If your work history has all happened to fit on a single
>
> server, then bully for you, but not everybody has it so easy.

It supports partitioning out of the box - not sharding where different tables reside on different machines!

CitusData and TimescaleDB provide sharding as extensions - both of which appear useful for TimeSeries data. There was
PostgresXLwhich was a general sharding (multi-machine) solution that appears to have died. 

SQLP!