Thread: pg_attribute, pg_class, pg_depend grow huge in count and size withmultiple tenants.
pg_attribute, pg_class, pg_depend grow huge in count and size withmultiple tenants.
From
samhitha g
Date:
Hi experts,
Our application serves multiple tenants. Each tenant has the schema with a few hundreds of tables and few functions.
We have 2000 clients so we have to create 2000 schemas in a single database.
While doing this, i observed that the catalog tables pg_attribute, pg_class, pg_depend grow huge in count and size.
Do you think this will be a challenge during execution of every query ?
When Postgres parses an sql to find the best execution plan, does it scan any of these catalogs that could eventually take more time?
Any other challenges you have come across or foresee in such cases ?
Thanks,
Sammy.
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
"David G. Johnston"
Date:
On Thu, May 7, 2020 at 1:05 PM samhitha g <samhithagarudadri@gmail.com> wrote:
Our application serves multiple tenants. Each tenant has the schema with a few hundreds of tables and few functions.We have 2000 clients so we have to create 2000 schemas in a single database.
That is one option but I wouldn't say you must. If you cannot get individual tables to be multi-tenant you are probably better off having one database per client on a shared cluster - at least given the size of the schema and number of clients.
David J.
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Avinash Kumar
Date:
Hi,
On Thu, May 7, 2020 at 5:18 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thu, May 7, 2020 at 1:05 PM samhitha g <samhithagarudadri@gmail.com> wrote:Our application serves multiple tenants. Each tenant has the schema with a few hundreds of tables and few functions.We have 2000 clients so we have to create 2000 schemas in a single database.That is one option but I wouldn't say you must. If you cannot get individual tables to be multi-tenant you are probably better off having one database per client on a shared cluster - at least given the size of the schema and number of clients.
I am working on a similar problem.
1 database per each client may be a killer when you have a connection pooler that creates a pool for a unique combination of (user,database).
David J.
Regards,
Avinash Vallarapu
Avinash Vallarapu
+1-902-221-5976
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Rory Campbell-Lange
Date:
On 07/05/20, Avinash Kumar (avinash.vallarapu@gmail.com) wrote: > >> Our application serves multiple tenants. Each tenant has the schema > >> with a few hundreds of tables and few functions. > >> We have 2000 clients so we have to create 2000 schemas in a single > >> database. > > That is one option but I wouldn't say you must. If you cannot get > > individual tables to be multi-tenant you are probably better off having one > > database per client on a shared cluster - at least given the size of the > > schema and number of clients. > > > I am working on a similar problem. > 1 database per each client may be a killer when you have a connection > pooler that creates a pool for a unique combination of (user,database). One of our clusters has well over 500 databases fronted by pg_bouncer. We get excellent connection "flattening" using pg_bouncer with per-database connection spikes dealt with through a reserve pool. The nice thing about separate databases is that it is easy to scale horizontally. Rory
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Avinash Kumar
Date:
Hi,
On Thu, May 7, 2020 at 6:08 PM Rory Campbell-Lange <rory@campbell-lange.net> wrote:
On 07/05/20, Avinash Kumar (avinash.vallarapu@gmail.com) wrote:
> >> Our application serves multiple tenants. Each tenant has the schema
> >> with a few hundreds of tables and few functions.
> >> We have 2000 clients so we have to create 2000 schemas in a single
> >> database.
> > That is one option but I wouldn't say you must. If you cannot get
> > individual tables to be multi-tenant you are probably better off having one
> > database per client on a shared cluster - at least given the size of the
> > schema and number of clients.
> >
> I am working on a similar problem.
> 1 database per each client may be a killer when you have a connection
> pooler that creates a pool for a unique combination of (user,database).
One of our clusters has well over 500 databases fronted by pg_bouncer.
We get excellent connection "flattening" using pg_bouncer with
per-database connection spikes dealt with through a reserve pool.
What if you see at least 4 connections being established by each client during peak ? And if you serve 4 or 2 connections per each DB, then you are creating 1000 or more reserved connections with 500 DBs in a cluster.
The nice thing about separate databases is that it is easy to scale
horizontally.
Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500 clusters means you may have to have a lot of manual vacuuming in place as well.
Rory
Regards,
Avinash Vallarapu
Avinash Vallarapu
+1-902-221-5976
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Laurenz Albe
Date:
On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote: > > The nice thing about separate databases is that it is easy to scale > > horizontally. > > Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500 clusters > means you may have to have a lot of manual vacuuming in place as well. Just set "autovacuum_max_workers" higher. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Avinash Kumar
Date:
Hi,
On Fri, May 8, 2020 at 3:31 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Thu, 2020-05-07 at 18:17 -0300, Avinash Kumar wrote:
> > The nice thing about separate databases is that it is easy to scale
> > horizontally.
>
> Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500 clusters
> means you may have to have a lot of manual vacuuming in place as well.
Just set "autovacuum_max_workers" higher.
No, that wouldn't help. If you just increase autovacuum_max_workers, the total cost limit of autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many workers and it further delays autovacuum per each worker. Instead you need to increase autovacuum_vacuum_cost_limit as well when you increase the number of workers. But, if you do that and also increase workers, well, you would easily reach the limitations of the disk. I am not sure it is anywhere advised to have 20 autovacuum_max_workers unless i have a disk with lots of IOPS and with very tiny tables across all the databases.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Regards,
Avinash Vallarapu
Avinash Vallarapu
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Laurenz Albe
Date:
On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote: > > Just set "autovacuum_max_workers" higher. > > No, that wouldn't help. If you just increase autovacuum_max_workers, the total cost limit of > autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many workers and it > further delays autovacuum per each worker. Instead you need to increase autovacuum_vacuum_cost_limit > as well when you increase the number of workers. True, I should have mentioned that. > But, if you do that and also increase workers, well, you would easily reach the limitations > of the disk. I am not sure it is anywhere advised to have 20 autovacuum_max_workers unless > i have a disk with lots of IOPS and with very tiny tables across all the databases. Sure, if you have a high database load, you will at some point exceed the limits of the machine, which is not surprising. What I am trying to say is that you have to ramp up the resources for autovacuum together with increasing the overall workload. You should consider autovacuum as part of that workload. If your machine cannot cope with the workload any more, you have to scale, which is easily done by adding more machines if you have many databases. Yours, Laurenz Albe -- Cybertec | https://www.cybertec-postgresql.com
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Avinash Kumar
Date:
Hi,
On Fri, May 8, 2020 at 3:53 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:
On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > Just set "autovacuum_max_workers" higher.
>
> No, that wouldn't help. If you just increase autovacuum_max_workers, the total cost limit of
> autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many workers and it
> further delays autovacuum per each worker. Instead you need to increase autovacuum_vacuum_cost_limit
> as well when you increase the number of workers.
True, I should have mentioned that.
> But, if you do that and also increase workers, well, you would easily reach the limitations
> of the disk. I am not sure it is anywhere advised to have 20 autovacuum_max_workers unless
> i have a disk with lots of IOPS and with very tiny tables across all the databases.
Sure, if you have a high database load, you will at some point exceed the limits of
the machine, which is not surprising. What I am trying to say is that you have to ramp
up the resources for autovacuum together with increasing the overall workload.
You should consider autovacuum as part of that workload.
If your machine cannot cope with the workload any more, you have to scale, which
is easily done by adding more machines if you have many databases.
Agreed. Getting back to the original question asked by Sammy, i think it is still bad to create 2000 databases for storing 2000 clients/(schemas) for a multi-tenant setup.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com
Regards,
Avinash Vallarapu
Avinash Vallarapu
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
MichaelDBA
Date:
Hi all,
Since we are talking about multi-tenant databases, the citus extension fits in neatly with that using horizontal partitioning/shards.
Regards,
Michael Vitale
Avinash Kumar wrote on 5/8/2020 6:14 AM:
Since we are talking about multi-tenant databases, the citus extension fits in neatly with that using horizontal partitioning/shards.
Regards,
Michael Vitale
Avinash Kumar wrote on 5/8/2020 6:14 AM:
Hi,On Fri, May 8, 2020 at 3:53 AM Laurenz Albe <laurenz.albe@cybertec.at> wrote:On Fri, 2020-05-08 at 03:47 -0300, Avinash Kumar wrote:
> > Just set "autovacuum_max_workers" higher.
>
> No, that wouldn't help. If you just increase autovacuum_max_workers, the total cost limit of
> autovacuum_vacuum_cost_limit (or vacuum_cost_limit) is shared by so many workers and it
> further delays autovacuum per each worker. Instead you need to increase autovacuum_vacuum_cost_limit
> as well when you increase the number of workers.
True, I should have mentioned that.
> But, if you do that and also increase workers, well, you would easily reach the limitations
> of the disk. I am not sure it is anywhere advised to have 20 autovacuum_max_workers unless
> i have a disk with lots of IOPS and with very tiny tables across all the databases.
Sure, if you have a high database load, you will at some point exceed the limits of
the machine, which is not surprising. What I am trying to say is that you have to ramp
up the resources for autovacuum together with increasing the overall workload.
You should consider autovacuum as part of that workload.
If your machine cannot cope with the workload any more, you have to scale, which
is easily done by adding more machines if you have many databases.Agreed. Getting back to the original question asked by Sammy, i think it is still bad to create 2000 databases for storing 2000 clients/(schemas) for a multi-tenant setup.
Yours,
Laurenz Albe
--
Cybertec | https://www.cybertec-postgresql.com--Regards,
Avinash Vallarapu
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Jeff Janes
Date:
On Thu, May 7, 2020 at 4:05 PM samhitha g <samhithagarudadri@gmail.com> wrote:
Hi experts,Our application serves multiple tenants. Each tenant has the schema with a few hundreds of tables and few functions.We have 2000 clients so we have to create 2000 schemas in a single database.While doing this, i observed that the catalog tables pg_attribute, pg_class, pg_depend grow huge in count and size.
Please attach numbers to "huge". We don't know what "huge" means to you.
"2000 * a few hundred" tables is certainly getting to the point where it makes sense to be concerned. But my concern would be more about backup and recovery, version upgrades, pg_dump, etc. not about daily operations.
Cheers,
Jeff
Re: pg_attribute, pg_class, pg_depend grow huge in count and sizewith multiple tenants.
From
Jeff Janes
Date:
On Thu, May 7, 2020 at 5:17 PM Avinash Kumar <avinash.vallarapu@gmail.com> wrote:
Hi,On Thu, May 7, 2020 at 6:08 PM Rory Campbell-Lange <rory@campbell-lange.net> wrote:One of our clusters has well over 500 databases fronted by pg_bouncer.
We get excellent connection "flattening" using pg_bouncer with
per-database connection spikes dealt with through a reserve pool.What if you see at least 4 connections being established by each client during peak ? And if you serve 4 or 2 connections per each DB, then you are creating 1000 or more reserved connections with 500 DBs in a cluster.
Does every database spike at the same time?
The nice thing about separate databases is that it is easy to scale
horizontally.Agreed. But, how about autovacuum ? Workers shift from DB to DB and 500 clusters means you may have to have a lot of manual vacuuming in place as well.
Why would having difference schemas in different DBs change your manual vacuuming needs? And if anything, having separate DBs will make autovacuuming more efficient, as it keeps the statistics collectors stats files smaller.
Cheers,
Jeff