Thread: Databases Vs. Schemas
--sorry to repost, just subscribed to the list. hopefully it gets to the list this time -- Hi All, We are evaluating the options for having multiple databases vs. schemas on a single database cluster for a custom grown app that we developed. Each app installs same set of tables for each service. And the service could easily be in thousands. so Is it better to have 1000 databases vs 1000 schemas in a database cluster. What are the performance overhead of having multiple databases vs. schemas (if any). I'm leaning towards having schemas rather than databases but i would like to get others opinion on this. Appreciate your reply. Thanks, Stalin
"Subbiah, Stalin" <SSubbiah@netopia.com> writes: > Is it better to have 1000 databases vs 1000 schemas in a > database cluster. You almost certainly want to go for schemas, at least from a performance point of view. The overhead of a schema is small (basically one more row in pg_namespace) whereas the overhead of a database is not trivial. The main reason you might not want to use schemas is if you want fairly airtight separation between different services. Separate databases would prevent services from looking at each others' catalog entries. regards, tom lane
We have a similarly sized database and we went with schemas. We did something different, though, we created one schema that contained all of the tables (we used the public schema) and then created the hundreds of schemas with views that access only the related rows for a particular schema. Something like this: create table public.file (siteid int, id int, [fields]); create schema sc1; create view sc1.file as select * from public.file where siteid = 1; create schema sc2; create view sc2.file as select * from public file where siteid = 2; And we also created rules to allow update, delete, and insert on those views so that they looked like tables. The reason we did this is because we ran into issues with too many open files during pg_dump when we had thousands of tables instead of about 1 hundred tables and thousands of views. We, however, did have a need to periodically select data from 2 schemas at a time, and it was simpler logic than if we needed 2 database connections. Adam Ruth On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote: > --sorry to repost, just subscribed to the list. hopefully it gets to > the > list this time -- > > Hi All, > > We are evaluating the options for having multiple databases vs. > schemas on a > single database cluster for a custom grown app that we developed. Each > app > installs same set of tables for each service. And the service could > easily > be in thousands. so Is it better to have 1000 databases vs 1000 > schemas in a > database cluster. What are the performance overhead of having multiple > databases vs. schemas (if any). I'm leaning towards having schemas > rather > than databases but i would like to get others opinion on this. > Appreciate > your reply. > > Thanks, > Stalin > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
>And we also created rules to allow update, delete, and insert on those >views so that they looked like tables. The reason we did this is >because we ran into issues with too many open files during pg_dump when >we had thousands of tables instead of about 1 hundred tables and >thousands of views. Is it because you had smaller value set for max. allowable number of open files descriptor. what was ulimit -a set to ? >We, however, did have a need to periodically select data from 2 schemas >at a time, and it was simpler logic than if we needed 2 database >connections. Adam Ruth On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote: > --sorry to repost, just subscribed to the list. hopefully it gets to > the > list this time -- > > Hi All, > > We are evaluating the options for having multiple databases vs. > schemas on a > single database cluster for a custom grown app that we developed. Each > app > installs same set of tables for each service. And the service could > easily > be in thousands. so Is it better to have 1000 databases vs 1000 > schemas in a > database cluster. What are the performance overhead of having multiple > databases vs. schemas (if any). I'm leaning towards having schemas > rather > than databases but i would like to get others opinion on this. > Appreciate > your reply. > > Thanks, > Stalin > > ---------------------------(end of > broadcast)--------------------------- > TIP 3: if posting/reading through Usenet, please send an appropriate > subscribe-nomail command to majordomo@postgresql.org so that your > message can get through to the mailing list cleanly >
Stalin, > We are evaluating the options for having multiple databases vs. schemas on a > single database cluster for a custom grown app that we developed. Each app > installs same set of tables for each service. And the service could easily > be in thousands. so Is it better to have 1000 databases vs 1000 schemas in a > database cluster. What are the performance overhead of having multiple > databases vs. schemas (if any). I'm leaning towards having schemas rather > than databases but i would like to get others opinion on this. Appreciate > your reply. No performance difference AFAIK. The real question is whether you have to have queries joining several "databases". If yes, use Schema; if no, use databases. -- -Josh Berkus Aglio Database Solutions San Francisco
On Mar 23, 2004, at 11:16 AM, Subbiah, Stalin wrote: >> And we also created rules to allow update, delete, and insert on those >> views so that they looked like tables. The reason we did this is >> because we ran into issues with too many open files during pg_dump >> when >> we had thousands of tables instead of about 1 hundred tables and >> thousands of views. > > Is it because you had smaller value set for max. allowable number of > open > files descriptor. what was ulimit -a set to ? It was actually running on OS X and it was a shared memory issue. We would have had to recompile the Darwin kernel to get a bigger SHMMAX, but this solution seemed better since we would possibly be installing on servers where we wouldn't have that much leeway. I think that the view idea works better for a number of other reasons. For one, I can do a query on the base table and see all of the rows for all of the schemas at once, that has proven quite useful. > >> We, however, did have a need to periodically select data from 2 >> schemas >> at a time, and it was simpler logic than if we needed 2 database >> connections. > > Adam Ruth > > On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote: > >> --sorry to repost, just subscribed to the list. hopefully it gets to >> the >> list this time -- >> >> Hi All, >> >> We are evaluating the options for having multiple databases vs. >> schemas on a >> single database cluster for a custom grown app that we developed. Each >> app >> installs same set of tables for each service. And the service could >> easily >> be in thousands. so Is it better to have 1000 databases vs 1000 >> schemas in a >> database cluster. What are the performance overhead of having multiple >> databases vs. schemas (if any). I'm leaning towards having schemas >> rather >> than databases but i would like to get others opinion on this. >> Appreciate >> your reply. >> >> Thanks, >> Stalin >> >> ---------------------------(end of >> broadcast)--------------------------- >> TIP 3: if posting/reading through Usenet, please send an appropriate >> subscribe-nomail command to majordomo@postgresql.org so that >> your >> message can get through to the mailing list cleanly >> >