Thread: Databases Vs. Schemas

Databases Vs. Schemas

From
"Subbiah, Stalin"
Date:
--sorry to repost, just subscribed to the list. hopefully it gets to the
list this time --

Hi All,

We are evaluating the options for having multiple databases vs. schemas on a
single database cluster for a custom grown app that we developed. Each app
installs same set of tables for each service. And the service could easily
be in thousands. so Is it better to have 1000 databases vs 1000 schemas in a
database cluster. What are the performance overhead of having multiple
databases vs. schemas (if any). I'm leaning towards having schemas rather
than databases but i would like to get others opinion on this. Appreciate
your reply.

Thanks,
Stalin

Re: Databases Vs. Schemas

From
Tom Lane
Date:
"Subbiah, Stalin" <SSubbiah@netopia.com> writes:
> Is it better to have 1000 databases vs 1000 schemas in a
> database cluster.

You almost certainly want to go for schemas, at least from a performance
point of view.  The overhead of a schema is small (basically one more
row in pg_namespace) whereas the overhead of a database is not trivial.

The main reason you might not want to use schemas is if you want fairly
airtight separation between different services.  Separate databases
would prevent services from looking at each others' catalog entries.

            regards, tom lane

Re: Databases Vs. Schemas

From
Adam Ruth
Date:
We have a  similarly sized database and we went with schemas.  We did
something different, though, we created one schema that contained all
of the tables (we used the public schema) and then created the hundreds
of schemas with views that access only the related rows for a
particular schema.  Something like this:

create table public.file (siteid int, id int, [fields]);
create schema sc1;
create view sc1.file as select * from public.file where siteid = 1;
create schema sc2;
create view sc2.file as select * from public file where siteid = 2;

And we also created rules to allow update, delete, and insert on those
views so that they looked like tables.  The reason we did this is
because we ran into issues with too many open files during pg_dump when
we had thousands of tables instead of about 1 hundred tables and
thousands of views.

We, however, did have a need to periodically select data from 2 schemas
at a time, and it was simpler logic than if we needed 2 database
connections.

Adam Ruth

On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote:

> --sorry to repost, just subscribed to the list. hopefully it gets to
> the
> list this time --
>
> Hi All,
>
> We are evaluating the options for having multiple databases vs.
> schemas on a
> single database cluster for a custom grown app that we developed. Each
> app
> installs same set of tables for each service. And the service could
> easily
> be in thousands. so Is it better to have 1000 databases vs 1000
> schemas in a
> database cluster. What are the performance overhead of having multiple
> databases vs. schemas (if any). I'm leaning towards having schemas
> rather
> than databases but i would like to get others opinion on this.
> Appreciate
> your reply.
>
> Thanks,
> Stalin
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>


Re: Databases Vs. Schemas

From
"Subbiah, Stalin"
Date:
>And we also created rules to allow update, delete, and insert on those
>views so that they looked like tables.  The reason we did this is
>because we ran into issues with too many open files during pg_dump when
>we had thousands of tables instead of about 1 hundred tables and
>thousands of views.

Is it because you had smaller value set for max. allowable number of open
files descriptor. what was ulimit -a set to ?

>We, however, did have a need to periodically select data from 2 schemas
>at a time, and it was simpler logic than if we needed 2 database
>connections.

Adam Ruth

On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote:

> --sorry to repost, just subscribed to the list. hopefully it gets to
> the
> list this time --
>
> Hi All,
>
> We are evaluating the options for having multiple databases vs.
> schemas on a
> single database cluster for a custom grown app that we developed. Each
> app
> installs same set of tables for each service. And the service could
> easily
> be in thousands. so Is it better to have 1000 databases vs 1000
> schemas in a
> database cluster. What are the performance overhead of having multiple
> databases vs. schemas (if any). I'm leaning towards having schemas
> rather
> than databases but i would like to get others opinion on this.
> Appreciate
> your reply.
>
> Thanks,
> Stalin
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 3: if posting/reading through Usenet, please send an appropriate
>       subscribe-nomail command to majordomo@postgresql.org so that your
>       message can get through to the mailing list cleanly
>

Re: [PERFORM] Databases Vs. Schemas

From
Josh Berkus
Date:
Stalin,

> We are evaluating the options for having multiple databases vs. schemas on a
> single database cluster for a custom grown app that we developed. Each app
> installs same set of tables for each service. And the service could easily
> be in thousands. so Is it better to have 1000 databases vs 1000 schemas in a
> database cluster. What are the performance overhead of having multiple
> databases vs. schemas (if any). I'm leaning towards having schemas rather
> than databases but i would like to get others opinion on this. Appreciate
> your reply.

No performance difference AFAIK.   The real question is whether you have to
have queries joining several "databases".   If yes, use Schema; if no, use
databases.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Databases Vs. Schemas

From
Adam Ruth
Date:
On Mar 23, 2004, at 11:16 AM, Subbiah, Stalin wrote:

>> And we also created rules to allow update, delete, and insert on those
>> views so that they looked like tables.  The reason we did this is
>> because we ran into issues with too many open files during pg_dump
>> when
>> we had thousands of tables instead of about 1 hundred tables and
>> thousands of views.
>
> Is it because you had smaller value set for max. allowable number of
> open
> files descriptor. what was ulimit -a set to ?

It was actually running on OS X and it was a shared memory issue.  We
would have had to recompile the Darwin kernel to get a bigger SHMMAX,
but this solution seemed better since we would possibly be installing
on servers where we wouldn't have that much leeway.  I think that the
view idea works better for a number of other reasons.  For one, I can
do a query on the base table and see all of the rows for all of the
schemas at once, that has proven quite useful.

>
>> We, however, did have a need to periodically select data from 2
>> schemas
>> at a time, and it was simpler logic than if we needed 2 database
>> connections.
>
> Adam Ruth
>
> On Mar 22, 2004, at 2:30 PM, Subbiah, Stalin wrote:
>
>> --sorry to repost, just subscribed to the list. hopefully it gets to
>> the
>> list this time --
>>
>> Hi All,
>>
>> We are evaluating the options for having multiple databases vs.
>> schemas on a
>> single database cluster for a custom grown app that we developed. Each
>> app
>> installs same set of tables for each service. And the service could
>> easily
>> be in thousands. so Is it better to have 1000 databases vs 1000
>> schemas in a
>> database cluster. What are the performance overhead of having multiple
>> databases vs. schemas (if any). I'm leaning towards having schemas
>> rather
>> than databases but i would like to get others opinion on this.
>> Appreciate
>> your reply.
>>
>> Thanks,
>> Stalin
>>
>> ---------------------------(end of
>> broadcast)---------------------------
>> TIP 3: if posting/reading through Usenet, please send an appropriate
>>       subscribe-nomail command to majordomo@postgresql.org so that
>> your
>>       message can get through to the mailing list cleanly
>>
>