Thread: hundreds of schema vs hundreds of databases

hundreds of schema vs hundreds of databases

From
olivier
Date:
Hi group,

I have an application with some hundreds users, each one having the same
data definitions, and each one storing up to 2 GB of data.
A user have just access to his own data. His data will have its own
tablespace.

Therefore, it seems to me I have a choice between "one database per
user" and "one schema per user in the same database".

What is the best practice here ? Which solution will be the easiest to
manage ?


Cheers,

   Olivier





Re: hundreds of schema vs hundreds of databases

From
"Albe Laurenz"
Date:
> I have an application with some hundreds users, each one
> having the same
> data definitions, and each one storing up to 2 GB of data.
> A user have just access to his own data. His data will have its own
> tablespace.
>
> Therefore, it seems to me I have a choice between "one database per
> user" and "one schema per user in the same database".
>
> What is the best practice here ? Which solution will be the
> easiest to manage ?

Advantages of many databases:
- Each database is smaller.
- No danger of one user accessing another user's data (because of
  misconfigured permissions and similar).
- Guaranteed independence of each user's data.
- More scalable: If you decide that one machine or one cluster
  is not enough to handle the load, you can easily transfer some
  of the databases somewhere else.

Advantages of one database with many schemata:
- Fewer databases to administrate.

I'd probably go for many databases.

Yours,
Laurenz Albe

Re: hundreds of schema vs hundreds of databases

From
"Merlin Moncure"
Date:
On 5/29/07, Albe Laurenz <all@adv.magwien.gv.at> wrote:
> > I have an application with some hundreds users, each one
> > having the same
> > data definitions, and each one storing up to 2 GB of data.
> > A user have just access to his own data. His data will have its own
> > tablespace.
> >
> > Therefore, it seems to me I have a choice between "one database per
> > user" and "one schema per user in the same database".
> >
> > What is the best practice here ? Which solution will be the
> > easiest to manage ?
>
> Advantages of many databases:
> - Each database is smaller.
> - No danger of one user accessing another user's data (because of
>   misconfigured permissions and similar).
> - Guaranteed independence of each user's data.
> - More scalable: If you decide that one machine or one cluster
>   is not enough to handle the load, you can easily transfer some
>   of the databases somewhere else.
>
> Advantages of one database with many schemata:
> - Fewer databases to administrate.
>
> I'd probably go for many databases.

you missed one possible advantage of schemas...database structures can
be more easily shared.   For example, you can join one of the user's
private tables with a shared central table.  With multiple databases,
you have to resort to other strategies to do that, for example dblink.

Schemas are designed to the effect of giving a private data area in a
large shared database.  Separate databases would be preferred if the
databases are backing difrferent applications and completely
unrelated.

merlin

Re: hundreds of schema vs hundreds of databases

From
Ron Johnson
Date:
On 05/29/07 04:02, Albe Laurenz wrote:
>> I have an application with some hundreds users, each one
>> having the same
>> data definitions, and each one storing up to 2 GB of data.
>> A user have just access to his own data. His data will have its own
>> tablespace.
>>
>> Therefore, it seems to me I have a choice between "one database per
>> user" and "one schema per user in the same database".
>>
>> What is the best practice here ? Which solution will be the
>> easiest to manage ?
>
> Advantages of many databases:
> - Each database is smaller.
> - No danger of one user accessing another user's data (because of
>   misconfigured permissions and similar).
> - Guaranteed independence of each user's data.
> - More scalable: If you decide that one machine or one cluster

You could always dump a schema then drop it and restore it in a new
database.  At 2GB, that should be quick.

>   is not enough to handle the load, you can easily transfer some
>   of the databases somewhere else.
 >
> Advantages of one database with many schemata:
> - Fewer databases to administrate.

But since they all have to have the same schema, you'd still have
the same DDL overhead whether it's one DB or many.

Does PG set up buffers at the postmaster level or the database level?

If at the database level, then you'll be allocating memory to
databases that might not be in use at any one time, thus wasting it.
  One database buffer pool would make more efficient use of RAM.

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: hundreds of schema vs hundreds of databases

From
Guy Rouillier
Date:
Albe Laurenz wrote:
>
> Advantages of many databases:
> - Each database is smaller.
> - No danger of one user accessing another user's data (because of
>   misconfigured permissions and similar).
> - Guaranteed independence of each user's data.
> - More scalable: If you decide that one machine or one cluster
>   is not enough to handle the load, you can easily transfer some
>   of the databases somewhere else.
>
> Advantages of one database with many schemata:
> - Fewer databases to administrate.

Using different databases for each user incurs the full overhead of
creating and maintaining a database: all the system tables and all the
memory required to keep a database open.  If the OP is allowing direct
SQL access to each user, then the risks you identify above must be
addressed, but tbey can fairly simply by using scripts to create each
new user.  I'd opt for using schemas unless there is a compelling
evidence that different databases are required.

--
Guy Rouillier

Re: hundreds of schema vs hundreds of databases

From
"Albe Laurenz"
Date:
Ron Johnson wrote:
> Does PG set up buffers at the postmaster level or the database level?
>
> If at the database level, then you'll be allocating memory to
> databases that might not be in use at any one time, thus wasting it.
>   One database buffer pool would make more efficient use of RAM.

Shared memory is allocated at the cluster level.
See
http://www.postgresql.org/docs/current/static/runtime-config-resource.ht
ml#RUNTIME-CONFIG-RESOURCE-MEMORY

Yours,
Laurenz Albe

Re: hundreds of schema vs hundreds of databases

From
Ron Johnson
Date:
On 05/30/07 01:38, Albe Laurenz wrote:
> Ron Johnson wrote:
>> Does PG set up buffers at the postmaster level or the database level?
>>
>> If at the database level, then you'll be allocating memory to
>> databases that might not be in use at any one time, thus wasting it.
>>   One database buffer pool would make more efficient use of RAM.
>
> Shared memory is allocated at the cluster level.
> See
> http://www.postgresql.org/docs/current/static/runtime-config-resource.ht
> ml#RUNTIME-CONFIG-RESOURCE-MEMORY

I read that page, but don't see any references to "cluster level".
Maybe I am misinterpreting "cluster"?

--
Ron Johnson, Jr.
Jefferson LA  USA

Give a man a fish, and he eats for a day.
Hit him with a fish, and he goes away for good!


Re: hundreds of schema vs hundreds of databases

From
"Merlin Moncure"
Date:
On 5/30/07, Ron Johnson <ron.l.johnson@cox.net> wrote:
> On 05/30/07 01:38, Albe Laurenz wrote:
> > Ron Johnson wrote:
> >> Does PG set up buffers at the postmaster level or the database level?
> >>
> >> If at the database level, then you'll be allocating memory to
> >> databases that might not be in use at any one time, thus wasting it.
> >>   One database buffer pool would make more efficient use of RAM.
> >
> > Shared memory is allocated at the cluster level.
> > See
> > http://www.postgresql.org/docs/current/static/runtime-config-resource.ht
> > ml#RUNTIME-CONFIG-RESOURCE-MEMORY
>
> I read that page, but don't see any references to "cluster level".
> Maybe I am misinterpreting "cluster"?

Meaning the database cluster:
http://www.postgresql.org/docs/8.2/static/creating-cluster.html.

I can understand your confusion though.

merlin