Thread: Schema per user?

Schema per user?

From
Anders Steinlein
Date:
Hi,

I'm pondering a design question for a subscription-based web-app we
are developing. Would it be feasible to create a new schema per user
account, setting the search_path to their own schema during login?
There is no shared data (except where we keep a record of users), as
each account is entirely self-contained. I would usually just put
username into relevant tables and querying based on this, but I
figured separate schemas *might* have a few advantages: Ease scaling
by placing users across different tablespaces and/or databases, easier
backup/restore of specific user's data and no vacuum when we delete
accounts (just drop schema vs delete).

Has anyone done something like this, or is it simply A Bad Idea? How
many schemas can a database contain, until one hits a (hard or soft)
limit? Keep in mind that this is not a "regular" web-app with
thousands of users, but more in the range of 500-1000 for the
foreseeable future.

Regads,
Anders

Re: Schema per user?

From
Scott Marlowe
Date:
On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein <anders@steinlein.no> wrote:
> Hi,
>
> I'm pondering a design question for a subscription-based web-app we are
> developing. Would it be feasible to create a new schema per user account,
> setting the search_path to their own schema during login? There is no shared
> data (except where we keep a record of users), as each account is entirely
> self-contained. I would usually just put username into relevant tables and
> querying based on this, but I figured separate schemas *might* have a few
> advantages: Ease scaling by placing users across different tablespaces
> and/or databases, easier backup/restore of specific user's data and no
> vacuum when we delete accounts (just drop schema vs delete).
>
> Has anyone done something like this, or is it simply A Bad Idea? How many
> schemas can a database contain, until one hits a (hard or soft) limit? Keep
> in mind that this is not a "regular" web-app with thousands of users, but
> more in the range of 500-1000 for the foreseeable future.

We're looking at something similar here at work, but in the 10k to 10M
range of schemas.  I'll let you know how our testing goes.

1,000 is nothing in terms of schemas.  You should be fine.

Re: Schema per user?

From
Anders Steinlein
Date:
On May 7, 2009, at 10:05 PM, Scott Marlowe wrote:

> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein
> <anders@steinlein.no> wrote:
>> Hi,
>>
>> I'm pondering a design question for a subscription-based web-app we
>> are
>> developing. Would it be feasible to create a new schema per user
>> account,
>> setting the search_path to their own schema during login?
>>
>> [snip]
>
> We're looking at something similar here at work, but in the 10k to 10M
> range of schemas.  I'll let you know how our testing goes.
>
> 1,000 is nothing in terms of schemas.  You should be fine.

I'd be *very* interested to hear your experiences once you get some
results.

Generally though, what made you consider such a solution? Same
advantages as I mentioned? One thing I'm a bit usure of how best to
solve is where to place the "users" or some such table for
authentication and other "shared" info -- simply in the "public"
schema, perhaps?

\a.

Re: Schema per user?

From
Adam Ruth
Date:
I've actually done this before. I had a web app with about 400 users
each with their own schema. It actually worked very well, except for
one thing. There got to be so many tables that a pg_dump would fail
because it would run out of file locks. We got around it by creating a
primary table and then using views in each of the schemas to access
that user's data. It also made it easy to do a query against all users
at once in the primary table.


On 08/05/2009, at 5:45 AM, Anders Steinlein wrote:

> Hi,
>
> I'm pondering a design question for a subscription-based web-app we
> are developing. Would it be feasible to create a new schema per user
> account, setting the search_path to their own schema during login?
> There is no shared data (except where we keep a record of users), as
> each account is entirely self-contained. I would usually just put
> username into relevant tables and querying based on this, but I
> figured separate schemas *might* have a few advantages: Ease scaling
> by placing users across different tablespaces and/or databases,
> easier backup/restore of specific user's data and no vacuum when we
> delete accounts (just drop schema vs delete).
>
> Has anyone done something like this, or is it simply A Bad Idea? How
> many schemas can a database contain, until one hits a (hard or soft)
> limit? Keep in mind that this is not a "regular" web-app with
> thousands of users, but more in the range of 500-1000 for the
> foreseeable future.
>
> Regads,
> Anders
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: Schema per user?

From
Erik Jones
Date:
On May 7, 2009, at 1:05 PM, Scott Marlowe wrote:

> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein
> <anders@steinlein.no> wrote:
>> Hi,
>>
>> I'm pondering a design question for a subscription-based web-app we
>> are
>> developing. Would it be feasible to create a new schema per user
>> account,
>> setting the search_path to their own schema during login? There is
>> no shared
>> data (except where we keep a record of users), as each account is
>> entirely
>> self-contained. I would usually just put username into relevant
>> tables and
>> querying based on this, but I figured separate schemas *might* have
>> a few
>> advantages: Ease scaling by placing users across different
>> tablespaces
>> and/or databases, easier backup/restore of specific user's data and
>> no
>> vacuum when we delete accounts (just drop schema vs delete).
>>
>> Has anyone done something like this, or is it simply A Bad Idea?
>> How many
>> schemas can a database contain, until one hits a (hard or soft)
>> limit? Keep
>> in mind that this is not a "regular" web-app with thousands of
>> users, but
>> more in the range of 500-1000 for the foreseeable future.
>
> We're looking at something similar here at work, but in the 10k to 10M
> range of schemas.  I'll let you know how our testing goes.
>
> 1,000 is nothing in terms of schemas.  You should be fine.

One thing you'll notice a big change in is dump times from pg_dump
when compared to what you'd see from a db with the same size data set
but a smaller schema.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: Schema per user?

From
Tom Lane
Date:
Adam Ruth <adamruth@mac.com> writes:
> I've actually done this before. I had a web app with about 400 users
> each with their own schema. It actually worked very well, except for
> one thing. There got to be so many tables that a pg_dump would fail
> because it would run out of file locks. We got around it by creating a
> primary table and then using views in each of the schemas to access
> that user's data. It also made it easy to do a query against all users
> at once in the primary table.

Note that this is about how many tables you have, and has got nothing to
do with how many schemas they are in, but: the solution to that is to
increase max_locks_per_transaction.  The default value is kinda
conservative to avoid eating up too much shared memory.

            regards, tom lane

Re: Schema per user?

From
Scott Marlowe
Date:
On Thu, May 7, 2009 at 2:56 PM, Erik Jones <ejones@engineyard.com> wrote:
>
> On May 7, 2009, at 1:05 PM, Scott Marlowe wrote:
>
>> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein <anders@steinlein.no>
>> wrote:
>>>
>>> Hi,
>>>
>>> I'm pondering a design question for a subscription-based web-app we are
>>> developing. Would it be feasible to create a new schema per user account,
>>> setting the search_path to their own schema during login? There is no
>>> shared
>>> data (except where we keep a record of users), as each account is
>>> entirely
>>> self-contained. I would usually just put username into relevant tables
>>> and
>>> querying based on this, but I figured separate schemas *might* have a few
>>> advantages: Ease scaling by placing users across different tablespaces
>>> and/or databases, easier backup/restore of specific user's data and no
>>> vacuum when we delete accounts (just drop schema vs delete).
>>>
>>> Has anyone done something like this, or is it simply A Bad Idea? How many
>>> schemas can a database contain, until one hits a (hard or soft) limit?
>>> Keep
>>> in mind that this is not a "regular" web-app with thousands of users, but
>>> more in the range of 500-1000 for the foreseeable future.
>>
>> We're looking at something similar here at work, but in the 10k to 10M
>> range of schemas.  I'll let you know how our testing goes.
>>
>> 1,000 is nothing in terms of schemas.  You should be fine.
>
> One thing you'll notice a big change in is dump times from pg_dump when
> compared to what you'd see from a db with the same size data set but a
> smaller schema.

Yeah, we're already looking at segregating out pg_dumps by schemas /
ranges of schemas to keep things manageable.

Re: Schema per user?

From
Scott Marlowe
Date:
On Thu, May 7, 2009 at 2:12 PM, Anders Steinlein <anders@steinlein.no> wrote:
>
> On May 7, 2009, at 10:05 PM, Scott Marlowe wrote:
>
>> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein <anders@steinlein.no>
>> wrote:
>>>
>>> Hi,
>>>
>>> I'm pondering a design question for a subscription-based web-app we are
>>> developing. Would it be feasible to create a new schema per user account,
>>> setting the search_path to their own schema during login?
>>>
>>> [snip]
>>
>> We're looking at something similar here at work, but in the 10k to 10M
>> range of schemas.  I'll let you know how our testing goes.
>>
>> 1,000 is nothing in terms of schemas.  You should be fine.
>
> I'd be *very* interested to hear your experiences once you get some results.
>
> Generally though, what made you consider such a solution? Same advantages as
> I mentioned? One thing I'm a bit usure of how best to solve is where to
> place the "users" or some such table for authentication and other "shared"
> info -- simply in the "public" schema, perhaps?

We're looking at a "schema per group" fit for a certain application
and we have lot of groups (in the 100,000 to 1,000,000 range.)  We're
also looking at partitioning to multiple db servers if needs be.  It's
a compelling app, and schemas allow us to have one copy of the master
user data etc and the app just has to have a different search path and
viola, we're integrated.

Re: Schema per user?

From
Erik Jones
Date:
On May 7, 2009, at 2:06 PM, Scott Marlowe wrote:

> On Thu, May 7, 2009 at 2:12 PM, Anders Steinlein
> <anders@steinlein.no> wrote:
>>
>> On May 7, 2009, at 10:05 PM, Scott Marlowe wrote:
>>
>>> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein <anders@steinlein.no
>>> >
>>> wrote:
>>>>
>>>> Hi,
>>>>
>>>> I'm pondering a design question for a subscription-based web-app
>>>> we are
>>>> developing. Would it be feasible to create a new schema per user
>>>> account,
>>>> setting the search_path to their own schema during login?
>>>>
>>>> [snip]
>>>
>>> We're looking at something similar here at work, but in the 10k to
>>> 10M
>>> range of schemas.  I'll let you know how our testing goes.
>>>
>>> 1,000 is nothing in terms of schemas.  You should be fine.
>>
>> I'd be *very* interested to hear your experiences once you get some
>> results.
>>
>> Generally though, what made you consider such a solution? Same
>> advantages as
>> I mentioned? One thing I'm a bit usure of how best to solve is
>> where to
>> place the "users" or some such table for authentication and other
>> "shared"
>> info -- simply in the "public" schema, perhaps?
>
> We're looking at a "schema per group" fit for a certain application
> and we have lot of groups (in the 100,000 to 1,000,000 range.)  We're
> also looking at partitioning to multiple db servers if needs be.  It's
> a compelling app, and schemas allow us to have one copy of the master
> user data etc and the app just has to have a different search path and
> viola, we're integrated.

Actually, that does sound really interesting.  I could see using pl/
proxy to handle transparently accessing schemas regardless of what
actual db their on -- and I do think that once you get up to those #s
you're talking you're going to need to partition across multiple boxes.

Erik Jones, Database Administrator
Engine Yard
Support, Scalability, Reliability
866.518.9273 x 260
Location: US/Pacific
IRC: mage2k






Re: Schema per user?

From
David Kerr
Date:
On Thu, May 07, 2009 at 03:06:11PM -0600, Scott Marlowe wrote:
- On Thu, May 7, 2009 at 2:12 PM, Anders Steinlein <anders@steinlein.no> wrote:
- >
- > On May 7, 2009, at 10:05 PM, Scott Marlowe wrote:
- >
- >> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein <anders@steinlein.no>
- >> wrote:
- > Generally though, what made you consider such a solution? Same advantages as
- > I mentioned? One thing I'm a bit usure of how best to solve is where to
- > place the "users" or some such table for authentication and other "shared"
- > info -- simply in the "public" schema, perhaps?
-
- We're looking at a "schema per group" fit for a certain application
- and we have lot of groups (in the 100,000 to 1,000,000 range.)  We're
- also looking at partitioning to multiple db servers if needs be.  It's
- a compelling app, and schemas allow us to have one copy of the master
- user data etc and the app just has to have a different search path and
- viola, we're integrated.
-

Interesting, we were looking at something similar but dismissed it because it
seemed like a maintenance nightmare, instead we're planning on going with
partitioning.

From a programming aspect, we're using JPA, anyone know if you can set search_path
with JPA/JDBC?

Also, how do you plan to handle schema updates in that model, inheritence?

You don't have a concern with dealing with 100,000 * n tables?

My background is with oracle, and in general it would have cleanup issues with
tracking that many tables/segments. Does postgres just handle an insane amount
of tables better?

Thanks

David Kerr

Re: Schema per user?

From
Scott Marlowe
Date:
On Thu, May 7, 2009 at 4:59 PM, David Kerr <dmk@mr-paradox.net> wrote:
> On Thu, May 07, 2009 at 03:06:11PM -0600, Scott Marlowe wrote:
> - On Thu, May 7, 2009 at 2:12 PM, Anders Steinlein <anders@steinlein.no> wrote:
> - >
> - > On May 7, 2009, at 10:05 PM, Scott Marlowe wrote:
> - >
> - >> On Thu, May 7, 2009 at 1:45 PM, Anders Steinlein <anders@steinlein.no>
> - >> wrote:
> - > Generally though, what made you consider such a solution? Same advantages as
> - > I mentioned? One thing I'm a bit usure of how best to solve is where to
> - > place the "users" or some such table for authentication and other "shared"
> - > info -- simply in the "public" schema, perhaps?
> -
> - We're looking at a "schema per group" fit for a certain application
> - and we have lot of groups (in the 100,000 to 1,000,000 range.)  We're
> - also looking at partitioning to multiple db servers if needs be.  It's
> - a compelling app, and schemas allow us to have one copy of the master
> - user data etc and the app just has to have a different search path and
> - viola, we're integrated.
> -
>
> Interesting, we were looking at something similar but dismissed it because it
> seemed like a maintenance nightmare, instead we're planning on going with
> partitioning.

We're looking at both, eventually.  They're kind of two parts of the
same solution.  schemas make having many isolated users easy,
partitioning (across dbs with something like pl/proxy or in the app
layer) allows us to scale to millions or more schemas before things
start getting too slow.

> From a programming aspect, we're using JPA, anyone know if you can set search_path
> with JPA/JDBC?

Can you issue
set search_path=public,myschema;
as a command, just straight up sql?

We don't use java for much here, so I don't know.

> Also, how do you plan to handle schema updates in that model, inheritence?

Yeah, we're looking at using inheritance to allow us to update them
all at once.  But if we need to do more than that, we can always
automate it with scripts.

> You don't have a concern with dealing with 100,000 * n tables?

Of course I do.  But I kinda want to know when we start having issues.
 Our machines are pretty beefy, and we can double the memory to 64G
and will probably double (or more) the cores on them from 8 to 16 to
24 etc... over time.  So, as long as the catalog tables aren't truly
monstrous we should be ok.  But only benchmarking will really give us
an idea

> My background is with oracle, and in general it would have cleanup issues with
> tracking that many tables/segments. Does postgres just handle an insane amount
> of tables better?

I don't think pgsql can handle many more tables than oracle in terms
of performance, it's just way easier to maintain them, due to useful
features like inheritance and such.  And multi-element search_path
lets you mix > 1 schema for a user to see seamlessly.  you have to be
WAY more schema aware in writing an app in oracle, at least in the
older versions I've used.  Haven't touched 11g, and don't really plan
to unless I have to. :)

Re: Schema per user?

From
CaT
Date:
On Thu, May 07, 2009 at 09:45:30PM +0200, Anders Steinlein wrote:
> Has anyone done something like this, or is it simply A Bad Idea? How
> many schemas can a database contain, until one hits a (hard or soft)
> limit? Keep in mind that this is not a "regular" web-app with thousands
> of users, but more in the range of 500-1000 for the foreseeable future.

I'm doing something similarish for our blog server. We have a blog db and
a schema for each user. This was mainly for backup, restore and maintenance
reasons. Being able to take one dump of all the blogs and then selectively
rip them out by schema is nice.

Currently we're at 632 schemas and roles and climbing. No problems as yet.

--
  "A search of his car uncovered pornography, a homemade sex aid, women's
  stockings and a Jack Russell terrier."
    - http://www.news.com.au/story/0%2C27574%2C24675808-421%2C00.html