Thread: Design question: Scalability and tens of thousands of tables?

Design question: Scalability and tens of thousands of tables?

From
"zackchandler"
Date:
Hi everybody.  Quick question:

I'm designing an application that will allow users to create custom objects
on the application level.  A custom object can have zero or more attributes.
Attributes can be one of 5-10 types (ex. String, Number, List, Date, Time,
Currency, etc...). This will allow users to track anything exactly as they
want.  My first thought on how to structure this is to make a custom table
in the db for each custom object.  The attributes would map to fields and
everything would work just dandy.

The problem is I am very concerned about scalability with having a different
table created for each custom object.  I want to design to site to handle
tens of thousands of users.  If each user has 3-5 custom objects the
database would have to handle tens of thousands of tables.

So it would appear that this is a broken solution and will not scale.  Has
anyone designed a similar system or have ideas to share?

Thanks,
Zack


Re: Design question: Scalability and tens of thousands of tables?

From
Michael Glaesemann
Date:
On Nov 4, 2005, at 2:58 , zackchandler wrote:

> The problem is I am very concerned about scalability with having a  
> different
> table created for each custom object.  I want to design to site to  
> handle
> tens of thousands of users.  If each user has 3-5 custom objects the
> database would have to handle tens of thousands of tables.
>
> So it would appear that this is a broken solution and will not  
> scale.  Has
> anyone designed a similar system or have ideas to share?

One resource you may want to look at is Joe Conway's site:
http://www.joeconway.com/

He has a PDF available for download that may help you accomplish this:
OSCON 2004 presentation: Flexible Data Acquisition and Analysis

Joe's an active member of the PostgreSQL community and there's a lot  
of good stuff there.

Hope this helps.

Michael Glaesemann
grzm myrealbox com





Re: Design question: Scalability and tens of thousands of tables?

From
James Robinson
Date:
On Nov 4, 2005, at 9:47 AM, pgsql-sql-owner@postgresql.org wrote:

> The problem is I am very concerned about scalability with having a  
> different
> table created for each custom object.  I want to design to site to  
> handle
> tens of thousands of users.  If each user has 3-5 custom objects the
> database would have to handle tens of thousands of tables.
>
> So it would appear that this is a broken solution and will not  
> scale.  Has
> anyone designed a similar system or have ideas to share?

Well, before you discount it, did you try out your design? You could  
do initial segregation of user's tables into separate schemas (say,  
schema 'a' -> 'z' according to username or some better hashing  
routine like brute-force round-robin assignment at user creation  
time). Assignment of objects -> schema would be one additional column  
in your centralized user directory table(s).

----
James Robinson
Socialserve.com



Re: Design question: Scalability and tens of thousands of tables?

From
Christopher Browne
Date:
> On Nov 4, 2005, at 9:47 AM, pgsql-sql-owner@postgresql.org wrote:
>
>> The problem is I am very concerned about scalability with having a
>> different
>> table created for each custom object.  I want to design to site to
>> handle
>> tens of thousands of users.  If each user has 3-5 custom objects the
>> database would have to handle tens of thousands of tables.
>>
>> So it would appear that this is a broken solution and will not
>> scale.  Has
>> anyone designed a similar system or have ideas to share?
>
> Well, before you discount it, did you try out your design? You could
> do initial segregation of user's tables into separate schemas (say,
> schema 'a' -> 'z' according to username or some better hashing
> routine like brute-force round-robin assignment at user creation
> time). Assignment of objects -> schema would be one additional column
> in your centralized user directory table(s).

I don't imagine this would help much with the *true* problem, which is
that a lot of queries inside the DB would have tens of thousands of
tables to go thru rather than (say) hundreds.

If you have 10,000 tables, that means 10,000 entries in pg_class.

Associating them with 26 (or 260) namespaces does nothing to assist
any queries that still have to scan through 10K pg_class entries.
-- 
(reverse (concatenate 'string "moc.liamg" "@" "enworbbc"))
http://linuxfinances.info/info/wp.html
"Whenever you  find that you  are on the  side of the majority,  it is
time to reform." -- Mark Twain


Re: Design question: Scalability and tens of thousands of

From
george young
Date:
On Thu, 3 Nov 2005 09:58:29 -0800
"zackchandler" <zackchandler@hotmail.com> threw this fish to the penguins:
> I'm designing an application that will allow users to create custom objects
> on the application level.  A custom object can have zero or more attributes.
> Attributes can be one of 5-10 types (ex. String, Number, List, Date, Time,
> Currency, etc...). This will allow users to track anything exactly as they
> want.  My first thought on how to structure this is to make a custom table
> in the db for each custom object.  The attributes would map to fields and
> everything would work just dandy.
> 
> The problem is I am very concerned about scalability with having a different
> table created for each custom object.  I want to design to site to handle
> tens of thousands of users.  If each user has 3-5 custom objects the
> database would have to handle tens of thousands of tables.
> 
> So it would appear that this is a broken solution and will not scale.  Has
> anyone designed a similar system or have ideas to share?

This is a sticky problem.  My app is a bit similar.  Trying something
like your solution, I found that zillions of tables, constantly created
and destroyed by users, to be terrible to manage.  Now I use one table
that defines objects' attributes, e.g. something like:
 create table fields(owner text, obj text, name text, type text, seq int) create table objs(owner text, name text)
createtable values(owner text, obj text, name text, val text)
 

That is, the values are stored in text type, not the native type.
Yes, this takes a performance hit for conversion of values, but the
simplicity of schema really wins for me.  I suggest you seriously consider
it unless you need blinding performance in all 20,000 applications...

-- George Young
-- 
"Are the gods not just?"  "Oh no, child.
What would become of us if they were?" (CSL)


Re: Design question: Scalability and tens of thousands of

From
"Zack Chandler"
Date:
I have thought of this as a possible solution.  Unfortunately I want to do
reporting (sum, avg, ect.) queries on this data (for each user) and I can't
imagine that being feasible with casting that many columns.

Zack


-----Original Message-----
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org]
On Behalf Of george young
Sent: Tuesday, November 08, 2005 6:15 PM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Design question: Scalability and tens of thousands of

On Thu, 3 Nov 2005 09:58:29 -0800
"zackchandler" <zackchandler@hotmail.com> threw this fish to the penguins:
> I'm designing an application that will allow users to create custom
objects
> on the application level.  A custom object can have zero or more
attributes.
> Attributes can be one of 5-10 types (ex. String, Number, List, Date, Time,
> Currency, etc...). This will allow users to track anything exactly as they
> want.  My first thought on how to structure this is to make a custom table
> in the db for each custom object.  The attributes would map to fields and
> everything would work just dandy.
>
> The problem is I am very concerned about scalability with having a
different
> table created for each custom object.  I want to design to site to handle
> tens of thousands of users.  If each user has 3-5 custom objects the
> database would have to handle tens of thousands of tables.
>
> So it would appear that this is a broken solution and will not scale.  Has
> anyone designed a similar system or have ideas to share?

This is a sticky problem.  My app is a bit similar.  Trying something
like your solution, I found that zillions of tables, constantly created
and destroyed by users, to be terrible to manage.  Now I use one table
that defines objects' attributes, e.g. something like:
 create table fields(owner text, obj text, name text, type text, seq int) create table objs(owner text, name text)
createtable values(owner text, obj text, name text, val text) 

That is, the values are stored in text type, not the native type.
Yes, this takes a performance hit for conversion of values, but the
simplicity of schema really wins for me.  I suggest you seriously consider
it unless you need blinding performance in all 20,000 applications...

-- George Young
--
"Are the gods not just?"  "Oh no, child.
What would become of us if they were?" (CSL)

---------------------------(end of broadcast)---------------------------
TIP 9: In versions below 8.0, the planner will ignore your desire to      choose an index scan if your joining column's
datatypesdo not      match