Thread: Design question: Scalability and tens of thousands of tables?
Hi everybody. Quick question: I'm designing an application that will allow users to create custom objects on the application level. A custom object can have zero or more attributes. Attributes can be one of 5-10 types (ex. String, Number, List, Date, Time, Currency, etc...). This will allow users to track anything exactly as they want. My first thought on how to structure this is to make a custom table in the db for each custom object. The attributes would map to fields and everything would work just dandy. The problem is I am very concerned about scalability with having a different table created for each custom object. I want to design to site to handle tens of thousands of users. If each user has 3-5 custom objects the database would have to handle tens of thousands of tables. So it would appear that this is a broken solution and will not scale. Has anyone designed a similar system or have ideas to share? Thanks, Zack
On Nov 4, 2005, at 2:58 , zackchandler wrote: > The problem is I am very concerned about scalability with having a > different > table created for each custom object. I want to design to site to > handle > tens of thousands of users. If each user has 3-5 custom objects the > database would have to handle tens of thousands of tables. > > So it would appear that this is a broken solution and will not > scale. Has > anyone designed a similar system or have ideas to share? One resource you may want to look at is Joe Conway's site: http://www.joeconway.com/ He has a PDF available for download that may help you accomplish this: OSCON 2004 presentation: Flexible Data Acquisition and Analysis Joe's an active member of the PostgreSQL community and there's a lot of good stuff there. Hope this helps. Michael Glaesemann grzm myrealbox com
On Nov 4, 2005, at 9:47 AM, pgsql-sql-owner@postgresql.org wrote: > The problem is I am very concerned about scalability with having a > different > table created for each custom object. I want to design to site to > handle > tens of thousands of users. If each user has 3-5 custom objects the > database would have to handle tens of thousands of tables. > > So it would appear that this is a broken solution and will not > scale. Has > anyone designed a similar system or have ideas to share? Well, before you discount it, did you try out your design? You could do initial segregation of user's tables into separate schemas (say, schema 'a' -> 'z' according to username or some better hashing routine like brute-force round-robin assignment at user creation time). Assignment of objects -> schema would be one additional column in your centralized user directory table(s). ---- James Robinson Socialserve.com
> On Nov 4, 2005, at 9:47 AM, pgsql-sql-owner@postgresql.org wrote: > >> The problem is I am very concerned about scalability with having a >> different >> table created for each custom object. I want to design to site to >> handle >> tens of thousands of users. If each user has 3-5 custom objects the >> database would have to handle tens of thousands of tables. >> >> So it would appear that this is a broken solution and will not >> scale. Has >> anyone designed a similar system or have ideas to share? > > Well, before you discount it, did you try out your design? You could > do initial segregation of user's tables into separate schemas (say, > schema 'a' -> 'z' according to username or some better hashing > routine like brute-force round-robin assignment at user creation > time). Assignment of objects -> schema would be one additional column > in your centralized user directory table(s). I don't imagine this would help much with the *true* problem, which is that a lot of queries inside the DB would have tens of thousands of tables to go thru rather than (say) hundreds. If you have 10,000 tables, that means 10,000 entries in pg_class. Associating them with 26 (or 260) namespaces does nothing to assist any queries that still have to scan through 10K pg_class entries. -- (reverse (concatenate 'string "moc.liamg" "@" "enworbbc")) http://linuxfinances.info/info/wp.html "Whenever you find that you are on the side of the majority, it is time to reform." -- Mark Twain
On Thu, 3 Nov 2005 09:58:29 -0800 "zackchandler" <zackchandler@hotmail.com> threw this fish to the penguins: > I'm designing an application that will allow users to create custom objects > on the application level. A custom object can have zero or more attributes. > Attributes can be one of 5-10 types (ex. String, Number, List, Date, Time, > Currency, etc...). This will allow users to track anything exactly as they > want. My first thought on how to structure this is to make a custom table > in the db for each custom object. The attributes would map to fields and > everything would work just dandy. > > The problem is I am very concerned about scalability with having a different > table created for each custom object. I want to design to site to handle > tens of thousands of users. If each user has 3-5 custom objects the > database would have to handle tens of thousands of tables. > > So it would appear that this is a broken solution and will not scale. Has > anyone designed a similar system or have ideas to share? This is a sticky problem. My app is a bit similar. Trying something like your solution, I found that zillions of tables, constantly created and destroyed by users, to be terrible to manage. Now I use one table that defines objects' attributes, e.g. something like: create table fields(owner text, obj text, name text, type text, seq int) create table objs(owner text, name text) createtable values(owner text, obj text, name text, val text) That is, the values are stored in text type, not the native type. Yes, this takes a performance hit for conversion of values, but the simplicity of schema really wins for me. I suggest you seriously consider it unless you need blinding performance in all 20,000 applications... -- George Young -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL)
I have thought of this as a possible solution. Unfortunately I want to do reporting (sum, avg, ect.) queries on this data (for each user) and I can't imagine that being feasible with casting that many columns. Zack -----Original Message----- From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of george young Sent: Tuesday, November 08, 2005 6:15 PM To: pgsql-sql@postgresql.org Subject: Re: [SQL] Design question: Scalability and tens of thousands of On Thu, 3 Nov 2005 09:58:29 -0800 "zackchandler" <zackchandler@hotmail.com> threw this fish to the penguins: > I'm designing an application that will allow users to create custom objects > on the application level. A custom object can have zero or more attributes. > Attributes can be one of 5-10 types (ex. String, Number, List, Date, Time, > Currency, etc...). This will allow users to track anything exactly as they > want. My first thought on how to structure this is to make a custom table > in the db for each custom object. The attributes would map to fields and > everything would work just dandy. > > The problem is I am very concerned about scalability with having a different > table created for each custom object. I want to design to site to handle > tens of thousands of users. If each user has 3-5 custom objects the > database would have to handle tens of thousands of tables. > > So it would appear that this is a broken solution and will not scale. Has > anyone designed a similar system or have ideas to share? This is a sticky problem. My app is a bit similar. Trying something like your solution, I found that zillions of tables, constantly created and destroyed by users, to be terrible to manage. Now I use one table that defines objects' attributes, e.g. something like: create table fields(owner text, obj text, name text, type text, seq int) create table objs(owner text, name text) createtable values(owner text, obj text, name text, val text) That is, the values are stored in text type, not the native type. Yes, this takes a performance hit for conversion of values, but the simplicity of schema really wins for me. I suggest you seriously consider it unless you need blinding performance in all 20,000 applications... -- George Young -- "Are the gods not just?" "Oh no, child. What would become of us if they were?" (CSL) ---------------------------(end of broadcast)--------------------------- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypesdo not match