Thread: Big Tables vs. many Tables vs. many Databases
Hi there, I'm currently about to redesign a database which you could compare with a database for managing a library. Now this solution will not only manage one library, but 100 to 500 of them. Currently, eg. all the data about the inventory (books) is held in one table for all the libraries. Is it useful to spread this to one table for each library, by eg. giving it an id as a postfix? For one library, we currently need about 150 tables, so that would easily increase a lot if there would be a set of this tables for each library. On the other hand, there are only a very few tables (2-5), which are used by all libraries. All the rest does not interact with each other - and don't think about exchanging books betweens libs, as the library is only an example... One other solution would be to make one database for each library. What do you think of that? Does PostgreSQL has any problems with that much tables? Would it better to spread the data by databases? Thanks for your opinions, Dirk
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 PostgreSQL certainly has a limit somewhere, well I never hit it. I would never put them together into one database. If my math is correct you're talking up to 75000 tables. Even if the database can handle that many tables, why would you ? What happens if that one database gets corrupted somehow ? You also lose a bunch of ways to scale the application using that database. You'll need a humangous server, because normally if you handle 500 libraries there is at least 500 connections (assuming that each library wants to access the database). Depending on what your application is doing you might end up using a lot of memory. So maybe you want to spread that into several databases on several servers. Monolithic design is nice when it comes to maintenance - but it's actually very bad when it comes to failures. Think you have a hardware failure on the server (shit happens you know ...) - with everything in that one database on that one server you're dead in the water. If you spread that onto several machines you're usually better off. Except maybe you can afford something like a SUN HA E10k or similar, you know, something with redundancy,automatic replication etc. My $0.02 On Wednesday 18 February 2004 04:44 pm, Dirk Olbertz wrote: > Hi there, > > I'm currently about to redesign a database which you could compare with a > database for managing a library. Now this solution will not only manage one > library, but 100 to 500 of them. Currently, eg. all the data about the > inventory (books) is held in one table for all the libraries. > > Is it useful to spread this to one table for each library, by eg. giving it > an id as a postfix? > > For one library, we currently need about 150 tables, so that would easily > increase a lot if there would be a set of this tables for each library. On > the other hand, there are only a very few tables (2-5), which are used by > all libraries. All the rest does not interact with each other - and don't > think about exchanging books betweens libs, as the library is only an > example... > > One other solution would be to make one database for each library. What do > you think of that? Does PostgreSQL has any problems with that much tables? > Would it better to spread the data by databases? > > Thanks for your opinions, > Dirk > > > ---------------------------(end of broadcast)--------------------------- > TIP 7: don't forget to increase your free space map settings - -- UC - -- Open Source Solutions 4U, LLC 2570 Fleetwood Drive Phone: +1 650 872 2425 San Bruno, CA 94066 Cell: +1 650 302 2405 United States Fax: +1 650 872 2417 -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.3 (GNU/Linux) iD8DBQFANGwFjqGXBvRToM4RAsrMAKDAmQqrlqUEdbqA/2FjEsAQk6heMACfXGuI /dA2xWmt1ZiLmv9QNO+604U= =Uzxj -----END PGP SIGNATURE-----
On Thursday 19 February 2004 00:44, Dirk Olbertz wrote: > Hi there, > > I'm currently about to redesign a database which you could compare with a > database for managing a library. Now this solution will not only manage one > library, but 100 to 500 of them. Currently, eg. all the data about the > inventory (books) is held in one table for all the libraries. > > Is it useful to spread this to one table for each library, by eg. giving it > an id as a postfix? [snip] Uwe's raised some points about reliability - I'd suggest looking at how you want to use the data. You might want to look at multiple databases, but probably not multiple tables. With multiple databases, you could backup/restore individual client's data. On the other hand, if you want to view data across multiple "libraries" then one large table in one database is useful. Also think about how you will make changes to the structure of your database(s) - how much effort will it be to add a column in each case? -- Richard Huxton Archonet Ltd
> Is it useful to spread this to one table for each library, by eg. giving > it an id as a postfix? That sounds like a real PITA to me. One big database is the way to go, just put in a branch ID where you need it to seperate the inventories. This makes things much easier to scale; just put in one row for a new branch and you are in business, no messy creating of database and initializing them and linking those few shared tables... From my experience, there is no such thing as a big table, no matter how big it gets. The only thing you ever need to worry about is indexes, because that is how you find things and when using correct index strategies, Postgres is _fast_. Bas.
At 01:44 AM 2/19/2004 +0100, Dirk Olbertz wrote: >I'm currently about to redesign a database which you could compare with a >database for managing a library. Now this solution will not only manage one >library, but 100 to 500 of them. Currently, eg. all the data about the >inventory (books) is held in one table for all the libraries. > >Is it useful to spread this to one table for each library, by eg. giving it >an id as a postfix? If the libraries don't belong together (administrated and used independently) then it may be worth splitting them into different databases. If you are really managing ALL libraries together then keep them in the same database and even in the same tables. Of course you have to be careful when sharing tables - if you screw up, data could be exposed to the wrong parties.
olbertz.dirk@gmx.de ("Dirk Olbertz") wrote: > Hi there, > > I'm currently about to redesign a database which you could compare with a > database for managing a library. Now this solution will not only manage one > library, but 100 to 500 of them. Currently, eg. all the data about the > inventory (books) is held in one table for all the libraries. > > Is it useful to spread this to one table for each library, by eg. giving it > an id as a postfix? > > For one library, we currently need about 150 tables, so that would easily > increase a lot if there would be a set of this tables for each library. On > the other hand, there are only a very few tables (2-5), which are used by > all libraries. All the rest does not interact with each other - and don't > think about exchanging books betweens libs, as the library is only an > example... > > One other solution would be to make one database for each library. What do > you think of that? Does PostgreSQL has any problems with that much tables? > Would it better to spread the data by databases? The big challenge that you'll face (and this would be true with _any_ DBMS) if you separate things into separate databases is that the proliferation of connections will get pretty frightful, and there will be no option of having your application use connection pooling as the connections won't be the same. It would seem to me that having a "branch" field to identify which library a particular entry about would be The Way to split off the data. That in effect means having one database with 150 tables. If you _really_ want to separate things into having 150 tables for each library, then the appropriate approach would likely be to set up a schema for each library. That allows using one database, and not utterly preventing joining data from the separate libraries in cases where that would be useful... -- select 'cbbrowne' || '@' || 'cbbrowne.com'; http://cbbrowne.com/info/emacs.html "The beginning of wisdom for a [software engineer] is to recognize the difference between getting a program to work, and getting it right." -- M A Jackson, 1975