Thread: Big Tables vs. many Tables vs. many Databases

Big Tables vs. many Tables vs. many Databases

From

"Dirk Olbertz"

Date:

18 February 2004, 23:44:24

Hi there,

I'm currently about to redesign a database which you could compare with a
database for managing a library. Now this solution will not only manage one
library, but 100 to 500 of them. Currently, eg. all the data about the
inventory (books) is held in one table for all the libraries.

Is it useful to spread this to one table for each library, by eg. giving it
an id as a postfix?

For one library, we currently need about 150 tables, so that would easily
increase a lot if there would be a set of this tables for each library. On
the other hand, there are only a very few tables (2-5), which are used by
all libraries. All the rest does not interact with each other - and don't
think about exchanging books betweens libs, as the library is only an
example...

One other solution would be to make one database for each library. What do
you think of that? Does PostgreSQL has any problems with that much tables?
Would it better to spread the data by databases?

Thanks for your opinions,
  Dirk

Re: Big Tables vs. many Tables vs. many Databases

From

"Uwe C. Schroeder"

Date:

19 February 2004, 06:55:49

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

PostgreSQL certainly has a limit somewhere, well I never hit it.
I would never put them together into one database. If my math is correct
you're talking up to 75000 tables. Even if the database can handle that many
tables, why would you ? What happens if that one database gets corrupted
somehow ? You also lose a bunch of ways to scale the application using that
database. You'll need a humangous server, because normally if you handle 500
libraries there is at least 500 connections (assuming that each library wants
to access the database). Depending on what your application is doing you
might end up using a lot of memory. So maybe you want to spread that into
several databases on several servers. Monolithic design is nice when it comes
to maintenance - but it's actually very bad when it comes to failures. Think
you have a hardware failure on the server (shit happens you know ...) - with
everything in that one database on that one server you're dead in the water.
If you spread that onto several machines you're usually better off. Except
maybe you can afford something like a SUN HA E10k or similar, you know,
something with redundancy,automatic replication etc.

My $0.02

On Wednesday 18 February 2004 04:44 pm, Dirk Olbertz wrote:
> Hi there,
>
> I'm currently about to redesign a database which you could compare with a
> database for managing a library. Now this solution will not only manage one
> library, but 100 to 500 of them. Currently, eg. all the data about the
> inventory (books) is held in one table for all the libraries.
>
> Is it useful to spread this to one table for each library, by eg. giving it
> an id as a postfix?
>
> For one library, we currently need about 150 tables, so that would easily
> increase a lot if there would be a set of this tables for each library. On
> the other hand, there are only a very few tables (2-5), which are used by
> all libraries. All the rest does not interact with each other - and don't
> think about exchanging books betweens libs, as the library is only an
> example...
>
> One other solution would be to make one database for each library. What do
> you think of that? Does PostgreSQL has any problems with that much tables?
> Would it better to spread the data by databases?
>
> Thanks for your opinions,
>   Dirk
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 7: don't forget to increase your free space map settings

- --
    UC

- --
Open Source Solutions 4U, LLC    2570 Fleetwood Drive
Phone:  +1 650 872 2425        San Bruno, CA 94066
Cell:   +1 650 302 2405        United States
Fax:    +1 650 872 2417
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQFANGwFjqGXBvRToM4RAsrMAKDAmQqrlqUEdbqA/2FjEsAQk6heMACfXGuI
/dA2xWmt1ZiLmv9QNO+604U=
=Uzxj
-----END PGP SIGNATURE-----

Re: Big Tables vs. many Tables vs. many Databases

From

Richard Huxton

Date:

19 February 2004, 08:09:47

On Thursday 19 February 2004 00:44, Dirk Olbertz wrote:
> Hi there,
>
> I'm currently about to redesign a database which you could compare with a
> database for managing a library. Now this solution will not only manage one
> library, but 100 to 500 of them. Currently, eg. all the data about the
> inventory (books) is held in one table for all the libraries.
>
> Is it useful to spread this to one table for each library, by eg. giving it
> an id as a postfix?
[snip]

Uwe's raised some points about reliability - I'd suggest looking at how you
want to use the data.

You might want to look at multiple databases, but probably not multiple
tables. With multiple databases, you could backup/restore individual client's
data.

On the other hand, if you want to view data across multiple "libraries" then
one large table in one database is useful.

Also think about how you will make changes to the structure of your
database(s) - how much effort will it be to add a column in each case?

--
  Richard Huxton
  Archonet Ltd

Re: Big Tables vs. many Tables vs. many Databases

From

"Bas Scheffers"

Date:

19 February 2004, 09:11:26

> Is it useful to spread this to one table for each library, by eg. giving
> it an id as a postfix?
That sounds like a real PITA to me. One big database is the way to go,
just put in a branch ID where you need it to seperate the inventories.
This makes things much easier to scale; just put in one row for a new
branch and you are in business, no messy creating of database and
initializing them and linking those few shared tables...

From my experience, there is no such thing as a big table, no matter how
big it gets. The only thing you ever need to worry about is indexes,
because that is how you find things and when using correct index
strategies, Postgres is _fast_.

Bas.

Re: Big Tables vs. many Tables vs. many Databases

From

Lincoln Yeoh

Date:

19 February 2004, 15:47:00

At 01:44 AM 2/19/2004 +0100, Dirk Olbertz wrote:
>I'm currently about to redesign a database which you could compare with a
>database for managing a library. Now this solution will not only manage one
>library, but 100 to 500 of them. Currently, eg. all the data about the
>inventory (books) is held in one table for all the libraries.
>
>Is it useful to spread this to one table for each library, by eg. giving it
>an id as a postfix?

If the libraries don't belong together (administrated and used
independently) then it may be worth splitting them into different databases.

If you are really managing ALL libraries together then keep them in the
same database and even in the same tables.

Of course you have to be careful when sharing tables - if you screw up,
data could be exposed to the wrong parties.

Re: Big Tables vs. many Tables vs. many Databases

From

Christopher Browne

Date:

20 February 2004, 19:08:48

olbertz.dirk@gmx.de ("Dirk Olbertz") wrote:
> Hi there,
>
> I'm currently about to redesign a database which you could compare with a
> database for managing a library. Now this solution will not only manage one
> library, but 100 to 500 of them. Currently, eg. all the data about the
> inventory (books) is held in one table for all the libraries.
>
> Is it useful to spread this to one table for each library, by eg. giving it
> an id as a postfix?
>
> For one library, we currently need about 150 tables, so that would easily
> increase a lot if there would be a set of this tables for each library. On
> the other hand, there are only a very few tables (2-5), which are used by
> all libraries. All the rest does not interact with each other - and don't
> think about exchanging books betweens libs, as the library is only an
> example...
>
> One other solution would be to make one database for each library. What do
> you think of that? Does PostgreSQL has any problems with that much tables?
> Would it better to spread the data by databases?

The big challenge that you'll face (and this would be true with _any_
DBMS) if you separate things into separate databases is that the
proliferation of connections will get pretty frightful, and there will
be no option of having your application use connection pooling as the
connections won't be the same.

It would seem to me that having a "branch" field to identify which
library a particular entry about would be The Way to split off the
data.

That in effect means having one database with 150 tables.

If you _really_ want to separate things into having 150 tables for
each library, then the appropriate approach would likely be to set up
a schema for each library.  That allows using one database, and not
utterly preventing joining data from the separate libraries in cases
where that would be useful...
--
select 'cbbrowne' || '@' || 'cbbrowne.com';
http://cbbrowne.com/info/emacs.html
"The beginning of wisdom for a [software engineer] is to recognize the
difference between getting a program  to work, and getting it  right."
-- M A Jackson, 1975