On 03/08/12, Menelaos Perdikeas, Semantix (mperdikeas.semantix@gmail.com) wrote:
> I would like to know what are the best practices / common patterns (or
> pointers to such) for using PostgreSQL in the context of a "big" web
> application with substantial data per user.
...
> [1] use just one database and schema and logically segregate companies'
> data by having all tables include a client_id column as part of their
> primary key.
> [2] use multiple databases (in the same server instance) and only the
> public schema in each of them for each customer's data.
> [3] use one database and multiple schemas to separate the different
> customers' data.
We have about 75 clients with 10-50GB of data each, running two main
services. Early on we decided to give each client their own database.
This has the following advantages:
- backup per client is really easy (see the dump loop sketched after this list)
- moving databases between db servers is easy
- a sequential pg_dump over all the dbs in a server runs pretty quickly
- testing on a subset of dbs works well
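As a rough sketch (not our exact scripts), a per-client dump loop can look
like the following; the backup directory, file naming, and passwordless
access via .pgpass are assumptions for illustration:

    #!/bin/bash
    # Dump every non-template database on this instance, one file per client.
    BACKUP_DIR=/var/backups/pg
    DATE=$(date +%Y%m%d)
    for db in $(psql -d postgres -At -c "SELECT datname FROM pg_database
                                         WHERE NOT datistemplate
                                           AND datname <> 'postgres'"); do
        # custom-format dumps keep each client's backup compact and
        # independently restorable with pg_restore
        pg_dump -Fc "$db" > "$BACKUP_DIR/${db}_${DATE}.dump"
    done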
Problems:
- the team needs to get really good at running upgrades across all dbs
(see the sketch after this list)
- if you need to do cross-db work, you have to start looking at
cross-database tools or messaging, e.g. AMQP
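To give an idea of what "upgrades across all dbs" means in practice, here
is a hedged sketch; the migration.sql file name is an illustrative
assumption, and a real script should also record which databases have
already been migrated:

    #!/bin/bash
    # Apply one schema migration to every client database in turn.
    set -e
    for db in $(psql -d postgres -At -c "SELECT datname FROM pg_database
                                         WHERE NOT datistemplate
                                           AND datname <> 'postgres'"); do
        echo "upgrading $db"
        # stop immediately if the migration fails on any database
        psql -d "$db" -v ON_ERROR_STOP=1 -f migration.sql
    done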
Apart from the issue of dealing with cross-database queries (if you
require them), I firmly believe this is the way to go. If you are using
relatively small databases like ours, you can scale when you need to and
use commodity hardware instead of needing something exotic. Put this
together with the new streaming replication features, and you have a
robust and easily scalable solution.
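For completeness, the streaming replication side looks roughly like this;
hostnames, the replication role, and the data directory are made up for
illustration, and the details vary by PostgreSQL version (this assumes 9.1+):

    # On the primary, in postgresql.conf:
    #   wal_level = hot_standby
    #   max_wal_senders = 3
    # and in pg_hba.conf, allow the standby to connect for replication:
    #   host  replication  repluser  192.168.1.20/32  md5

    # On the standby, clone the primary's data directory:
    pg_basebackup -h primary.example.com -U repluser -D /var/lib/postgresql/9.1/main
    # then create a recovery.conf with primary_conninfo pointing at the
    # primary, set hot_standby = on, and start the standby.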