Kieran <kieran@dunelm.org.uk> writes:
> My main requirements are:
> 1. Ability to store approx 200Gb of data, with about 5Gb of data
> changing per day.
Given sufficient iron, no problem.
> 2. Support for high number of concurrent short transactions under
> REPEATABLE READ transaction isolation with row-level locking (or
> equivalent optimistic concurrency control).
What do you consider a "high number"? I think we'd max out somewhere
on the order of a thousand simultaneous transactions (again, given
respectable iron).
> 3. Fast (i.e. < 5 mins) failover time to a constantly mirrored secondary
> database server.
People are doing this today using rserv (or better, the commercial
version available from PostgreSQL Inc). It's a bit of a pain in the
neck to work with, IMHO. Check the pgreplication mailing list for
ongoing work on better solutions.
> 4. Ability to perform continuous network backups such that in the event
> of both the primary database server and mirrored database server
> suffering total failure, no more than 1 hour of data is lost.
The only tool we have for this today is pg_dump, and as you say backing
up 200Gb every hour doesn't seem real promising. I do wonder though
why you don't just redefine the problem: why not mirror to two slaves
at dispersed locations?
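To put rough numbers on why an hourly full dump looks unpromising, here is a back-of-envelope sketch. The sizes come from the requirements quoted above; the decimal units (1 GB = 1000 MB) and the helper name mb_per_second are assumptions made for illustration, and the "changes only" figure counts just the 5Gb of changed data per day, not whatever volume a log-shipping or replication scheme would really generate.

```python
# Back-of-envelope transfer rates. Sizes are taken from the thread;
# decimal units (1 GB = 1000 MB) are an assumption.
TOTAL_GB = 200            # full database size
CHANGED_GB_PER_DAY = 5    # data changing per day
WINDOW_S = 3600           # one-hour data-loss limit


def mb_per_second(gigabytes: float, seconds: float) -> float:
    """Sustained rate in MB/s needed to move `gigabytes` within `seconds`."""
    return gigabytes * 1000 / seconds


# Rate needed to copy the whole database once per hour.
full_dump_rate = mb_per_second(TOTAL_GB, WINDOW_S)

# Average rate if only the changed data had to leave the machine.
incremental_rate = mb_per_second(CHANGED_GB_PER_DAY, 24 * 3600)

print(f"full dump every hour:  {full_dump_rate:.1f} MB/s")
print(f"shipping changes only: {incremental_rate:.3f} MB/s")
```

Under those assumptions the hourly full dump needs on the order of 56 MB/s sustained, while the changed data averages well under 0.1 MB/s, which is why mirroring or archiving changes is the more plausible route.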
There is work being done on point-in-time recovery (i.e., beefing up the
WAL facility to the point where WAL logs could usefully be archived).
That will eventually provide a more direct answer to your concern.
> I'd imagine it may be possible to satisfy 3. using file system level
> mirroring, but I'd appreciate it if someone could confirm this.
I wouldn't trust such an approach...
> My last question is somewhat pie-in-the-sky, but assuming that
> PostgreSQL cannot currently meet requirements 3 & 4 even with 3rd party
> solutions, what are people's gut reactions to whether a small team (e.g.
> 5-6) of experienced, full-time paid developers could add mirroring and
> incremental backup support to PostgreSQL within 18 months?
If you're thinking of bringing in people with no prior experience with
Postgres, I'd counsel not. The learning curve is too long. If you're
thinking of paying existing developers to work on this, I can name
several people who'd love to take your money ;-).
regards, tom lane