Hello,
On Fri, 2005-12-09 at 08:47 -0500, Christopher Browne wrote:
> We *know* (particularly those of us that have had involvement in
> actually implementing replication systems used in production
> environments) that "user space" implementations of replication can
> function satisfactorily. We've implemented it.
While this might be true, allow me a sidenote: AFAIK the very first,
functional prototype we know of was Postgres-R for PostgreSQL 6.4.2 (1).
So the very same holds true for a replication solution integrated into
the backend: we know such an implementation can function satisfactorily.
As we mostly agree, the performance bottelneck is _not_ the CPU, but the
nodes interconnects (the network). Regarding communication between the
backends and the replication solution, performance isn't that much of an
issue, because the inter-node communication will allways be slower than
inter-process communication.
A different problem is how to distribute PostgreSQL with different
upcomming replication solutions. It seems to me that most people's main
concern is not being able to get a prebuilt PostgreSQL with
_just_one_replication_solution_that_works_(tm) For most users it really
doesn't matter _how_ exactly the solution technically got integrated.
This problem gets solved with hooks and preloading a library: you could
simply provide _one_ PostgreSQL package which provides hooks for
replication solutions. Those could then provide a package with their
library. This of course is only doable if the number of hooks is kept
low.
Regards
Markus
[1] pgreplication project on gborg:
http://gborg.postgresql.org/project/pgreplication/projdisplay.php