Re: Replication - Mailing list pgsql-general
From | Craig Ringer |
---|---|
Subject | Re: Replication |
Date | |
Msg-id | 1245719379.32535.22.camel@tillium.localnet |
In response to | Re: Replication (Gerry Reno <greno@verizon.net>) |
Responses | Re: Replication |
List | pgsql-general |
On Mon, 2009-06-22 at 20:48 -0400, Gerry Reno wrote:

> > Anyway, you seem to be unaware that built-in replication for
> > PostgreSQL already is moving along, with an implementation that's just
> > not quite production quality yet, and might make into the next version
> > after 8.4 if things go well.
> No, I'm aware of this basic builtin replication. It was rather
> disappointing to see it moved out of the 8.4 release. We need something
> more that just basic master-slave replication which is all this simple
> builtin replication will provide. We need a real replication solution
> that can handle statement-based and row-based replication. Multi-master
> replication. Full cyclic replication chain setups. Simple master-slave
> just doesn't cut it.

Statement-based replication is, frankly, scary. Personally I'd only be
willing to use it if the database would guarantee to throw an exception
when any statement that may produce different results on master and
slave(s) was issued, like the limit-without-order-by case mentioned on
the MySQL replication docs.

Even then I don't really understand how it can produce consistent
replicas in the face of, say, two concurrent statements both pulling
values from a sequence. There would need to be some sort of side channel
to allow the master to tell the slave about how it allocated values from
the sequence.

My overall sentiment is "ick".

Re multi-master replication, out of interest: what needs does it satisfy
for you that master-slave doesn't?

- Scaling number of clients / read throughput in read-mostly workloads?
- Client-transparent fault-tolerance?
- ... ?

What limitations of master-slave replication with read-only slaves
present roadblocks for you?

- Client must connect to master for writes, otherwise master or slave,
  so must be more aware of connection management
- Client drivers have no way to transparently discover active master,
  must be told master hostname/ip
- ... ?
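
To make the sequence concern above concrete, here is a toy Python sketch. Everything in it is hypothetical: the `Sequence` class, the session names "A" and "B", and the assumption that statements reach the replication log in a different order than they drew from the sequence on the master. It models no real PostgreSQL or MySQL API; it only shows why replaying bare statements can diverge while shipping the allocated values cannot.

```python
class Sequence:
    """Toy stand-in for a database sequence (hypothetical, not a real API)."""

    def __init__(self, start=1):
        self._next = start

    def nextval(self):
        value = self._next
        self._next += 1
        return value


def run_statements(order):
    """Run one 'INSERT ... VALUES (nextval(seq), session)' per session,
    in the given order, against a fresh sequence."""
    seq = Sequence()
    table = {}
    for session in order:
        table[session] = seq.nextval()
    return table


def replay_with_side_channel(log):
    """Replay log entries that carry the value the master allocated."""
    return {session: value for session, value in log}


# Assumption: on the master, session A draws from the sequence first,
# but session B's statement is written to the replication log first.
master_execution_order = ["A", "B"]
statement_log_order = ["B", "A"]

master = run_statements(master_execution_order)

# Pure statement replay: the slave re-executes nextval() in log order.
slave_statement_based = run_statements(statement_log_order)

# Side channel: each log entry also carries the master's allocated value.
log_with_values = [(s, master[s]) for s in statement_log_order]
slave_with_side_channel = replay_with_side_channel(log_with_values)

print("master:                 ", master)
print("slave (statements only):", slave_statement_based)
print("slave (side channel):   ", slave_with_side_channel)
```

Under those assumptions the statement-only slave hands the two rows swapped sequence values, while the slave that is told the allocated values matches the master.
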

I personally find it difficult to understand how multi-master
replication can add much to throughput on write-heavy workloads. DBs are
often I/O limited after all, and if each master must write all the
others' changes you may not see much of a performance win in write heavy
environments.

So: I presume multi-master replication is useful mainly in read-mostly
workloads? Or do you expect throughput gains in write-heavy workloads
too?

If the latter, is it really multiple master replication you want rather
than a non-replica clustered database, where writes to one node don't
get replicated to the other nodes, they just get notified via some sort
of cache coherence protocol?

I guess my point is that personally I think it'd be helpful to know
_why_ you need more than what's on offer. What specific features pose
problems or would benefit you, how, and why. Etc.

> > That's probably why it's not on the survey--everybody knows that's
> > important and it's already being worked on actively.
> Ok, I just felt it should still be there. But, I hope development
> understands just how important good replication really is.

"development" appear to be well aware. They're also generally very
willing to accept help, testing, and users who're willing to trial early
efforts. Hint, hint.

Donations of paid developer time to work on a project you find to be
commercially important probably wouldn't go astray either.

--
Craig Ringer