Hi,
Bill Moran wrote:
> I'm curious as to how Postgres-R would handle a situation where the
> constant throughput exceeded the processing speed of one of the nodes.
Well, what do you expect to happen? This case is easily detectable, but
I can only see two possible solutions: either stop the node that is too
slow or stop accepting new transactions for a while.
This technique is not meant to allow nodes to lag behind by several
thousand transactions - that is better avoided. Rather, it's
meant to decrease the commit delay necessary for synchronous replication.
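To make the two options concrete, here is a minimal Python sketch of such a policy. It is not how Postgres-R implements anything: the names `decide`, `soft_limit` and `hard_limit` are hypothetical, and combining both options (throttle first, drop the node only if the backlog keeps growing) is an assumption made for illustration.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"          # every node keeps up, carry on
    THROTTLE = "throttle"      # stop accepting new transactions for a while
    EVICT_NODE = "evict_node"  # stop the node that is too slow


def decide(backlog, soft_limit=100, hard_limit=1000):
    """Pick an action from per-node backlogs.

    backlog maps a node name to the number of transactions that node
    has received but not yet applied. Both limits are made-up values.
    """
    # The slowest node determines how the cluster has to react.
    worst_node = max(backlog, key=backlog.get)
    worst = backlog[worst_node]
    if worst > hard_limit:
        return Action.EVICT_NODE, worst_node
    if worst > soft_limit:
        return Action.THROTTLE, worst_node
    return Action.ACCEPT, None
```

With these hypothetical limits, a node that is 200 transactions behind would cause throttling, and one that is 5000 behind would be stopped.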
> I can see your system working if it's just spike loads and the slow
> nodes can catch up during slow periods, but I'm wondering about the
> scenarios where an admin has underestimated the hardware requirements
> and one or more nodes is unable to keep up.
Please keep in mind that replication per se does not speed up your
database; rather, it adds a layer of reliability, which *costs* some
performance. To increase the transactional throughput you would need to
add partitioning to the mix. Or you could try to make use of the gained
reliability and abandon WAL - you won't need that as long as at least
one replica is running - that should increase the single node's
throughput and therefore the cluster's throughput, too.
When replication meets partitioning and load balancing, you'll get into
a whole new world, where new trade-offs need to be considered. Some look
similar to those with RAID storage - probably Sequoia's term RAIDb isn't
bad at all.
Regards
Markus