Re: Geographic High-Availability/Replication - Mailing list pgsql-general

From Markus Schiltknecht
Subject Re: Geographic High-Availability/Replication
Msg-id 46D1B296.1020407@bluegap.ch
In response to Re: Geographic High-Availability/Replication  (Bill Moran <wmoran@potentialtech.com>)
Responses Re: Geographic High-Availability/Replication  (Bill Moran <wmoran@potentialtech.com>)
List pgsql-general
Hi,

Bill Moran wrote:
> I'm curious as to how Postgres-R would handle a situation where the
> constant throughput exceeded the processing speed of one of the nodes.

Well, what do you expect to happen? This case is easily detectable, but
I can only see two possible solutions: either stop the node which is too
slow or stop accepting new transactions for a while.

This technique is not meant to allow nodes to lag behind by several
thousand transactions - that is better avoided. Rather, it's
meant to decrease the commit delay necessary for synchronous replication.

> I can see your system working if it's just spike loads and the slow
> nodes can catch up during slow periods, but I'm wondering about the
> scenarios where an admin has underestimated the hardware requirements
> and one or more nodes is unable to keep up.

Please keep in mind that replication per se does not speed your
database up; rather, it adds a layer of reliability, which *costs* some
performance. To increase the transactional throughput you would need to
add partitioning to the mix. Or you could try to make use of the gained
reliability and abandon WAL - you won't need it as long as at least
one replica is running - which should increase the single node's
throughput and therefore the cluster's throughput, too.
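
To make the point about throughput concrete, here is a deliberately simplified model. It assumes that every replica must apply every write, that replication overhead is zero, and that with partitioning each transaction touches exactly one partition. The function names and the numbers are hypothetical.

```python
def replicated_throughput(node_capacities):
    """Full replication: every node applies every write, so the sustained
    write rate is bounded by the slowest node."""
    return min(node_capacities)


def partitioned_throughput(partitions):
    """Partitioning: each partition is its own replicated group, and the
    groups work on disjoint data, so their rates add up."""
    return sum(replicated_throughput(group) for group in partitions)


# Three replicas of everything: limited by the 300 tx/s node.
single_group = replicated_throughput([1000, 800, 300])

# Two partitions with two replicas each: 800 + 300 tx/s.
two_groups = partitioned_throughput([[1000, 800], [300, 400]])
```

In this model adding replicas never raises the write rate; only splitting the data does, which is why partitioning has to enter the picture.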

When replication meets partitioning and load balancing, you'll get into
a whole new world, where new trade-offs need to be considered. Some look
similar to those with RAID storage - probably Sequoia's term RAIDb isn't
bad at all.

Regards

Markus

