On Wed, May 9, 2012 at 7:34 PM, MauMau <maumau307@gmail.com> wrote:
>> I can't speak for other databases, but it's only natural to assume
>> that tps must drop. At minimum, you have to add the latency of
>> communication and remote sync operation to your transaction time. For
>> very short transactions this adds up to a lot of extra work relative
>> to the transaction itself.
>
>
> Yes, I understand it is natural for the response time of each transaction to
> double or more. But I think the throughput drop would be amortized among
> multiple simultaneous transactions. So, 50% throughput decrease seems
> unreasonable.

I'm pretty sure it depends a lot on the workload. Knowing the
methodology used to arrive at those figures is critical. Was the
throughput decrease measured against no replication, or against
asynchronous replication? How many clients were used? What was the
workload like? Was it CPU bound? I/O bound? Read-mostly?

We have asynchronous replication in production, and throughput has not
changed relative to no replication. I cannot see how making it
synchronous would change throughput, as it only adds waiting time on
the clients, but no extra work. I can only assume the test didn't use
enough clients to saturate the hardware under high-latency conditions,
or the clients were somehow experiencing application-specific
contention.
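
To make that amortization argument concrete, here's a back-of-the-envelope
Little's law sketch. All of the numbers (5 ms of local work, 5 ms of
sync-rep commit wait, 2000 tps of server capacity) are made-up assumptions
for illustration, not measurements:

    # Closed-loop clients: each waits for its commit before issuing the
    # next transaction, so offered load = clients / per-txn latency,
    # capped by what the hardware can actually execute.
    def throughput(clients, txn_time, sync_wait, capacity):
        offered = clients / (txn_time + sync_wait)
        return min(offered, capacity)

    for clients in (10, 20, 40):
        async_tps = throughput(clients, 0.005, 0.0, 2000)
        sync_tps = throughput(clients, 0.005, 0.005, 2000)
        print(clients, round(async_tps), round(sync_tps))
    # 10 clients: 2000 vs 1000 tps -> the extra wait halves throughput
    # 20 clients: 2000 vs 2000 tps -> with enough clients it's amortized
    # 40 clients: 2000 vs 2000 tps -> saturation, latency is irrelevant

With too few clients the added commit latency shows up directly as lost
throughput, but once there are enough concurrent clients to keep the
server saturated, the waiting is hidden and tps should be unchanged.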

I don't know the code, but knowing how synchronous replication works,
I would say any such drop under high concurrency would be a bug,
perhaps contention among the waiting processes or something similar,
that needs to be fixed.