Re: Replication on the backend - Mailing list pgsql-hackers

From	J. Andrew Rogers
Subject	Re: Replication on the backend
Date	December 7, 2005 05:26:19
Msg-id	DC5354B1-808C-4E1A-9EDA-C7084C4914B1@neopolitan.com
In response to	Re: Replication on the backend (Gregory Maxwell <gmaxwell@gmail.com>)
List	pgsql-hackers

On Dec 6, 2005, at 9:09 PM, Gregory Maxwell wrote:
> Eh, why would light limited delay be any slower than a disk on FC the
> same distance away? :)
>
> In any case, performance of PG on iscsi is just fine. You can't blame
> the network... Doing multimaster replication is hard because the
> locking primitives that are fine on a simple multiprocessor system
> (with a VERY high bandwidth very low latency interconnect between
> processors) just don't work across a network, so you're left finding
> other methods and making them work...

Speed-of-light latency shows up pretty damn often in real networks,
even relatively local ones.  The number of people who wonder why a
transcontinental SLA of 10ms is not possible is astonishing.  Silicon
switch fabrics are sufficiently fast that most well-designed
networks are limited by how fast one can push photons through a
fiber, which is significantly slower than photons through a vacuum.
Those fabrics add latency measured in nanoseconds, which is
effectively zero for many networks that leave the system board.
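
As a back-of-the-envelope check on the 10ms point, here is a minimal
sketch of the propagation delay alone.  The refractive index of 1.47
and the 4000 km path length are my assumptions for illustration, not
measured figures, and real routes are longer than the straight-line
distance.

```python
# Propagation delay through optical fiber, ignoring switching and queuing.
C_VACUUM_KM_PER_S = 299_792.458   # speed of light in a vacuum
ASSUMED_FIBER_INDEX = 1.47        # assumption: typical index for silica fiber


def fiber_one_way_ms(distance_km, refractive_index=ASSUMED_FIBER_INDEX):
    """Return the one-way propagation delay in milliseconds."""
    speed_km_per_s = C_VACUUM_KM_PER_S / refractive_index
    return distance_km / speed_km_per_s * 1000.0


# Assumption: roughly 4000 km for a transcontinental path.
one_way = fiber_one_way_ms(4000)
round_trip = 2 * one_way
print(f"one-way: {one_way:.1f} ms, round trip: {round_trip:.1f} ms")
```

Under those assumptions the one-way delay alone is close to 20ms, so
a 10ms figure is out of reach whether it is read as one-way or round
trip, no matter how fast the switches are.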

Compared to a simple single-system SMP, a local cluster built on a
first-rate fabric will have about an order of magnitude higher
latency but very similar bandwidth.  On the other hand, at those
latencies you can increase the number of addressable processors with
that kind of bandwidth by an order of magnitude, so it is a bit of a
trade.  However, latency matters enough that one would have to be
a lot smarter about partitioning synchronization across that fabric,
even though one would lose nothing in the bandwidth department.

> But again, multimaster isn't hard because of some inherently
> slow property of networks.

Eh?  As far as I know, the difficulty of multi-master is almost
entirely a product of the latency of real networks: they are
too slow for scalable distributed locks.  SMP is little more than a
distributed lock manager implemented in silicon.  Therefore,
multi-master is hard in practice because we cannot drive networks fast
enough.  That said, current state-of-the-art network fabrics are
within an order of magnitude of SMP fabrics, so they could be
real contenders, particularly once you get north of 8-16 processors.
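
To make the latency argument concrete, here is a rough sketch of the
ceiling that round-trip time puts on a single contended lock.  It
assumes a simplified model in which each handoff of the lock costs at
least one round trip and nothing overlaps.  The latency figures are
hypothetical placeholders chosen only to show the order-of-magnitude
relationship, not measurements of any real system.

```python
# Upper bound on serialized handoffs of one contended lock, assuming each
# handoff costs at least one interconnect round trip (simplified model).


def max_lock_handoffs_per_s(round_trip_s):
    """Return the handoff ceiling per second for a given round-trip time."""
    return 1.0 / round_trip_s


# Hypothetical round-trip times in seconds, for illustration only.
ASSUMED_ROUND_TRIPS = {
    "SMP interconnect": 100e-9,
    "local cluster fabric": 1e-6,      # about 10x the SMP figure
    "transcontinental link": 40e-3,
}

for name, rtt in ASSUMED_ROUND_TRIPS.items():
    print(f"{name}: at most {max_lock_handoffs_per_s(rtt):,.0f} handoffs/s")
```

In this model the ceiling scales inversely with latency, so a fabric
within an order of magnitude of SMP gives up about a factor of ten on
a hot lock, while a wide-area link gives up several orders of
magnitude.  Bandwidth never enters into it.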

The really sweet potential is in Opteron system boards with  
Infiniband directly attached to HyperTransport.  At that level of  
bandwidth and latency, both per node and per switch fabric, the  
architecture possibilities start to become intriguing.

J. Andrew Rogers
