Thread: [pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

[pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

From
DEV_OPS
Date:
Hi all

We are going to implement replication over RDMA with InfiniBand.

The use case targets high-end OLTP applications that require NO DATA
LOSS between the master and slaves located in different sites or
cities, with HIGH-PERFORMANCE REPLICATION at very LOW LATENCY (on the
order of 'ns' or 'us', not 'ms' or 's'), and that are robust, reliable,
and secure.
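
As a rough check on the latency target above, here is a minimal sketch of the floor that signal propagation alone puts on a synchronous round trip between sites. It assumes light travels through optical fiber at about 200,000 km/s (roughly two thirds of its speed in vacuum). The distances are hypothetical and do not come from this thread, and switch, adapter, and WAL flush time are ignored.

```python
# Lower bound on round-trip time imposed by fiber propagation alone.
# Assumption: signal speed in optical fiber is about 200,000 km/s.
FIBER_KM_PER_SECOND = 200_000.0


def min_round_trip_us(distance_km: float) -> float:
    """Return the minimum round-trip time in microseconds for a fiber run."""
    one_way_seconds = distance_km / FIBER_KM_PER_SECOND
    return 2 * one_way_seconds * 1_000_000


if __name__ == "__main__":
    for km in (1, 10, 100):  # hypothetical site-to-site distances
        rtt = min_round_trip_us(km)
        print(f"{km:>4} km: at least {rtt:.0f} us per round trip")
```

Under that assumption every kilometre of fiber adds about 10 us of round-trip delay, so a synchronous commit between sites is bounded by distance before the transport (TCP/IP or RDMA) is even considered.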


We are writing here to ask for help.

For the architecture design, in order to achieve the above goals, our
questions are: what needs to be considered, and which components or
subsystems will be affected? I think that simply replacing the TCP/IP
replication stack will not help much.

Would you please advise?

best wishes

-TY






Re: [pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

From
Greg Sabino Mullane
Date:
On Fri, Mar 10, 2017 at 12:11:11PM +0800, DEV_OPS wrote:
> We are going to implement replication over RDMA with InfiniBand.
>
> The use case targets high-end OLTP applications that require NO DATA
> LOSS between the master and slaves located in different sites or
> cities, with HIGH-PERFORMANCE REPLICATION at very LOW LATENCY (on the
> order of 'ns' or 'us', not 'ms' or 's'), and that are robust, reliable,
> and secure.

A few more details are needed before we would be able to help you out.
What exactly are your goals? Are you trying to implement this over
Postgres' streaming replication system?

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 2529 DF6A B8F7 9407 E944  45B4 BC9B 9067 1496 4AC8


Re: [pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

From
"bruno@itopsolutions.com"
Date:
Can somebody recommend a good page that explains how to implement
replication across different locations and cities with PostgreSQL?
Thanks


Re: [pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

From
Greg Sabino Mullane
Date:
On Sun, Mar 19, 2017 at 02:52:29PM -0400, bruno@itopsolutions.com wrote:
> Can somebody recommend a good page that explains how to implement
> replication across different locations and cities with PostgreSQL?

This provides a decent overview:

https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling



--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 2529 DF6A B8F7 9407 E944  45B4 BC9B 9067 1496 4AC8


Re: [pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

From
"bruno@itopsolutions.com"
Date:
Thank you!




On 3/20/17 02:47, Greg Sabino Mullane wrote:
> On Fri, Mar 10, 2017 at 12:11:11PM +0800, DEV_OPS wrote:
>> We are going to implement replication over RDMA with InfiniBand.
>>
>> The use case targets high-end OLTP applications that require NO DATA
>> LOSS between the master and slaves located in different sites or
>> cities, with HIGH-PERFORMANCE REPLICATION at very LOW LATENCY (on the
>> order of 'ns' or 'us', not 'ms' or 's'), and that are robust, reliable,
>> and secure.
> A few more details are needed before we would be able to help you out.
> What exactly are your goals? Are you trying to implement this over
> Postgres' streaming replication system?
Yes, we are going to implement this over Postgres' streaming replication
system.
The use case is DR, e.g. synchronization between two datacenters in the
same city that are connected by an FC link. (FC can now be used to extend
an IB network, so we are going to implement native RDMA replication for
low latency in that case. This is very important for some critical
systems. We would also prefer this replication to be based on Postgres'
streaming replication.)

So we are writing here for advice and help with the design and
implementation.

We are planning the following features:

1.) Compatibility with the current IP-based streaming replication
system, plus native IB replication, with a parameter that selects
whether replication uses IP or IB.

2.) The IB-based replication will work alongside the current IP-based
replication, meaning that one replication setup may contain both IB and
IP links.
Case 1 is a cascade: NodeA --IB--> NodeB --IP--> NodeC. NodeA to NodeB
is IB-based replication, which provides low latency, and NodeB to NodeC
is the original IP-based replication.
Case 2 is a fan-out: NodeA --IB--> NodeB, NodeA --IP--> NodeC. NodeA to
NodeB is IB-based replication and NodeA to NodeC is IP-based
replication.

3.) Support for both synchronous and asynchronous replication, and for
both physical and logical replication.

4.) The replication system must be robust, stable, and high-performance,
with low latency.

5.) OS support for at least Linux and FreeBSD, and possibly Solaris x86
(currently IB is supported on Linux and FreeBSD, but originally Solaris
had the best IB support).

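The two mixed-transport cases in item 2 can be sketched as a small data model. This is only an illustration of the proposal, not an existing PostgreSQL feature. The `Link` class, the `"ib"`/`"ip"` transport values, and the `synchronous` flag are hypothetical names, and marking the IB link as synchronous is an assumption based on the low-latency goal.

```python
from dataclasses import dataclass

# Hypothetical model of the proposed mixed-transport topology.
# None of these names are PostgreSQL settings.
VALID_TRANSPORTS = {"ib", "ip"}


@dataclass(frozen=True)
class Link:
    upstream: str
    downstream: str
    transport: str  # "ib" (native RDMA) or "ip" (existing TCP/IP streaming)
    synchronous: bool = False

    def __post_init__(self):
        # Reject anything other than the two transports named in the proposal.
        if self.transport not in VALID_TRANSPORTS:
            raise ValueError(f"unknown transport: {self.transport!r}")


def downstreams(links, node):
    """Return (downstream, transport) pairs fed directly by the given node."""
    return [(l.downstream, l.transport) for l in links if l.upstream == node]


# Case 1: cascade, NodeA --IB--> NodeB --IP--> NodeC
case1 = [
    Link("NodeA", "NodeB", "ib", synchronous=True),  # assumed synchronous
    Link("NodeB", "NodeC", "ip"),
]

# Case 2: fan-out, NodeA --IB--> NodeB and NodeA --IP--> NodeC
case2 = [
    Link("NodeA", "NodeB", "ib", synchronous=True),  # assumed synchronous
    Link("NodeA", "NodeC", "ip"),
]

if __name__ == "__main__":
    print(downstreams(case1, "NodeA"))
    print(downstreams(case2, "NodeA"))
```

In case 1 only NodeB receives data directly from NodeA, while in case 2 NodeA must serve one IB standby and one IP standby at the same time, which is why the transport would have to be selectable per link rather than per server.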






Re: [pgsql-cluster-hackers] Replication over RDMA with Infiniband or RoCE

From
Greg Sabino Mullane
Date:
> Yes, we are going to implement this over Postgres' streaming replication
> system.
> The use case is DR, e.g. synchronization between two datacenters in the
> same city that are connected by an FC link. (FC can now be used to extend
> an IB network, so we are going to implement native RDMA replication for
> low latency in that case. This is very important for some critical
> systems. We would also prefer this replication to be based on Postgres'
> streaming replication.)

...

Well, I think the starting point should be, why is streaming replication
not preferred? It's very fast and very robust.

> 2.) The IB-based replication will work alongside the current IP-based
> replication, meaning that one replication setup may contain both IB and
> IP links.
> Case 1 is a cascade: NodeA --IB--> NodeB --IP--> NodeC. NodeA to NodeB
> is IB-based replication, which provides low latency, and NodeB to NodeC
> is the original IP-based replication.
> Case 2 is a fan-out: NodeA --IB--> NodeB, NodeA --IP--> NodeC. NodeA to
> NodeB is IB-based replication and NodeA to NodeC is IP-based
> replication.
>
> 3.) Support for both synchronous and asynchronous replication, and for
> both physical and logical replication.

...

Honestly, that sounds fairly complex, for not a lot of gain, but I don't
know that I am completely understanding your proposal. This is a very
low-activity mailing list - what I would recommend at this point is writing
up a concise statement about why you need this and run it up the flagpole
to pgsql-hackers@postgresql.org. I promise you will get more of a response
there, for better or for worse. :)

--
Greg Sabino Mullane greg@endpoint.com
End Point Corporation
PGP Key: 2529 DF6A B8F7 9407 E944  45B4 BC9B 9067 1496 4AC8

Thanks Greg for the advice

Tony

