Re: Bi-Directional replication client awareness - Mailing list pgsql-general

From Craig Ringer
Subject Re: Bi-Directional replication client awareness
Date
Msg-id 53C0A446.4050705@2ndquadrant.com
In response to Bi-Directional replication client awareness  (Martin Gudmundsson <martingudmundsson@gmail.com>)
Responses Re: Bi-Directional replication client awareness  (Martin Gudmundsson <martingudmundsson@gmail.com>)
List pgsql-general
On 07/12/2014 02:42 AM, Martin Gudmundsson wrote:
> Hi all!
> I was wondering if there are any specific load balancing/failover functionality planned for client drivers connection
> to a BDR group. In my case the jdbc driver, but could be relevant for other drivers as well.

PgJDBC actually already supports rudimentary client-based failover.

It's not very smart - it doesn't maintain any persistent state, so if a
host just vanishes it'll take a long time to connect to the next one as
it has to wait for a TCP connection timeout. Making it stateful would be
interesting, and may be more useful now that there's more reason to
bother with client-side failover. Making it reliable in the face of
classloader isolation, load/unload, etc will be "interesting", but it's
a problem we eventually need to solve anyway if we're to support
asynchronous notification callbacks.
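
To illustrate what "stateful" could mean here, a minimal sketch in
Python follows. It is not PgJDBC's code: the HostTracker class, the
cooldown value and the opener callback are all hypothetical names made
up for this example. The idea is only to remember which hosts failed
recently, try the others first, and use a short connect timeout.

```python
import socket
import time


class HostTracker:
    """Remembers which hosts failed recently so they are tried last.

    Hypothetical helper for illustration; not part of any driver.
    """

    def __init__(self, hosts, cooldown=30.0, clock=time.monotonic):
        self.hosts = list(hosts)      # e.g. [("db1", 5432), ("db2", 5432)]
        self.cooldown = cooldown      # seconds a failed host stays "suspect"
        self.clock = clock
        self.failed_at = {}           # host -> time of its last failure

    def candidates(self):
        """Hosts in configured order, recently failed ones moved to the end."""
        now = self.clock()
        healthy = [h for h in self.hosts
                   if h not in self.failed_at
                   or now - self.failed_at[h] >= self.cooldown]
        suspect = [h for h in self.hosts if h not in healthy]
        return healthy + suspect

    def connect(self, opener, timeout=3.0):
        """Try each candidate in turn; return (host, connection)."""
        last_error = None
        for host in self.candidates():
            try:
                conn = opener(host, timeout)
            except OSError as exc:
                # Record the failure so later calls skip this host first.
                self.failed_at[host] = self.clock()
                last_error = exc
                continue
            self.failed_at.pop(host, None)
            return host, conn
        raise ConnectionError("no host reachable") from last_error


def tcp_opener(host, timeout):
    # Plain TCP connect with an explicit timeout; a real driver would
    # perform the PostgreSQL startup handshake on top of this socket.
    return socket.create_connection(host, timeout=timeout)
```

A caller would keep one HostTracker per process and call
tracker.connect(tcp_opener) for each new connection, so a host that
just vanished costs one short timeout rather than one per connection
attempt.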

AFAIK libpq doesn't have any kind of failover - but if we added
something similar, it'd be transparent to drivers like psycopg2, the Pg
gem, etc that use libpq. I'm not sure how we'd go about making it
stateful though - API changes would likely be needed for that bit.

psqlODBC would also need significant changes.


> Or is the long term plan that we need to rely on middleware like pgpool or other third party frameworks for
> this?

At this point PgBouncer will likely be the way to go if you want to try
to make it client transparent, but I don't think that's a good idea.

Because BDR is asynchronous multi-master _replication_ though, clients
are expected to be aware of some of the anomalies that can occur. A
naïve client that just picked a random BDR server and did the next
transaction on it would be very likely to cause unwanted replication
anomalies, apply conflicts, etc.

For example, if the client inserted a row on one server then tried to
immediately update it on another, the update would likely fail because
the row probably hasn't replicated yet.
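
The insert-then-update case can be reproduced with a toy in-memory
model. This is only a sketch in Python and has nothing to do with how
BDR is implemented: Node, outbox and replicate() are hypothetical names,
and the "update" here reports how many rows it matched instead of
raising an error.

```python
from collections import deque


class Node:
    """Toy stand-in for one server in an asynchronous multi-master group."""

    def __init__(self, name):
        self.name = name
        self.rows = {}          # primary key -> value
        self.outbox = deque()   # local changes not yet shipped to the peer

    def insert(self, key, value):
        self.rows[key] = value
        self.outbox.append(("insert", key, value))

    def update(self, key, value):
        """Return the number of rows matched, like an UPDATE row count."""
        if key not in self.rows:
            return 0            # the row has not arrived on this node yet
        self.rows[key] = value
        self.outbox.append(("update", key, value))
        return 1


def replicate(source, target):
    """Apply all pending changes from source on target (no conflict handling)."""
    while source.outbox:
        _, key, value = source.outbox.popleft()
        target.rows[key] = value
```

With a = Node("a") and b = Node("b"), calling a.insert(1, "new") and
then b.update(1, "changed") straight away matches 0 rows. After
replicate(a, b) the same update matches 1 row.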

It is generally a good idea to make clients "sticky" to a given server,
but clients also need to be aware of replication anomalies unless each
server's data is a self-contained shard that doesn't interact with the
others. In which case you probably wouldn't be using BDR.
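
One simple way to get stickiness, sketched in Python under the
assumption that each client has some stable identifier: hash the
identifier to pick a preferred node, and fall back to the remaining
nodes in a fixed order. preferred_order() and the client_id parameter
are hypothetical, not an existing driver feature.

```python
import hashlib


def preferred_order(client_id, nodes):
    """Return nodes rotated so the same client always starts at the same node."""
    # sha256 rather than hash(): Python's built-in string hash is
    # randomised per process, which would defeat stickiness.
    digest = hashlib.sha256(client_id.encode("utf-8")).digest()
    start = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[start:] + nodes[:start]
```

The first entry is the client's usual node; later entries are only used
if it is unreachable. This does nothing about replication anomalies
after a fallback, which the client still has to handle itself.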

> Having client awareness of the nodes ip, port numbers, status and current load, could probably bring powerful
> features to this as a whole. That is some how other database vendors take care of client failover and load balancing and
> has in my experience proved to work very well.

Yes, that'd certainly be interesting.

> And not only for BDR, but also streaming replication setups, could use this to do automatic client failovers, in the
> case there is a server side failover.
>
> Anyone who knows if there is anything in progress regarding this?

Other than the limited failover already in place for PgJDBC I'm not
aware of anything.

Work on its design, implementation and testing would be greatly
appreciated, so polish up your C skills.

--
 Craig Ringer                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

