Thread: Failover Datasource?

Failover Datasource?

From

"Tim H"

Date:

15 April 2008, 11:12:12

I'd like to create a pooling JDBC datasource that can handle failing
over to an alternate URL.

I've seen drivers from other vendors that allow you to pass in two
databases in the connection URL. I've scoured the mailing list here
and haven't found a thing.

Can someone point me in the right direction?

Thanks,
Tim

--
./tch

Re: Failover Datasource?

From

Bruce Adams

Date:

16 June 2011, 16:33:39

I, too, would like to be able to specify multiple Postgres servers in a
JDBC connection URL. I want the client application to prefer connecting
to a master database, but automatically fail over to a replica when the
master is unavailable.

Many other databases have this feature in their JDBC drivers. For
example, a MySQL JDBC URL can contain a comma-separated list of
host:port pairs, like this:
           jdbc:mysql://master:3306,slave:3306/databasename

Is there some other way to set up client failover?

Writings I've found for Postgres always talk about a proxy, or even
multiple proxies (!), between the Java client and the real database
servers. I'm trying to build a robust system; any additional layer is
yet another thing that can break. I don't mind having the application
see errors in a failover, I just want some measure of recovery to be
automatic, even if the automatic recovery is to a read-only replica.

I plan to use streaming replication in PostgreSQL 9.0 for the hot backup(s).

- Bruce


Re: Failover Datasource?

From

Thomas Kellerer

Date:

16 June 2011, 16:57:00

Bruce Adams wrote on 16.06.2011 21:33:
> I, too, would like to be able to specify multiple Postgres servers in
> a JDBC connection URL. I want the client application to prefer
> connecting to a master database, but automatically failover to a
> replica when the master is unavailable.
>
> Many other databases have this feature in their JDBC drivers. For
> example a MySQL JDBC URL can have a comma separated list of host:port
> in the URL, like this:
> jdbc:mysql://master:3306,slave:3306/databasename
>
> Is there some other way to setup client failover?
>
> Writings I've found for Postgres always talk about a proxy, or even
> multiple proxies (!), between the Java client and the real database
> servers. I'm trying to build a robust system; any additional layer is
> yet another thing that can break. I don't mind having the application
> see errors in a failover, I just want some measure of recovery to be
> automatic, even if the automatic recovery is to a read-only replica.
>
> I plan to use streaming replication in PostgreSQL 9.0 for the hot
> backup(s).

pgBouncer or pgPool can both do that as far as I know

http://wiki.postgresql.org/wiki/PgBouncer
http://pgpool.projects.postgresql.org/

Thomas

Re: Failover Datasource?

From

Bruce Adams

Date:

16 June 2011, 17:27:20

Both pgBouncer and pgPool add another process between the client and the
server.

Their primary goal appears to be connection pooling. My Java application
server (Apache Tomcat) is already pooling database connections.

Adding another chunk of middleware adds another thing I have to worry
about failing and is likely to slow things down.

If I write code to add failover support into the JDBC driver, what are
my chances of getting that patch into the next release?

- Bruce


Re: Failover Datasource?

From

Thomas Kellerer

Date:

16 June 2011, 17:35:34

Bruce Adams wrote on 16.06.2011 22:27:
> Both pgBouncer and pgPool add another process between the client and
> the server.
>
> Their primary goal appears to be connection pooling. My Java
> application server (Apache Tomcat) is already pooling database
> connections.

Especially pgPool (despite the name) is more than "just" a connection pooler.

Thomas

Re: Failover Datasource?

From

John R Pierce

Date:

16 June 2011, 20:51:53

On 06/16/11 1:27 PM, Bruce Adams wrote:
> Their primary goal appears to be connection pooling. My Java
> application server (Apache Tomcat) is already pooling database
> connections.

I think JDBC is the wrong layer for this. Instead, it should be
implemented in your Java connection pool (Tomcat, etc.), where you
configure it with the multiple connections and the connection pooling
rules (master/failover vs. round robin vs. whatever).

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast

Re: Failover Datasource?

From

Bruce Adams

Date:

16 June 2011, 22:55:24

In principle, I agree; in practice, that's not the way it's been done in
the Java application server world.

I have two Java database connection pool implementations readily
available: the one bundled with Apache Tomcat and
Hibernate's c3p0. Neither of these directly supports failover. They each
expect the lower-level JDBC driver to deal with failover. (This is true
of BEA WebLogic and IBM WebSphere as well, at least as of a few years
ago when I last used them intensely.)

What I'm looking for is very standard stuff in the Java application
server world. The JDBC driver handles failover and/or load balancing to
multiple backend database servers.

- Bruce


Re: Failover Datasource?

From

John R Pierce

Date:

16 June 2011, 23:02:51

On 06/16/11 6:55 PM, Bruce Adams wrote:
> In principal, I agree; in practice, that's not the way it's been done
> in the Java application server world.
>
> I have two readily available Java database connection pool
> implementations available: the one bundled with Apache Tomcat and
> Hibernate's c3p0. Neither of these directly support failover. They
> each expect the lower level JDBC driver to deal with failover. (This
> is true of BEA WebLogic and IBM WebSphere as well, at least as of a
> few years ago when I last used them intensely.)
>
> What I'm looking for is very standard stuff in the Java application
> server world. The JDBC driver handles failover and/or load balancing
> to multiple backend database servers.

It just seems to me that the individual client drivers shouldn't be what
is tracking the state of the server cluster. I can't imagine the
driver layer could do more than 'try connecting to server 1, if that
fails, try server 2'... If server 1 is dead and not responding, this is
going to be painfully slow and result in minute-long TCP connection
timeouts on each connect.
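
A minimal sketch of the 'try server 1, then server 2' loop described above, with an explicit connect timeout so that a dead host costs a bounded wait rather than the operating system's default TCP timeout. The class name FirstReachableHost, the host names, the port and the 2000 ms value mentioned in the comment are illustrative assumptions, not anything a driver is known to do. The probe is passed in as a function so the selection logic can be exercised without a network.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

final class FirstReachableHost {

    /** Returns the first candidate the probe accepts, trying them in order. */
    static Optional<InetSocketAddress> firstReachable(
            List<InetSocketAddress> candidates, Predicate<InetSocketAddress> probe) {
        for (InetSocketAddress candidate : candidates) {
            if (probe.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /** TCP probe with an explicit connect timeout instead of the OS default. */
    static boolean tcpProbe(InetSocketAddress address, int timeoutMillis) {
        // Re-create the address so that an unresolved host name is resolved here.
        InetSocketAddress resolved =
                new InetSocketAddress(address.getHostString(), address.getPort());
        try (Socket socket = new Socket()) {
            socket.connect(resolved, timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, unreachable, unresolvable, or timed out
        }
    }

    public static void main(String[] args) {
        List<InetSocketAddress> hosts = Arrays.asList(
                InetSocketAddress.createUnresolved("server1", 5432),
                InetSocketAddress.createUnresolved("server2", 5432));
        // Simulated probe for the demo: pretend server1 is down.
        // Real code would pass: a -> tcpProbe(a, 2000)
        Predicate<InetSocketAddress> simulated = a -> !a.getHostString().equals("server1");
        System.out.println(firstReachable(hosts, simulated)
                .map(InetSocketAddress::getHostString)
                .orElse("no host reachable"));
    }
}
```

Even with a short timeout, every new connection still pays for the dead host first, which is the cost being pointed out here.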

Do Oracle JDBC drivers support this multiple-server notation, same as
shown here earlier for MySQL?

--
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast

Re: Failover Datasource?

From

Bruce Adams

Date:

16 June 2011, 23:20:26

Yes, Oracle's JDBC drivers support insanely long JDBC URLs which allow
specifying a whole cluster of database servers, what the failover
policies are, including various timeouts, load balance weights and more.
A single JDBC URL for Oracle can be a thousand characters or more. I
suspect the other database vendors (I specifically know about IBM DB2
and MySQL) have followed Oracle's lead here.

This page has a simple example:
http://programmersjournal.blogspot.com/2008/08/jdbc-connection-string-for-oracle-rac.html

Re: Failover Datasource?

From

tsaixingwei

Date:

10 April 2012, 12:06:13

Bruce,
I was looking for the same thing you are in the PostgreSQL JDBC driver.
Judging from the silence on this thread since your last post, I'm guessing
this feature won't be in the PostgreSQL JDBC driver anytime soon.

I'm trying out something else at the moment - HA-JDBC - which is multi-master
replication instead of master-slave replication and it is implemented in a
JDBC driver itself. See this: http://ha-jdbc.sourceforge.net/



--
View this message in context: http://postgresql.1045698.n5.nabble.com/Re-Failover-Datasource-tp4496411p5629626.html
Sent from the PostgreSQL - jdbc mailing list archive at Nabble.com.

Re: Failover Datasource?

From

kaprikorn07

Date:

12 December 2012, 09:09:09

Hi All,

As Bruce Adams has mentioned, please let me know if there is any other way
to do it other than pgBouncer and pgPool.

Please help!




Re: Failover Datasource?

From

Mikko Tiihonen

Date:

12 December 2012, 10:41:11

With the latest JDBC drivers you can simply configure multiple host:port pairs in the URL, separated by commas:

jdbc:postgresql://host1:port1,host2:port2/test

It has only the most basic failover support. When a new connection is opened, the hosts are tried in round-robin until a
connection is successfully established.
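
A minimal sketch of how an application might use this notation, assuming a PostgreSQL JDBC driver with multi-host support is on the classpath. The class name MultiHostUrlExample, the host names, port 5432, the database name and the credentials are placeholders that do not come from this thread. DriverManager.setLoginTimeout is standard JDBC; whether a given driver honors it is driver-specific.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.util.Properties;

final class MultiHostUrlExample {

    /** Joins host:port pairs with commas, the notation described above. */
    static String buildUrl(String[] hostPorts, String database) {
        return "jdbc:postgresql://" + String.join(",", hostPorts) + "/" + database;
    }

    /** Opens a connection; the driver is expected to try the listed hosts in turn. */
    static Connection connect(String url, String user, String password) throws SQLException {
        Properties props = new Properties();
        props.setProperty("user", user);
        props.setProperty("password", password);
        // Standard JDBC setting: upper bound, in seconds, on a login attempt.
        DriverManager.setLoginTimeout(10);
        return DriverManager.getConnection(url, props);
    }

    public static void main(String[] args) {
        // Only builds and prints the URL; connect() needs a driver and a server.
        String url = buildUrl(new String[] {"host1:5432", "host2:5432"}, "test");
        System.out.println(url);
    }
}
```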

-Mikko

--
Sent via pgsql-jdbc mailing list (pgsql-jdbc@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-jdbc

Re: Failover Datasource?

From

Dave Cramer

Date:

12 December 2012, 10:50:04

Mikko,

I can't recall: did you ever provide a documentation patch for this?

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


Re: Failover Datasource?

From

Mikko Tiihonen

Date:

12 December 2012, 15:06:17

On 12/12/2012 12:49 PM, Dave Cramer wrote:
> Mikko,
>
> I can't recall did you ever provide a documentation patch for this ?

I'm not sure, but I still found the patch that I was supposed to send... attached.

-Mikko

Attachment

connection-failover-documentation.patch

Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

13 December 2012, 10:43:41

Hi All,

In the latest JDBC driver, multiple backends can be specified in the URL as follows.

 >With latest jdbc drivers you can simply configure multiple host:port pairs in the url separated by comma:
 >jdbc:postgresql://host1:port1,host2:port2/test
 >It has only the most basic failover support. When a new connection is opened the hosts are tried in round-robin until a connection is successfully established.

But there is a performance problem. If the first host is down,
every connection attempt blocks until the connect timeout for that host expires,
and only then tries the next host.

Why not adjust the order of the hosts dynamically?
For example, after a successful connection, if the target host is not the first host,
swap the target host with the first host.
Subsequent connection attempts will then try the most suitable host first.
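
The swap described above could be sketched roughly as follows. This is an illustration only, not code from the driver or from any patch: the class name SwapToFrontHosts and the use of plain host:port strings are assumptions made to keep the example self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class SwapToFrontHosts {
    private final List<String> hosts; // guarded by this

    SwapToFrontHosts(List<String> initialOrder) {
        this.hosts = new ArrayList<>(initialOrder);
    }

    /** Snapshot of the current order in which hosts should be tried. */
    synchronized List<String> candidates() {
        return new ArrayList<>(hosts);
    }

    /** After a successful connect, swap that host with the first one. */
    synchronized void reportSuccess(String host) {
        int index = hosts.indexOf(host);
        if (index > 0) { // unknown (-1) or already first (0): nothing to do
            hosts.set(index, hosts.get(0));
            hosts.set(0, host);
        }
    }

    public static void main(String[] args) {
        SwapToFrontHosts order =
                new SwapToFrontHosts(Arrays.asList("host1:5432", "host2:5432"));
        order.reportSuccess("host2:5432"); // host1 was down, host2 answered
        System.out.println(order.candidates()); // [host2:5432, host1:5432]
    }
}
```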

--
Best Regards，
Chen Huajun
Re: performance problem of Failover Datasource?

From

Dave Cramer

Date:

13 December 2012, 21:02:19

Feel free to send us a patch.

Dave

Dave Cramer

dave.cramer(at)credativ(dot)ca
http://www.credativ.ca


Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

14 December 2012, 02:58:58

 > Feel free to send us a patch.

OK, I will make the patch soon.

--
Best Regards,
Chen Huajun


Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

14 December 2012, 13:31:15

Hi

I have made the patch; please check it.

I have also run the test cases; the results are the same as before the change.


--
Best Regards,
Chen Huajun

Attachment

pgjdbc_optimization_for_mutiServerUrl.patch

Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

15 December 2012, 07:12:21

Hi

In this patch, I use Collections.synchronizedSet to synchronize between threads.
But I worry that Collections.synchronizedSet takes the lock rather frequently,
which may affect performance.
I think using the "synchronized" keyword explicitly instead of Collections.synchronizedSet
may reduce the number of lock acquisitions. Is there a better suggestion?
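
To make the trade-off concrete: a set wrapped by Collections.synchronizedSet takes its lock on every individual call, so filtering a host list against it locks once per host. A rough sketch of the explicit alternative, which takes one lock for the whole step, is below. The class DeadHostRegistry and its methods are hypothetical and are not taken from the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class DeadHostRegistry {
    private final Object lock = new Object();
    private final Set<String> deadHosts = new HashSet<>(); // guarded by lock

    /** Marks a host dead; returns true only if its status changed. */
    boolean markDead(String host) {
        synchronized (lock) {
            return deadHosts.add(host);
        }
    }

    /** Marks a host alive again; returns true only if its status changed. */
    boolean markAlive(String host) {
        synchronized (lock) {
            return deadHosts.remove(host);
        }
    }

    /** Filters the list under a single lock acquisition, not one per host. */
    List<String> liveHosts(List<String> all) {
        synchronized (lock) {
            List<String> live = new ArrayList<>();
            for (String host : all) {
                if (!deadHosts.contains(host)) {
                    live.add(host);
                }
            }
            return live;
        }
    }

    public static void main(String[] args) {
        DeadHostRegistry registry = new DeadHostRegistry();
        registry.markDead("host1:5432");
        System.out.println(registry.liveHosts(Arrays.asList("host1:5432", "host2:5432")));
    }
}
```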

In addition, I have an idea.
By adjusting the order of the hosts, we can also implement simple load balancing
when all of the hosts are masters or read-only slaves.
For example:
Basically, pick a server at random. If a server is dead, remove it from the candidates
and retry with the next server.
After a while (configurable), re-add the dead host to the candidates, because the dead
server may have been repaired.

What about that?
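
A rough sketch of that idea, for discussion only. The class name QuarantiningHostPicker, the recheckMillis parameter and the use of host:port strings are assumptions. The current time is passed in explicitly so the behaviour can be tested without waiting; real code would pass System.currentTimeMillis().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Random;

final class QuarantiningHostPicker {
    private final List<String> hosts;
    private final long recheckMillis;
    private final Random random;
    private final Map<String, Long> deadSince = new HashMap<>(); // guarded by this

    QuarantiningHostPicker(List<String> hosts, long recheckMillis, Random random) {
        this.hosts = new ArrayList<>(hosts);
        this.recheckMillis = recheckMillis;
        this.random = random;
    }

    /** Records a failed connection attempt made at time nowMillis. */
    synchronized void reportFailure(String host, long nowMillis) {
        deadSince.put(host, nowMillis);
    }

    /** Clears the failure record after a successful connection. */
    synchronized void reportSuccess(String host) {
        deadSince.remove(host);
    }

    /** Shuffled candidates, skipping hosts that failed less than recheckMillis ago. */
    synchronized List<String> candidates(long nowMillis) {
        List<String> result = new ArrayList<>();
        for (String host : hosts) {
            Long failedAt = deadSince.get(host);
            if (failedAt == null || nowMillis - failedAt >= recheckMillis) {
                result.add(host);
            }
        }
        Collections.shuffle(result, random);
        return result;
    }

    public static void main(String[] args) {
        QuarantiningHostPicker picker = new QuarantiningHostPicker(
                Arrays.asList("host1:5432", "host2:5432", "host3:5432"), 600_000L, new Random());
        long now = System.currentTimeMillis();
        picker.reportFailure("host2:5432", now);
        System.out.println(picker.candidates(now)); // host2 is skipped for now
    }
}
```

If every host has failed recently, candidates() returns an empty list, so a caller would need some fallback, for example trying all hosts anyway.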


--
Best Regards,
Chen Huajun

Re: performance problem of Failover Datasource?

From

Scott Harrington

Date:

15 December 2012, 16:19:12

On Sat, 15 Dec 2012, Chen Huajun wrote:

> In this patch,I use Collections.synchronizedSet to synchronize within
> multi-threads. But i worry about locking operation is a litter frequent
> by Collections.synchronizedSet and may affect performance. I think using
> keword "synchronized" explicitly instead of Collections.synchronizedSet
> may reduce times of locking.Is there any better suggestion?
>
> In addition, I have a idea. By adjusting the order of hosts we also can
> implement a simple load balance while all of the hosts are master or
> read only slave. For example: Basically pick up the server randomly.If
> one server had dead remove it from the candidates, and retry the next
> server. And after a while(can be configured) re-add the dead host to the
> candidates because the dead server may had been repaired.
>
> What about that?

Perhaps instead you could abstract this logic into a
org.postgresql.util.HostChooser interface, which would replace the
HostSpec[] array that currently gets passed around.

Something like this, which doesn't require synchronization but rather
stores a single volatile index of the "last known good" server address
(warning: coding off the top of my head).

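// Sketch only: each public type below would live in its own file.
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;

public interface HostChooser extends Iterable<HostSpec> {
   // Hosts in the order in which connection attempts should be made.
   Iterator<HostSpec> iterator();
   // Called after a connection to hostSpec has been established.
   void reportSuccessfulConnection(HostSpec hostSpec);
}

public class SingleHostChooser implements HostChooser
{
   private final HostSpec hostSpec;
   public SingleHostChooser(HostSpec hostSpec) { this.hostSpec = hostSpec; }
   public Iterator<HostSpec> iterator() { return Collections.singletonList(hostSpec).iterator(); }
   public void reportSuccessfulConnection(HostSpec ignored) {}
}

public class LastKnownGoodHostChooser implements HostChooser
{
   private final HostSpec[] hostSpecs;
   // Index of the host that last accepted a connection; -1 if none yet.
   private volatile int lastKnownGood = -1;
   public LastKnownGoodHostChooser(HostSpec[] hostSpecs) { this.hostSpecs = hostSpecs.clone(); }

   public Iterator<HostSpec> iterator() {
     // Read the volatile field once so the loops below see a consistent value.
     int first = lastKnownGood;
     if (first <= 0) {
       // Unknown (-1) or already first (0): keep the configured order.
       return Arrays.asList(hostSpecs).iterator();
     }
     // Rotate the array so that the last known good host is tried first.
     ArrayList<HostSpec> reordered = new ArrayList<HostSpec>(hostSpecs.length);
     for (int ii = first; ii < hostSpecs.length; ++ii) {
       reordered.add(hostSpecs[ii]);
     }
     for (int ii = 0; ii < first; ++ii) {
       reordered.add(hostSpecs[ii]);
     }
     return reordered.iterator();
   }

   public void reportSuccessfulConnection(HostSpec hostSpec) {
     // Relies on HostSpec.equals(); yields -1 if the host is not in the list.
     lastKnownGood = Arrays.asList(hostSpecs).indexOf(hostSpec);
   }
}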

Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

18 December 2012, 00:07:08

Thanks for your advice.
I will try to make a new patch and add load balancing support.


--
Best Regards，
Chen Huajun


Re: performance problem of Failover Datasource?

From

Craig Ringer

Date:

18 December 2012, 00:29:46

On 16/12/2012 12:19 AM, Scott Harrington wrote:
> On Sat, 15 Dec 2012, Chen Huajun wrote:
>
>> In this patch,I use Collections.synchronizedSet to synchronize within
>> multi-threads. But i worry about locking operation is a litter
>> frequent by Collections.synchronizedSet and may affect performance. I
>> think using keword "synchronized" explicitly instead of
>> Collections.synchronizedSet may reduce times of locking.Is there any
>> better suggestion?
>>
>> In addition, I have a idea. By adjusting the order of hosts we also
>> can implement a simple load balance while all of the hosts are master
>> or read only slave. For example: Basically pick up the server
>> randomly.If one server had dead remove it from the candidates, and
>> retry the next server. And after a while(can be configured) re-add
>> the dead host to the candidates because the dead server may had been
>> repaired.
>>
>> What about that?
>
> Perhaps instead you could abstract this logic into a
> org.postgresql.util.HostChooser interface, which would replace the
> HostSpec[] array that currently gets passed around.
There's certainly value to that. We're almost certainly going to want to
support things like connecting to any slave and asking it "which server
is the master I should connect to" in future, so a way to abstract host
selection would be a big win.

--
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

24 December 2012, 10:27:57

Hi

I have made a new patch (with my tests). Please take a look.
It supports the following features:
1) improved failover performance by avoiding dead hosts.
2) simple load balancing by picking the first host at random from the valid hosts.
3) the ability to choose whether to connect to a master or a slave.

The patch adds three connection parameters.

targetServerType = String
Specifies what kind of server to connect to. The value should be one of the following:
  any
  master
  slave
  slavefirst (try connecting to the slaves first; if that fails, try the master)
The default is 'any'.

enableLoadBalance = boolean
Enables or disables load balancing when multiple hosts are specified. If load balancing is enabled, the specified hosts
are picked at random.
The default is false.

failedHostCheckPeriod = int
Specifies the period (in seconds) after which failed hosts are checked to see whether they have been repaired, when
load balancing is enabled; 0 means never check.
The default is 600 seconds.
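
For illustration, a connection URL using these three parameters might be assembled as sketched below. The parameter names are the ones proposed in this patch; this thread does not establish that any released driver accepts them under these names. The class name ProposedFailoverUrl, the hosts and the values are placeholders, and the values are not URL-encoded because they contain no special characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class ProposedFailoverUrl {

    /** Appends the parameters as a query string, in insertion order. */
    static String buildUrl(String hosts, String database, Map<String, String> params) {
        StringBuilder url = new StringBuilder("jdbc:postgresql://")
                .append(hosts).append('/').append(database);
        char separator = '?';
        for (Map.Entry<String, String> entry : params.entrySet()) {
            url.append(separator).append(entry.getKey()).append('=').append(entry.getValue());
            separator = '&';
        }
        return url.toString();
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("targetServerType", "slavefirst");   // name proposed in the patch
        params.put("enableLoadBalance", "true");        // name proposed in the patch
        params.put("failedHostCheckPeriod", "60");      // seconds; name proposed in the patch
        System.out.println(buildUrl("host1:5432,host2:5432", "test", params));
    }
}
```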


--
Best Regards,
Chen Huajun

Attachment

Re: performance problem of Failover Datasource?

From

Scott Harrington

Date:

24 December 2012, 20:10:59

Hmm, there's some neat stuff in there, slave-only, slavefirst, etc.

But could you (and perhaps Mikko Tiihonen who originally proposed the
"Simple connection failover support") remind the rest of us why we want
this complexity inside the pgjdbc driver, rather than in a more robust and
featureful layer like pgpool-II?

At first glance, there are a couple of issues:

1. Double-Checked Locking in reportHostStatus, which is bad form

2. Synchronized code in a subclass that locks the base class

3. No need for 'volatile' if you're also using 'synchronized'

Taking a step back, it seems you have implemented a DNS-like static
(JVM-global) helper which performs lazy-caching of information about
servers. I would argue PGJDBC itself should only do simple single-host
connections, but perhaps provide a well-documented HostChooser interface
and a JVM-global (static) method such as Driver.setHostChooser(), similar
to Driver.setLogLevel(), so that applications that need to override the
default "DNS lookups" (or "host choosing") may do so.

Applications that want the load balancing would use something like
"host=myvirtualpool" which would obviously fail unless you've
installed some sort of LoadBalanceHostChooser, which knows about all the
"realservers" that comprise the "myvirtualpool" and their
master/slave/OK/dead status. (Starts to sound more and more like pgpool-II
or other projects that already exist for this).
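
The HostChooser idea could be sketched roughly as follows. Every name in this block is hypothetical: pgjdbc defines no such interface, and HostChooserRegistry only stands in for the Driver.setHostChooser() hook suggested above. The sketch resolves a logical host name such as "myvirtualpool" to the real servers behind it.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical interface: maps a logical host name to the real
// host:port candidates to try, in order.
interface HostChooser {
    List<String> chooseHosts(String logicalHost);
}

// Default behaviour: the logical name is the real host (plain single-host connection).
class IdentityHostChooser implements HostChooser {
    public List<String> chooseHosts(String logicalHost) {
        return Arrays.asList(logicalHost);
    }
}

// Knows which real servers make up each virtual pool.
// Not thread-safe while being configured: add all pools before installing it.
class PoolHostChooser implements HostChooser {
    private final Map<String, List<String>> pools = new LinkedHashMap<>();

    void addPool(String name, String... realServers) {
        pools.put(name, Arrays.asList(realServers));
    }

    public List<String> chooseHosts(String logicalHost) {
        List<String> servers = pools.get(logicalHost);
        // Unknown names fall back to being treated as a real host.
        return servers != null ? servers : Arrays.asList(logicalHost);
    }
}

// Stand-in for the JVM-global hook; 'volatile' publishes the chooser to all threads.
class HostChooserRegistry {
    private static volatile HostChooser chooser = new IdentityHostChooser();

    static void setHostChooser(HostChooser c) { chooser = c; }

    static HostChooser getHostChooser() { return chooser; }
}
```

An application would install a configured PoolHostChooser once at startup, and the connection code would ask the registry for candidates before opening a socket.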

Side benefit is your LoadBalanceHostChooser could be designed to do
"eager" connection probing on worker threads so that when an application
thread needs a PGJDBC connection, you would avoid any of the slow
connection / dead server issues you were originally trying to solve.
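
A rough sketch of such "eager" probing, under stated assumptions: the EagerHostProber class and the probe callback are hypothetical, and a real probe would attempt a short-timeout connection to the host. Application threads only read the cached result and never wait on a dead server.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Predicate;

// Hypothetical class: probes hosts in the background and caches the result.
class EagerHostProber {
    private final List<String> hosts;
    private final Predicate<String> probe; // returns true if the host answers
    private final Map<String, Boolean> alive = new ConcurrentHashMap<>();

    EagerHostProber(List<String> hosts, Predicate<String> probe) {
        this.hosts = hosts;
        this.probe = probe;
    }

    // One probing pass; normally run on the worker thread.
    void probeOnce() {
        for (String host : hosts) {
            boolean ok;
            try {
                ok = probe.test(host);
            } catch (RuntimeException e) {
                ok = false; // a failing probe counts as a dead host
            }
            alive.put(host, ok);
        }
    }

    // Schedules repeated passes on a daemon thread; periodSeconds must be positive.
    ScheduledExecutorService start(long periodSeconds) {
        ScheduledExecutorService ses = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "host-prober");
            t.setDaemon(true);
            return t;
        });
        ses.scheduleWithFixedDelay(this::probeOnce, 0, periodSeconds, TimeUnit.SECONDS);
        return ses;
    }

    // Application threads read the cached answer; hosts not yet probed count as dead.
    boolean isAlive(String host) {
        return alive.getOrDefault(host, Boolean.FALSE);
    }
}
```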


Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

25 December 2012, 13:47:36

 > But could you (and perhaps Mikko Tiihonen who originally proposed the "Simple connection failover support")
 > remind the rest of us why we want this complexity inside the pgjdbc driver, rather than in a
 > more robust and featureful layer like pgpool-II?

I have seen some mails discussing this subject, but I have not read all of them.
I think performance and simplicity are the reasons. Is that right?

 > Taking a step back, it seems you have implemented a DNS-like static (JVM-global) helper which performs
 > lazy-caching of information about servers. I would argue PGJDBC itself should only do simple
 > single-host connections, but perhaps provide a well-documented HostChooser interface and a JVM-global
 > (static) method such as Driver.setHostChooser(), similar to Driver.setLogLevel(), so that
 > applications that need to override the default "DNS lookups" (or "host choosing") may do so.

I think the question is whether some simple HostChoosers would be useful.
If so, and if they are not too complex for a JDBC driver, why not provide them?
By the way, if the HostChooser implementation needs to be selectable, a JVM-global property (such as
-Dxx.xx.HostChooser=XXXXHostChooser)
or a connection parameter may be more suitable, because either one avoids modifying the application's source
code when the HostChooser implementation changes.
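
As a sketch only, choosing the implementation through a JVM-global property could look like this. The interface and class names are hypothetical stand-ins for the proposed HostChooser, and "xx.xx.HostChooser" is the placeholder property name from the example above, not a real setting.

```java
// Hypothetical stand-in for the proposed HostChooser interface.
interface LoadableHostChooser {
    String choose(String[] candidates);
}

// Trivial default implementation: always take the first candidate.
class FirstHostChooser implements LoadableHostChooser {
    public String choose(String[] candidates) {
        return candidates[0];
    }
}

class HostChooserLoader {
    // Placeholder property name, as in "-Dxx.xx.HostChooser=XXXXHostChooser".
    static final String PROPERTY = "xx.xx.HostChooser";

    // Instantiates the class named by the system property, or the default if unset.
    static LoadableHostChooser load() throws ReflectiveOperationException {
        String className = System.getProperty(PROPERTY);
        if (className == null || className.isEmpty()) {
            return new FirstHostChooser();
        }
        Class<?> cls = Class.forName(className);
        // asSubclass fails fast with ClassCastException if the class has the wrong type.
        return cls.asSubclass(LoadableHostChooser.class)
                  .getDeclaredConstructor()
                  .newInstance();
    }
}
```

The application source never names the implementation class, so switching it only requires a different -D option at JVM startup.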

 > Applications that want the load balancing would use something like "host=myvirtualpool" which would
 > obviously fail unless you've installed some sort of LoadBalanceHostChooser, which knows
 > about all the "realservers" that comprise the "myvirtualpool" and their master/slave/OK/dead status.
 > (Starts to sound more and more like pgpool-II or other projects that already exist for this).

Indeed, I know of some products that work just like "host=myvirtualpool".
That form is sometimes more convenient and easier to customize;
the drawback, as you said, may be greater complexity.



Now, could you explain the following issues in detail?

 > At first glance, there are a couple of issues:
 >
 > 1. Double-Checked Locking in reportHostStatus, which is bad form

I know of a problem with double-checked locking when it is used in the singleton pattern, as follows.

    public static Singleton getInstance() {
        if (instance == null) {
            synchronized (Singleton.class) {
                if (instance == null) {
                    instance = new Singleton();
                }
            }
        }
        return instance;
    }

because the JVM may execute "instance = new Singleton();" in this order, publishing the reference before the
constructor has run:

mem = allocate();
instance = mem;
ctorSingleton(instance);

Do you think my code has the same problem, or does it just look ugly?

 > 2. Synchronized code in a subclass that locks the base class

The synchronized code protects hostStatusCache, which is shared by all subclasses,
so it locks the base class. Is there any problem with that?


 > 3. No need for 'volatile' if you're also using 'synchronized'

'synchronized' is used only for writes; 'volatile' is for reads.
This is for performance and is a bit complex. (It may be over-engineered.)
I am worried that the JVM will optimize the following code

   HashMap<String, HostStatus> newHostStatusMap = (HashMap<String, HostStatus>) hostStatusCache.clone();
   newHostStatusMap.put(hostSpecKey, hostStatus);
   hostStatusCache = newHostStatusMap;
into:
   hostStatusCache = (HashMap<String, HostStatus>) hostStatusCache.clone();
   hostStatusCache.put(hostSpecKey, hostStatus);

Do you know whether that can happen?



--
Best Regards,
Chen Huajun

Re: performance problem of Failover Datasource?

From

Florent Guillaume

Date:

26 December 2012, 11:17:22

On Tue, Dec 25, 2012 at 2:45 PM, Chen Huajun <chenhj@cn.fujitsu.com> wrote:
> Now, could you explain the following issues in detail?
>> At first glance, there are a couple of issues:
>>
>> 1. Double-Checked Locking in reportHostStatus, which is bad form
>
> I know of a problem with double-checked locking when it is used in the
> singleton pattern, as follows.
>
>     public static Singleton getInstance() {
>         if (instance == null) {
>             synchronized (Singleton.class) {
>                 if (instance == null) {
>                     instance = new Singleton();
>                 }
>             }
>         }
>         return instance;
>     }
>
> because the JVM may execute "instance = new Singleton();" in this order,
> publishing the reference before the constructor has run:
>
> mem = allocate();
> instance = mem;
> ctorSingleton(instance);
>
> Do you think my code has the same problem, or does it just look ugly?

Double-checked locking generally is incorrect, and does not work. It
ONLY works if you're double-checking a volatile variable and using
Java >= 5.
Please read http://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedLocking.html
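
For reference, a minimal sketch of the form that does work on Java 5 and later, together with the simpler holder idiom. The class names are illustrative and are not taken from the patch.

```java
// Double-checked locking on a volatile field (safe on Java 5 and later).
class VolatileSingleton {
    // 'volatile' is what makes this safe: the write to 'instance'
    // happens-before any later read that sees the non-null value.
    private static volatile VolatileSingleton instance;

    private VolatileSingleton() { }

    static VolatileSingleton getInstance() {
        VolatileSingleton local = instance; // single volatile read on the fast path
        if (local == null) {
            synchronized (VolatileSingleton.class) {
                local = instance;
                if (local == null) {
                    local = new VolatileSingleton();
                    instance = local;
                }
            }
        }
        return local;
    }
}

// Simpler alternative: class initialization is lazy and thread-safe by specification.
class HolderSingleton {
    private HolderSingleton() { }

    private static class Holder {
        static final HolderSingleton INSTANCE = new HolderSingleton();
    }

    static HolderSingleton getInstance() {
        return Holder.INSTANCE;
    }
}
```

The holder idiom needs no explicit locking at all, which is why it is usually preferred when only lazy initialization of a static is wanted.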


>> 3. No need for 'volatile' if you're also using 'synchronized'
>
> 'synchronized' is used only for writes; 'volatile' is for reads.

This sentence shows you're confused about synchronization and
multi-threading in Java.

> It's for performance and is a bit complex. (It may be over-engineered.)
> I am worried that the JVM will optimize the following code
>
>   HashMap<String, HostStatus> newHostStatusMap = (HashMap<String, HostStatus>) hostStatusCache.clone();
>   newHostStatusMap.put(hostSpecKey, hostStatus);
>   hostStatusCache = newHostStatusMap;
> into:
>   hostStatusCache = (HashMap<String, HostStatus>) hostStatusCache.clone();
>   hostStatusCache.put(hostSpecKey, hostStatus);
>
> Do you know whether that can happen?

Of course it can. The JVM is free to reorder a lot of things while
respecting the Java Memory Model.

All this means you shouldn't try to play games with the JVM, just use
a basic lock or synchronization primitive where you need it.
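
A minimal sketch of what "a basic lock" could look like for the host status cache. The HostStatus values and the HostStatusCache class are assumptions made for illustration; the patch's real types are not shown in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical status values; the patch's real HostStatus type may differ.
enum HostStatus { OK, DEAD }

// Every access goes through the same lock, so no 'volatile' field and no
// copy-on-write cloning is needed for correctness.
class HostStatusCache {
    private final Map<String, HostStatus> statuses = new HashMap<>();

    synchronized void report(String hostSpecKey, HostStatus status) {
        statuses.put(hostSpecKey, status);
    }

    // Returns null when nothing has been reported for the host yet.
    synchronized HostStatus get(String hostSpecKey) {
        return statuses.get(hostSpecKey);
    }
}
```

If lock contention on reads ever turned out to matter, java.util.concurrent.ConcurrentHashMap would be a drop-in alternative that still avoids hand-rolled memory-model tricks.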

Florent

--
Florent Guillaume, Director of R&D, Nuxeo
Open Source, Java EE based, Enterprise Content Management (ECM)
http://www.nuxeo.com   http://www.nuxeo.org   +33 1 40 33 79 87

Re: performance problem of Failover Datasource?

From

Chen Huajun

Date:

26 December 2012, 11:46:51

I got it, thanks!


--
Best Regards,
Chen Huajun