Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur |
Date | |
Msg-id | CA+TgmoaPNOqtwOmXF-dNSBLvTBBdMouycKb2UxiJRRQu3134=g@mail.gmail.com |
In response to | Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur ("Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com>) |
Responses | Re: [HACKERS] [bug fix] PG10: libpq doesn't connect to alternative hosts when some errors occur |
List | pgsql-hackers |
On Sun, May 14, 2017 at 9:50 PM, Tsunakawa, Takayuki <tsunakawa.takay@jp.fujitsu.com> wrote:
>> I guess not as well. That would be tricky for the user to have a different
>> behavior depending on the error returned by the server, which is why the
>> current code is doing things right IMO. Now, the feature has been designed
>> similarly to JDBC with its parametrization, so it could be surprising for
>> users to get a different failure handling compared to that. Not saying that
>> JDBC is doing it wrong, but libpq does nothing wrong either.
>
> I didn't intend to make the user have a different behavior depending on the error returned by the server. I meant attempting connection to alternative hosts when the server returned an error. I thought the new libpq feature tries to connect to other hosts when a connection attempt fails, where the "connection" is the *database connection* (user's perspective), not the *socket connection* (PG developer's perspective). I think PgJDBC meets the user's desire better -- "Please connect to some host for better HA if a database server is unavailable for some reason."
>
> By the way, could you elaborate what problem could occur if my solution is applied? (it doesn't seem easy for me to imagine...)

Sure. Imagine that the user thinks that 'foo' and 'bar' are the relevant database servers for some service and writes 'dbname=quux host=foo,bar' as a connection string. However, actually the user has made a mistake and 'foo' is supporting some other service entirely; it has no database 'quux'; the database servers which have database 'quux' are in fact 'bar' and 'baz'. All appears well as long as 'bar' remains up, because the missing-database error for 'foo' is ignored and we just connect to 'bar'. However, when 'bar' goes down then we are out of service instead of failing over to 'baz' as we should have done.
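
The scenario above can be made concrete with a small simulation. This is only a sketch under stated assumptions: it does not call libpq, and the names `Outcome`, `try_hosts` and `retry_on_rejection` are hypothetical, invented for illustration. It models each host as either accepting the connection, looking down, or answering but rejecting the request, and compares the two policies under discussion.

```python
from enum import Enum


class Outcome(Enum):
    OK = "ok"                    # server accepted the connection
    UNREACHABLE = "unreachable"  # host looks down (no socket connection)
    REJECTED = "rejected"        # server answered but refused, e.g. no such database


def try_hosts(hosts, status, retry_on_rejection):
    """Walk the host list in order and return (connected_host, errors_seen).

    retry_on_rejection=False models "fail over only when the host looks down".
    retry_on_rejection=True models "try the next host after any failure".
    Both are hypothetical policy switches, not libpq parameters.
    """
    errors_seen = []
    for host in hosts:
        outcome = status[host]
        if outcome is Outcome.OK:
            return host, errors_seen
        errors_seen.append((host, outcome))
        if outcome is Outcome.REJECTED and not retry_on_rejection:
            # Stop here so the rejection is reported to the user.
            return None, errors_seen
    return None, errors_seen


# The user wrote host=foo,bar, but database 'quux' really lives on bar and baz.
hosts = ["foo", "bar"]
bar_up = {"foo": Outcome.REJECTED, "bar": Outcome.OK}
bar_down = {"foo": Outcome.REJECTED, "bar": Outcome.UNREACHABLE}

for label, status in (("bar up", bar_up), ("bar down", bar_down)):
    for retry in (False, True):
        host, errors = try_hosts(hosts, status, retry)
        seen = [(h, o.value) for h, o in errors]
        print(f"{label}, retry_on_rejection={retry}: host={host}, errors={seen}")
```

With `retry_on_rejection=True` and 'bar' up, the connection lands on 'bar' and the rejection from 'foo' is never shown to the caller, so the mistaken host list goes unnoticed until 'bar' fails. With `retry_on_rejection=False`, the same mistake is reported on the first attempt.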
Now it's quite possible that the user, if they test carefully, might realize that things are not working as intended, because the DBA might say "hey, all of your connections are being directed to 'bar' instead of being load-balanced properly!". But even if they are careful enough to realize this, it may not be clear what has gone wrong. Under your proposal, the connection to 'foo' could be failing for *any reason whatsoever* from lack of connectivity to a missing database to a missing user to a missing CONNECT privilege to an authentication failure. If the user looks at the server log and can pick out the entries from their own connection attempts they can figure it out, but otherwise they might spend quite a bit of time wondering what's wrong; after all, libpq will report no error, as long as the connection to the other server works.

Now, this is all arguable. You could certainly say -- and you are saying -- that this feature ought to be defined to retry after any kind of failure whatsoever. But I think what Tom and Michael and I are saying is that this is a failover feature and therefore ought to try the next server when the first one in the list appears to have gone down, but not when the first one in the list is unhappy with the connection request for some other reason. Who is right is a judgement call, but I don't think it's self-evident that users want to ignore anything and everything that might have gone wrong with the connection to the first server, rather than only those things which resemble a down server. It seems quite possible to me that if we had defined it as you are proposing, somebody would now be arguing for a behavior change in the other direction.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company