Robert Haas <robertmhaas@gmail.com> writes:
> Takayuki Tsunakawa raised a very similar issue in another thread
> related to another open item, namely
> https://www.postgresql.org/message-id/flat/0A3221C70F24FB45833433255569204D1F6F5659%40G01JPEXMBYT05
> in which he argued that libpq ought to try the next host after a
> connection failure regardless of the reason for the connection
> failure. Tom, Michael Paquier, and I all disagreed; none of us
> believe that this feature was intended to retry the connection to a
> different host after an arbitrary error reported by the remote server.
> This thread is essentially the same issue, except here the question
> isn't what should happen after we connect to a server and it returns
> an error, but rather what happens when we time out waiting to connect
> to a server. When that happens, should we give up, or try the next
> server?

FWIW, I think the position most of us were taking is that this feature
is meant to retry transport-level connection failures, not cases where
we successfully make a connection to a server and then it rejects our
login attempt. I would classify a timeout as a transport-level failure
as long as it occurs before we got any server response --- if it happens
during the authentication protocol, that's less clear. But it might not
be very practical to distinguish those two cases.

In short, +1 for retrying on timeout during connection, and I'm okay with
retrying a timeout during authentication if it's not practical to treat
that differently.
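
To make the distinction concrete, here is a rough Python sketch of the
policy described above. It is not libpq code and does not reflect how
libpq is structured internally; Outcome, connect_any, attempt, and
retry_auth_timeout are all made-up names for illustration, and the
caller-supplied attempt() function stands in for whatever actually
tries one host.

```python
from enum import Enum, auto


class Outcome(Enum):
    """Hypothetical classification of one connection attempt."""
    CONNECTED = auto()
    TRANSPORT_FAILURE = auto()  # refused, unreachable, name lookup failed
    CONNECT_TIMEOUT = auto()    # timed out before any server response
    AUTH_TIMEOUT = auto()       # timed out during the authentication exchange
    SERVER_REJECTED = auto()    # server answered and rejected the login


def connect_any(hosts, attempt, retry_auth_timeout=True):
    """Try hosts in order; return (host, outcome) for the deciding attempt.

    attempt(host) is a caller-supplied function returning an Outcome.
    Returns None if hosts is empty.
    """
    last = None
    for host in hosts:
        outcome = attempt(host)
        last = (host, outcome)
        if outcome is Outcome.CONNECTED:
            return last
        # Transport-level failures always move on to the next host.
        retryable = outcome in (Outcome.TRANSPORT_FAILURE,
                                Outcome.CONNECT_TIMEOUT)
        # The unclear case: only retried if we choose not to distinguish it.
        if outcome is Outcome.AUTH_TIMEOUT and retry_auth_timeout:
            retryable = True
        if not retryable:
            # The server spoke to us and said no; do not try other hosts.
            return last
    return last
```

With retry_auth_timeout=True the two timeout cases are treated alike,
which is the fallback I'm okay with; setting it to False models the
stricter reading where only a timeout before any server response counts
as a transport-level failure.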

			regards, tom lane