Re: Proposal: Implement failover on libpq connect level. - Mailing list pgsql-hackers

From Christopher Browne
Subject Re: Proposal: Implement failover on libpq connect level.
Date
Msg-id CAFNqd5Xu=DZ3b7CJ68Wf0m+QWfUWcwkvPbypj2kMfbb1oUT5tg@mail.gmail.com
Whole thread Raw
In response to Re: Proposal: Implement failover on libpq connect level.  ("Shulgin, Oleksandr" <oleksandr.shulgin@zalando.de>)
List pgsql-hackers
On 2 September 2015 at 04:52, Shulgin, Oleksandr <oleksandr.shulgin@zalando.de> wrote:
>
> On Tue, Sep 1, 2015 at 8:12 PM, Andres Freund <andres@anarazel.de> wrote:
>>
>> On 2015-09-01 14:07:19 -0400, Robert Haas wrote:
>> > But I think it's quite wrong to assume that the infrastructure for
>> > this is available and usable everywhere, because in my experience,
>> > that's far from the case.
>>
>> Especially when the alternative is a rather short patch implementing an
>> otherwise widely available feature.
>
>
> But that won't actually help in the case described by Robert: if the master server A failed, the client has no idea if B or C would become the new master.
>
> Unless it actually tries to connect them in turn and check for the result of pg_is_in_recovery().  I think that brings enough complexity for keeping this outside of libpq.  Also think about all the extra flexibility people will likely want to have: number of retries, delay between retries, delay backoff, etc., to the point we'll have to support some sort of client code retry_policy_callback.
>
> For read-only clients you might want to include a number of slave hostnames, and let the connector choose one, but then again you can't achieve load-balancing on the client side, you're better off using round-robin DNS.  To add or remove a slave you only need to update DNS, and not configuration on all the clients.
>
> For the master failover I think a common technique is to just move the floating IP address from the old master to the new one.  This doesn't require touching the DNS record.

It seems to me that any time the answer involves touching the DNS record, then having this be libpq configuration is terribly inappropriate.

I'll note that OpenLDAP tends to accept multiple database URIs, and is nonspecific as to the relative semantics.  That points me at two things that may be worth considering for libpq:

1.  It seems to me that it would be ridiculous to try to have semantics trying to interpret multiple values passed to PGPORT/PGHOST/PGDATABASE/-p/-h/-d/, but rather, that if there are to be multiple values, they should be presented as URIs.

I'm finding, by the way, that it is very useful in simplifying application configuration to use the URIs that have been usable since about 9.2.

I <3
postgresql://postgres@localhost:7093/myfavedatabase

2.  I'd be inclined to go with pretty simple functionality in anything that would be built-in to libpq.  No great amount of logic, rather

   a) Try the first URI.  If that works (where "connects" = "works"), we're done.
   b) Try the next URI.  If that works, we're done.
   c) Is there another one?  Try it...  Works... Done.  Repeat as needful.

There's all kinds of functionality that you could add, between...
  - Random selection of URI.
  - Running some query against each URI tried to see if it's a "master/slave", changing client functionality based on that.
  - Picking URI based on connecting (somewhere) and deciding what URI to prefer.

I'd suggest implementing the simplest thing (e.g. - try URI 1, then 2, then ...), as that's most likely the most useful.
And more sophisticated functionality probably ought to be in the application anyways.
--
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"

pgsql-hackers by date:

Previous
From: Andres Freund
Date:
Subject: Re: [PATCH] Microvacuum for gist.
Next
From: Andres Freund
Date:
Subject: Re: Allow a per-tablespace effective_io_concurrency setting