Attached is an updated patch with the following changes:
2. rebased (including a resolved merge conflict)
2. fixed failing tests in CI
3. changed the commit message a little bit
4. addressed the two remarks from Michael
5. changed the prng_state from a global to a connection-level value for thread safety
6. switched to pg_prng_uint64_range
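To illustrate points 5 and 6, here is a minimal standalone sketch of the idea, not the code from the patch. Every name in it (sketch_prng, sketch_conn, sketch_range, sketch_shuffle_hosts) is hypothetical, and the splitmix64 generator is only a stand-in for pg_prng so that the example compiles outside the PostgreSQL tree. The point is that each connection owns its PRNG state, so shuffling hosts for one connection never touches state shared with another thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_HOSTS 8

/* Hypothetical stand-in for a PRNG state (not libpq's pg_prng_state). */
typedef struct sketch_prng
{
    uint64_t s;
} sketch_prng;

/* Hypothetical connection object: the PRNG state lives here, not in a global. */
typedef struct sketch_conn
{
    sketch_prng prng;
    const char *hosts[SKETCH_MAX_HOSTS];
    size_t      nhosts;
} sketch_conn;

/* One splitmix64 step; used only to keep the sketch self-contained. */
static uint64_t
sketch_next(sketch_prng *p)
{
    uint64_t z = (p->s += UINT64_C(0x9E3779B97F4A7C15));

    z = (z ^ (z >> 30)) * UINT64_C(0xBF58476D1CE4E5B9);
    z = (z ^ (z >> 27)) * UINT64_C(0x94D049BB133111EB);
    return z ^ (z >> 31);
}

/* Uniform value in [rmin, rmax], inclusive, without modulo bias. */
static uint64_t
sketch_range(sketch_prng *p, uint64_t rmin, uint64_t rmax)
{
    uint64_t span;
    uint64_t threshold;
    uint64_t r;

    assert(rmin <= rmax);
    span = rmax - rmin + 1;     /* wraps to 0 for the full 64-bit range */
    if (span == 0)
        return sketch_next(p);

    /* 2^64 mod span: values below this would make some results more likely */
    threshold = (0 - span) % span;
    do
    {
        r = sketch_next(p);
    } while (r < threshold);

    return rmin + r % span;
}

/* Fisher-Yates shuffle of the host list using the connection's own state. */
static void
sketch_shuffle_hosts(sketch_conn *conn)
{
    for (size_t i = conn->nhosts; i > 1; i--)
    {
        size_t      j = (size_t) sketch_range(&conn->prng, 0, i - 1);
        const char *tmp = conn->hosts[i - 1];

        conn->hosts[i - 1] = conn->hosts[j];
        conn->hosts[j] = tmp;
    }
}
```

Because the state is a field of the connection, two connections seeded identically shuffle identically, and neither needs a lock around a shared generator.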
> Maybe my imagination is not so great, but what else than hosts could we
> possibly load-balance? I don't mind calling it load_balance, but I also
> don't feel very strongly one way or the other and this is clearly
> bikeshed territory.
I agree, which is why I called it load_balance in my original patch. But I also
think it's useful to match the naming for the already existing implementations
in the PG ecosystem around this. But like you I don't really feel strongly either
way. It's a tradeoff between a short name and consistency in the ecosystem.
> If I understand correctly, you've added DNS-based load balancing on top
> of just shuffling the provided hostnames. This makes sense if a
> hostname is backed by more than one IP address in the context of load
> balancing, but it also complicates the patch. So I'm wondering how much
> shorter the patch would be if you leave that out for now?
Yes, that's correct, and indeed the patch would be simpler without it, i.e. all the
addrinfo changes would become unnecessary. But IMHO the behaviour of
the added option would be very unexpected if it didn't load balance across
multiple IPs in a DNS record. libpq currently makes no real distinction in
handling of provided hosts and handling of their resolved IPs. If load balancing
only applied to the host list, libpq would start making a distinction
between the two.
Apart from that, load balancing across IPs is one of the main reasons
for my interest in this patch. It allows expanding or reducing the number
of nodes that are being load balanced across, transparently to the
application, which means there's no need to redeploy applications with
new connection strings when changing the number of hosts.
> On the other hand, I believe pgJDBC keeps track of which hosts are up or
> down and only load balances among the ones which are up (maybe
> rechecking after a timeout? I don't remember), is this something you're
> doing, or did you consider it?
I don't think it's possible to do this in libpq without huge changes to its
architecture, since normally a PGconn will only create a single connection.
pgJDBC can do this because it's actually a connection pooler, so it opens
more than one connection and can thus keep some global state about the
different hosts.
Jelte