Re: Support load balancing in libpq - Mailing list pgsql-hackers

From Jelte Fennema
Subject Re: Support load balancing in libpq
Date
Msg-id DBBPR83MB0507F1883B6B0E4BA2E60195F75B9@DBBPR83MB0507.EURPRD83.prod.outlook.com
In response to Re: [EXTERNAL] Re: Support load balancing in libpq  (Michael Banck <mbanck@gmx.net>)
Responses Re: Support load balancing in libpq  (Jelte Fennema <Jelte.Fennema@microsoft.com>)
List pgsql-hackers
I attached a new patch which does the following:
1. adds TAP tests
2. adds a random_seed parameter to libpq (required for the TAP tests)
3. frees conn->loadbalance in freePGConn
4. adds more expansive docs on the feature and its behaviour

Apart from bikeshedding on the name of the option, I think it's pretty good now.

> Isn't this exactly what connect_timeout is providing? In my tests, it
> worked exactly as I would expect it, i.e. after connect_timeout seconds,
> libpq was re-shuffling and going for another host.

Yes, this was the main purpose of multiple hosts previously. This patch
doesn't change that, and it indeed continues to work when enabling
load balancing too. I included this in the TAP tests.
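
To illustrate how load balancing and failover combine, here is a minimal Python sketch. It is not code from the patch: pick_host and is_reachable are hypothetical names, and is_reachable merely stands in for a connection attempt that gives up after connect_timeout.

```python
import random

def pick_host(hosts, is_reachable, rng):
    """Shuffle the host list, then return the first host that accepts a connection."""
    order = list(hosts)
    rng.shuffle(order)           # load balancing: randomize the attempt order
    for host in order:           # failover: move on to the next host on failure
        if is_reachable(host):   # stand-in for a connect bounded by connect_timeout
            return host
    return None                  # every host failed

rng = random.Random(42)          # fixed seed so the run is reproducible
down = {"pg2"}                   # pretend pg2 never answers
chosen = [
    pick_host(["pg1", "pg2", "pg3"], lambda h: h not in down, rng)
    for _ in range(1000)
]
```

In this model the unreachable host is never returned, and the remaining hosts share the connections.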

> I tested this some more, and found it somewhat surprising that at least
> when looking at it on a microscopic level, some hosts are chosen more
> often than the others for a while.

That does seem surprising, but it looks like it might simply be bad luck.
Did you compile with OpenSSL support? Otherwise, the strong random
source might not be used.

> So it looks like it load-balances between pg1 and pg3, and not between
> the three IPs - is this expected?
>
> If I switch from "host=pg1,pg3" to "host=pg1,pg1,pg3", each IP address is
> hit roughly equally.
>
> So I guess this is how it should work, but in that case I think the
> documentation should be more explicit about what is to be expected if a
> host has multiple IP addresses or hosts are specified multiple times in
> the connection string.

Yes, this behaviour is expected. I tried to make that clearer in the newest
version of the docs.
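
The numbers reported above can be reproduced with a small worked example. This is only a sketch under two assumptions: one entry of the host list is picked uniformly at random, and one of that host's addresses is then picked uniformly. The address table is made up (pg1 resolving to two IPs, pg3 to one), and first_attempt_share is a hypothetical helper, not part of libpq.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical DNS data: pg1 resolves to two addresses, pg3 to one.
ADDRESSES = {
    "pg1": ["192.0.2.1", "192.0.2.2"],
    "pg3": ["192.0.2.3"],
}

def first_attempt_share(host_entries, addresses=ADDRESSES):
    """Return the probability that each IP receives the first connection attempt."""
    share = Counter()
    for host in host_entries:
        ips = addresses[host]
        for ip in ips:
            # Probability of picking this entry, times picking this address.
            share[ip] += Fraction(1, len(host_entries) * len(ips))
    return dict(share)

two_entries = first_attempt_share(["pg1", "pg3"])
three_entries = first_attempt_share(["pg1", "pg1", "pg3"])
```

Under these assumptions, "host=pg1,pg3" gives the two pg1 addresses 1/4 each and the pg3 address 1/2, while "host=pg1,pg1,pg3" gives every address 1/3.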

> For the patch itself, I think it is better to use a more precise time
> function in libpq_prng_init or call it only once.
> Though it is a special corner case, imagine all the connection attempts in the
> first second will be seeded with the same

I agree that using microseconds would probably be preferable. But that seems
like a separate patch, since I took this initialization code from the InitProcessGlobals
function. Also, it shouldn't be a big issue in practice, since usually the strong random
source will be used.
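
For illustration, the corner case can be sketched in Python. This is not the libpq code: the seed functions are hypothetical, and Python's random module stands in for the fallback PRNG that is only used when no strong random source is available.

```python
import random

def seed_from_seconds(timestamp_us):
    """Seed with one-second resolution (timestamp given in microseconds)."""
    return timestamp_us // 1_000_000

def seed_from_microseconds(timestamp_us):
    """Seed with microsecond resolution."""
    return timestamp_us

def first_choice(seed, hosts):
    """Host picked first by a PRNG initialized with the given seed."""
    return random.Random(seed).choice(hosts)

hosts = ["pg1", "pg2", "pg3"]
t1 = 1_665_000_000_000_100   # made-up start time of client 1, in microseconds
t2 = t1 + 250                # client 2 starts 250 microseconds later

# With second resolution both clients get the same seed, hence the same host.
same_seed = seed_from_seconds(t1) == seed_from_seconds(t2)
same_host = (first_choice(seed_from_seconds(t1), hosts)
             == first_choice(seed_from_seconds(t2), hosts))

# With microsecond resolution the seeds differ.
distinct_seed = seed_from_microseconds(t1) != seed_from_microseconds(t2)
```

Clients that start within the same second share a seed and therefore make identical choices, whereas microsecond resolution gives them different seeds.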