Re: Support load balancing in libpq - Mailing list pgsql-hackers

From Bharath Rupireddy
Subject Re: Support load balancing in libpq
Date
Msg-id CALj2ACXywtN=EhvD_Qi1CxqniwwA4YT0pTz+VKeZ3bLAt2+Lvw@mail.gmail.com
In response to Support load balancing in libpq  (Jelte Fennema <Jelte.Fennema@microsoft.com>)
Responses Re: [EXTERNAL] Re: Support load balancing in libpq
List pgsql-hackers
On Fri, Jun 10, 2022 at 10:01 PM Jelte Fennema
<Jelte.Fennema@microsoft.com> wrote:
>
> Load balancing connections across multiple read replicas is a pretty
> common way of scaling out read queries. There are two main ways of doing
> so, both with their own advantages and disadvantages:
> 1. Load balancing at the client level
> 2. Load balancing by connecting to an intermediary load balancer
>
> Option 1 has been supported by JDBC (Java) for 8 years and Npgsql (C#)
> merged support about a year ago. This patch adds the same functionality
> to libpq. The way it's implemented is the same as the implementation of
> JDBC, and contains two levels of load balancing:
> 1. The given hosts are randomly shuffled, before resolving them
>     one-by-one.
> 2. Once a host's addresses get resolved, those addresses are shuffled,
>     before trying to connect to them one-by-one.

Thanks for the patch. +1 for the general idea of redirecting connections.

For reference, here is a previous attempt by Satyanarayana Narlapuram
on this topic [1]; it also has a patch set.

IMO, rebalancing of the load must be based on parameters (as also
suggested by Aleksander Alekseev in this thread) such as whether the
queries are read-only or write, CPU/IO/memory utilization of the
primary/standby, network distance, etc. We may not have to go the extra
mile to determine all of these parameters dynamically at
authentication time, but we can let users provide a list of standby
hosts based on "some" priority (Satya's thread [1] attempts to do
this, in a way, with users specifying the hosts via the pg_hba.conf
file). If required, randomization in choosing the hosts can be optional.

Also, IMO, the solution must have a fallback mechanism if the
standby/chosen host isn't reachable.
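
To make the two points above concrete, here is a rough sketch (not
taken from the patch, and not libpq code) of trying a user-supplied
host list in priority order, with optional shuffling and fallback to
the next host when one is unreachable. All names here (pick_host,
shuffle_hosts, try_connect_fn, demo_try_connect) are hypothetical, and
the connection attempt is a stub that pretends "standby1" is down.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback: returns nonzero if connecting to host succeeds. */
typedef int (*try_connect_fn) (const char *host);

/* Stub for illustration only: pretend every host except "standby1" is up. */
int
demo_try_connect(const char *host)
{
    return strcmp(host, "standby1") != 0;
}

/* Fisher-Yates shuffle of the host array, in place. */
void
shuffle_hosts(const char **hosts, size_t n)
{
    for (size_t i = n; i > 1; i--)
    {
        /* modulo bias is ignored in this sketch */
        size_t      j = (size_t) rand() % i;
        const char *tmp = hosts[i - 1];

        hosts[i - 1] = hosts[j];
        hosts[j] = tmp;
    }
}

/*
 * Try the hosts in the order given (the user's priority order), or in
 * random order if randomize is nonzero. If a host is unreachable, fall
 * back to the next one. Returns the first reachable host, or NULL if
 * none could be reached.
 */
const char *
pick_host(const char **hosts, size_t n, int randomize,
          try_connect_fn try_connect)
{
    if (randomize)
        shuffle_hosts(hosts, n);

    for (size_t i = 0; i < n; i++)
    {
        if (try_connect(hosts[i]))
            return hosts[i];
    }
    return NULL;
}
```

With the hosts "standby1", "standby2", "standby3" and randomize set to
0, the stub makes pick_host skip standby1 and return standby2. With
randomize set to 1 it returns either standby2 or standby3.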

A few thoughts on the patch:
1) How are we determining whether the submitted query is read-only or
write?
2) What happens for explicit transactions? The queries belonging to the
same txn get executed on the same host, right? How are we guaranteeing
this?
3) Wouldn't it be good to provide a way to test the patch?

[1]
https://www.postgresql.org/message-id/flat/CY1PR21MB00246DE1F9E9C58455A78A37915C0%40CY1PR21MB0024.namprd21.prod.outlook.com

Regards,
Bharath Rupireddy.


