Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch - Mailing list pgsql-hackers

From Laurenz Albe
Subject Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
Date
Msg-id 6919b4d51c5aa36f6de8f99c1874fab58dae40eb.camel@cybertec.at
In response to Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch  (Evgeny Kuzin <evgeny.kuzin@outlook.com>)
Responses Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
List pgsql-hackers
On Thu, 2026-03-05 at 14:59 +0000, Evgeny Kuzin wrote:
> We run PostgreSQL clusters with streaming replication. After a failover, the old primary
> becomes a standby and vice versa. The challenge is: how do clients find the new primary?
>
> Current options:
>    1. Update DNS on every failover - operationally complex, TTL delays, requires automation

Your proposal would also suffer from TTL delays in the case of a cluster reconfiguration.

>    2. Consul/etcd - adds operational complexity and another failure domain
>    3. Multiple hosts in connection string - requires application changes when cluster
>       topology changes (e.g., adding a new standby)
>
> The proposed approach:
>  * Single A-record (db.internal) pointing to all cluster member IPs
>  * Clients connect with
>    host=db.internal target_session_attrs=read-write
>  * libpq tries each IP until it finds the primary
>
> IIUC this is how JDBC's targetServerType=primary works - it iterates through all resolved
> addresses. The "useless connection attempts" are actually the feature: it's probing to
> find the right server, same as when you specify multiple hosts explicitly.
> The only difference from host=pg1,pg2,pg3 is that DNS provides the list instead of the
> connection string. From libpq's perspective, why should it matter where the address list came from?

I see the point of your proposal.
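
For readers following along, the probing described in the quoted proposal can be sketched as a toy model. This is not libpq code: the addresses, the roles, and the names RESOLVED, ROLE and find_primary are all invented for illustration, and the try_all_addresses=False branch only mirrors what the patch subject implies (moving on from a host after a target_session_attrs mismatch).

```python
# Toy model only; nothing here calls libpq or performs real DNS lookups.

# Hypothetical DNS data: one name resolving to every cluster member.
RESOLVED = {"db.internal": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]}

# Hypothetical server roles; 10.0.0.3 is assumed to be the current primary.
ROLE = {"10.0.0.1": "standby", "10.0.0.2": "standby", "10.0.0.3": "primary"}


def find_primary(host, try_all_addresses):
    """Return (address_or_None, list_of_addresses_probed).

    try_all_addresses=True models the proposed behaviour: keep probing the
    remaining addresses of the same host name after a mismatch.
    try_all_addresses=False models giving up on the host name after the
    first address turns out not to be read-write.
    """
    probed = []
    for addr in RESOLVED[host]:
        probed.append(addr)
        if ROLE[addr] == "primary":
            return addr, probed
        if not try_all_addresses:
            break  # mismatch: stop trying this host name
    return None, probed


if __name__ == "__main__":
    print(find_primary("db.internal", try_all_addresses=True))
    print(find_primary("db.internal", try_all_addresses=False))
```

In this model the extra probes are exactly what finds the primary, which is the point the quoted text makes about "useless connection attempts".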

One example of what Tom worries about is "localhost" resolving to both "127.0.0.1" and "::1",
a very common case.  With the proposed change, any connection attempt to "localhost" that fails
would now take twice as long to fail.  Also, if the problem is authentication, the server would
perform two authentication attempts.  That is a clear regression that may affect many people.
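
The arithmetic behind that concern can be written down as a rough sketch. The function name failure_cost and the per-attempt time are assumptions made up for this example; the model simply assumes that every address fails in the same way and costs the same amount of time.

```python
# Rough cost model; the timing value used below is illustrative, not measured.

def failure_cost(addresses, seconds_per_attempt, try_all_addresses):
    """Return (attempts, total_seconds) for a connection that fails everywhere.

    Assumes each attempt fails identically and takes seconds_per_attempt.
    With try_all_addresses=True every resolved address is tried, so the
    server also sees one (authentication) attempt per address.
    """
    attempts = len(addresses) if try_all_addresses else 1
    return attempts, attempts * seconds_per_attempt


if __name__ == "__main__":
    localhost = ["127.0.0.1", "::1"]  # the two addresses named above
    print(failure_cost(localhost, 0.5, try_all_addresses=False))
    print(failure_cost(localhost, 0.5, try_all_addresses=True))
```

With two addresses for "localhost", both the attempt count and the time to fail double under this model.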

The question is whether the overall benefits of your proposal (which certainly makes sense
in a setup like you describe) would be worth a performance and resource usage regression like
the one I described above.  Or can you see a way to modify your approach so that that problem
can be avoided?

Yours,
Laurenz Albe


