Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch - Mailing list pgsql-hackers

From Evgeny Kuzin
Subject Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
Date
Msg-id AM9PR09MB4900AA740C4A5DC772E6FB139744A@AM9PR09MB4900.eurprd09.prod.outlook.com
In response to Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch  (Jacob Champion <jacob.champion@enterprisedb.com>)
Responses Re: [PATCH] libpq: try all addresses for a host before moving to next on target_session_attrs mismatch
List pgsql-hackers
> I would like to, as kindly as possible, say that I don't like *either*
> of these approaches, on this thread or the other.

I appreciate the careful pushback. A week into this discussion, I'm starting to see why Postgres moves so cautiously: a "simple" change affects millions of connections across every imaginable setup. It's worth getting right.


> I'm no DNS expert, but I can't shake the feeling that you're
> (mis)using round-robin A records to reimplement, say, SRV records

The SRV thread you mentioned seems promising: it covers the same use case (Patroni/HA + target_session_attrs) and keeps a clean separation of concerns. Would reviving SRV support be a direction you'd consider architecturally sound?


> I think you've tangled a Postgres-level concern (find me a host with
> these characteristics) with a socket-level concern (find me the
> addresses for a host), and the main reason you were able to do that
> was because PQconnectPoll() currently puts all those concerns into one
> impossibly complex function. If someone later wanted to replace
> getaddrinfo/connect with a Happy Eyeballs library, to cut down on
> connection times, this proposal would prevent them from doing that.
> (Both your patch, and the other thread's.) Personally I think we
> should reserve the ability to use any API that says "connect me to
> this hostname as fast as possible; I do not care how."


Another thought: what about cluster-aware routing at the protocol level? A standby could redirect the client to the primary, similar to an HTTP 302. The cluster knows its own topology, so libpq can stay fast and dumb about it, which would preserve the "connect me as fast as possible" ability you mentioned. That does feel like a bigger architectural lift than SRV, though.
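
To make the redirect idea a bit more concrete, here is a rough sketch in C of the client-side loop I have in mind. Everything in it is hypothetical: SimNode, resolve_primary, and MAX_REDIRECTS are names invented for illustration, the cluster is simulated in memory, and no such redirect message exists in the protocol today. It only shows the control flow: follow whatever the server points at, with a hop limit so a misconfigured cluster cannot loop the client forever.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: one node in a simulated cluster. This is not a real
 * libpq or PostgreSQL protocol structure. */
typedef struct {
    const char *host;
    int is_primary;          /* 1 if this node accepts read-write sessions */
    const char *redirect_to; /* where a standby would point the client, or NULL */
} SimNode;

/* Assumed hop limit to guard against redirect loops. */
#define MAX_REDIRECTS 3

/* Look up a node by host name in the simulated cluster. */
static const SimNode *find_node(const SimNode *nodes, size_t n, const char *host)
{
    for (size_t i = 0; i < n; i++)
        if (strcmp(nodes[i].host, host) == 0)
            return &nodes[i];
    return NULL;
}

/* Returns the host of the primary reached from start_host, or NULL if the
 * host is unknown, a standby offers no redirect, or the hop limit is hit. */
const char *resolve_primary(const SimNode *nodes, size_t n, const char *start_host)
{
    const char *host = start_host;

    for (int hops = 0; hops <= MAX_REDIRECTS; hops++) {
        const SimNode *node = find_node(nodes, n, host);

        if (node == NULL)
            return NULL;            /* stands in for "connection failed" */
        if (node->is_primary)
            return node->host;      /* session attributes satisfied */
        if (node->redirect_to == NULL)
            return NULL;            /* standby gave no hint */
        host = node->redirect_to;   /* follow the redirect */
    }
    return NULL;                    /* too many hops, give up */
}
```

With a simulated cluster where standby "pg1" points at primary "pg2", resolve_primary(cluster, n, "pg1") returns "pg2"; a standby that redirects to itself ends in NULL once the hop limit is exhausted. Note that the client never enumerates addresses itself here, so the socket-level "connect me to this hostname" step stays replaceable.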


Would either of these be worth exploring further?
