Re: PROXY protocol support - Mailing list pgsql-hackers

From Jacob Champion
Subject Re: PROXY protocol support
Date
Msg-id 8aeede56330ca5721265428302c7f73e4e2e1264.camel@vmware.com
Whole thread Raw
In response to Re: PROXY protocol support  (Magnus Hagander <magnus@hagander.net>)
Responses Re: PROXY protocol support  (Magnus Hagander <magnus@hagander.net>)
List pgsql-hackers
Hi Magnus,

I'm only just starting to page this back into my head, so this is by no
means a full review of the v7 changes -- just stuff I've noticed over
the last day or so of poking around.

On Tue, 2021-06-29 at 11:48 +0200, Magnus Hagander wrote:
> On Thu, Mar 11, 2021 at 12:05 AM Jacob Champion <pchampion@vmware.com> wrote:
> > On Tue, 2021-03-09 at 11:25 +0100, Magnus Hagander wrote:
> > > - \x0 : LOCAL : [...] The receiver must accept this connection as
> > > valid and must use the real connection endpoints and discard the
> > > protocol block including the family which is ignored.
> > 
> > So we should ignore the entire "protocol block" (by which I believe
> > they mean the protocol-and-address-family byte) in the case of LOCAL,
> > and just accept it with the original address info intact. That seems to
> > match the sample code in the back of the spec. The current behavior in
> > the patch will apply the PROXY behavior incorrectly if the sender sends
> > a LOCAL header with something other than UNSPEC -- which is strange
> > behavior but not explicitly prohibited as far as I can see.
> 
> Yeah, I think we do the right thing in the "right usecase".

The current implementation is, I think, stricter than the spec asks
for. We're supposed to ignore the family for LOCAL cases, and it's not
clear to me whether we're supposed to also ignore the entire "fam"
family-and-protocol byte (the phrase "protocol block" is not actually
defined in the spec).

It's probably not a big deal in practice, but it could mess with
interoperability for lazier proxy implementations. I think I'll ask the
HAProxy folks for some clarification tomorrow.

> +      <term><varname>proxy_servers</varname> (<type>string</type>)
> +      <indexterm>
> +       <primary><varname>proxy_servers</varname> configuration parameter</primary>
> +      </indexterm>
> +      </term>
> +      <listitem>
> +       <para>
> +        A comma separated list of one or more host names, cidr specifications or the
> +        literal <literal>unix</literal>, indicating which proxy servers to trust when
> +        connecting on the port specified in <xref linkend="guc-proxy-port" />.

The documentation mentions that host names are valid in proxy_servers,
but check_proxy_servers() uses the AI_NUMERICHOST hint with
getaddrinfo(), so host names get rejected.

> +           GUC_check_errdetail("Invalid IP addrress %s", tok);

s/addrress/address/

I've been thinking more about your earlier comment:

> An interesting thing is what to do about
> inet_server_addr/inet_server_port. That sort of loops back up to the
> original question of where/how to expose the information about the
> proxy in general (since right now it just logs). Right now you can
> actually use inet_server_port() to see if the connection was proxied
> (as long as it was over tcp).

IMO these should return the "virtual" dst_addr/port, instead of
exposing the physical connection information to the client. That way,
if you intend for your proxy to be transparent, you're not advertising
your network internals to connected clients. It also means that clients
can reasonably expect to be able to reconnect to the addr:port that we
give them, and prevents confusion if the proxy is using an address
family that the client doesn't even support (e.g. the client is IPv4-
only but the proxy connects via IPv6).

--Jacob

pgsql-hackers by date:

Previous
From: Hannu Krosing
Date:
Subject: Re: [PoC] Improve dead tuple storage for lazy vacuum
Next
From: Masahiko Sawada
Date:
Subject: Re: Skipping logical replication transactions on subscriber side