Thread: PROXY protocol support
Hello, Nowadays, PostgreSQL is often used behind proxies. Some are PostgreSQL protocol aware (Pgpool, PgBouncer), some are pure TCP (HAProxy). From the database instance point of view, all clients come from the proxy. There are two major problems with this topology: * It neutralizes the host based authentication. Every client shares the same source. Either we allow this source or not but we cannot allow clients on a more fine-grained basis, or not by the IP address. * It makes debugging harder. If we have a DDL or a slow query logged, we cannot use the source to identify who is responsible. On one hand, we can move the authentication and logging mechanisms to PostgreSQL based proxies but they will never be as complete as PostgreSQL itself. And they don't have features like HTTP health checks to redirect trafic to nodes (health, role, whatever behind the URL). On the other hand, those features are not implemented at all because they don't know the PostgreSQL protocol, they simply forward requests. In the HTTP reverse proxies world, there's a "dirty hack" to identify the source IP address: add an HTTP header "X-Forwared-For" to the request. It's the destination duty to do whatever they want with this information. With this feature in mind, someone from HAProxy has implemented this mechanism at the protocol level. It's called the PROXY protocol. With this piece of logic at the beginning of the protocol, we could implement a totally transparent proxy and benefit from the great features of PostgreSQL regarding clients. Note that MariaDB support the PROXY protocol in MaxScale (proxy) and MariaDB Server in recent versions. My question is, what do you think of this feature? Is it worth to spend time implementing it in PostgreSQL or not? Links: - http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt - https://mariadb.com/kb/en/library/proxy-protocol-support/ Thanks, Julien PS: I've already sent this message to a wrong mailing list. Stephen Frost said it's implemented in pgbouncer but all I can find is an open issue: https://github.com/pgbouncer/pgbouncer/issues/241.
Greetings, * Julien Riou (julien@riou.xyz) wrote: > Nowadays, PostgreSQL is often used behind proxies. Some are PostgreSQL > protocol aware (Pgpool, PgBouncer), some are pure TCP (HAProxy). From > the database instance point of view, all clients come from the proxy. > > There are two major problems with this topology: > > * It neutralizes the host based authentication. Every client shares > the same source. Either we allow this source or not but we cannot allow > clients on a more fine-grained basis, or not by the IP address. You can instead have the IP-based checking done at the pooler. > * It makes debugging harder. If we have a DDL or a slow query logged, we > cannot use the source to identify who is responsible. Protocol-level poolers are able to do this, and pgbouncer does (see application_name_add_host). > On one hand, we can move the authentication and logging mechanisms to > PostgreSQL based proxies but they will never be as complete as > PostgreSQL itself. And they don't have features like HTTP health checks > to redirect trafic to nodes (health, role, whatever behind the URL). On > the other hand, those features are not implemented at all because they > don't know the PostgreSQL protocol, they simply forward requests. > > In the HTTP reverse proxies world, there's a "dirty hack" to identify > the source IP address: add an HTTP header "X-Forwared-For" to the > request. It's the destination duty to do whatever they want with this > information. With this feature in mind, someone from HAProxy has > implemented this mechanism at the protocol level. It's called the PROXY > protocol. Someone from HAProxy could certainly implement something similar by having HAProxy understand PostgreSQL's protocol. > With this piece of logic at the beginning of the protocol, we could > implement a totally transparent proxy and benefit from the great > features of PostgreSQL regarding clients. Note that MariaDB support the > PROXY protocol in MaxScale (proxy) and MariaDB Server in recent > versions. pgbouncer is already a transparent proxy that understands the PG protocol, and, even better, it has support for transaction-level pooling (as well as connection-level), which is really critical for larger PG deployments as PG backend startup is (relatively) expensive. > PS: I've already sent this message to a wrong mailing list. Stephen > Frost said it's implemented in pgbouncer but all I can find is an open > issue: https://github.com/pgbouncer/pgbouncer/issues/241. That would be some *other* proxy system (Amazon's ELB) that apparently also doesn't understand the PG protocol and therefore doesn't have a feature similar to pgbouncer's application_name_add_host. I haven't looked very closely at if it'd be possible to interpret the PROXY protocol thing that Amazon's ELB can do without confusing it with a regular PG authentication startup and I'm not sure if we'd really want to wed ourselves to something like that. Certainly, what pgbouncer does works quite well and is about as transparent to clients as possible. You'd almost certainly want something like pgbouncer after the ELB anyway to avoid having tons of connections to PG and avoid spinning up new backends constantly. Thanks, Stephen
Attachment
On May 19, 2019 5:59:04 PM GMT+02:00, Stephen Frost <sfrost@snowman.net> wrote: >Greetings, > >* Julien Riou (julien@riou.xyz) wrote: >> Nowadays, PostgreSQL is often used behind proxies. Some are >PostgreSQL >> protocol aware (Pgpool, PgBouncer), some are pure TCP (HAProxy). From >> the database instance point of view, all clients come from the proxy. >> >> There are two major problems with this topology: >> >> * It neutralizes the host based authentication. Every client shares >> the same source. Either we allow this source or not but we cannot >allow >> clients on a more fine-grained basis, or not by the IP address. > >You can instead have the IP-based checking done at the pooler. > >> * It makes debugging harder. If we have a DDL or a slow query logged, >we >> cannot use the source to identify who is responsible. > >Protocol-level poolers are able to do this, and pgbouncer does (see >application_name_add_host). > >> On one hand, we can move the authentication and logging mechanisms to >> PostgreSQL based proxies but they will never be as complete as >> PostgreSQL itself. And they don't have features like HTTP health >checks >> to redirect trafic to nodes (health, role, whatever behind the URL). >On >> the other hand, those features are not implemented at all because >they >> don't know the PostgreSQL protocol, they simply forward requests. >> >> In the HTTP reverse proxies world, there's a "dirty hack" to identify >> the source IP address: add an HTTP header "X-Forwared-For" to the >> request. It's the destination duty to do whatever they want with this >> information. With this feature in mind, someone from HAProxy has >> implemented this mechanism at the protocol level. It's called the >PROXY >> protocol. > >Someone from HAProxy could certainly implement something similar by >having HAProxy understand PostgreSQL's protocol. > >> With this piece of logic at the beginning of the protocol, we could >> implement a totally transparent proxy and benefit from the great >> features of PostgreSQL regarding clients. Note that MariaDB support >the >> PROXY protocol in MaxScale (proxy) and MariaDB Server in recent >> versions. > >pgbouncer is already a transparent proxy that understands the PG >protocol, and, even better, it has support for transaction-level >pooling >(as well as connection-level), which is really critical for larger PG >deployments as PG backend startup is (relatively) expensive. > >> PS: I've already sent this message to a wrong mailing list. Stephen >> Frost said it's implemented in pgbouncer but all I can find is an >open >> issue: https://github.com/pgbouncer/pgbouncer/issues/241. > >That would be some *other* proxy system (Amazon's ELB) that apparently >also doesn't understand the PG protocol and therefore doesn't have a >feature similar to pgbouncer's application_name_add_host. > >I haven't looked very closely at if it'd be possible to interpret the >PROXY protocol thing that Amazon's ELB can do without confusing it with >a regular PG authentication startup and I'm not sure if we'd really >want >to wed ourselves to something like that. Certainly, what pgbouncer >does >works quite well and is about as transparent to clients as possible. > >You'd almost certainly want something like pgbouncer after the ELB >anyway to avoid having tons of connections to PG and avoid spinning up >new backends constantly. > >Thanks, > >Stephen It could be proprietary Amazon load balancers I don't have experience with, or simple HAProxy coupled with a Patroni HTTPAPI to tell if a backend is healthy or not. The PgBouncer approach is interesting. I'm already using the application name as a workaround to identify containerized applicationsbut didn't used it for setting the source IP. If we take a look at the MariaDB implementation, they check for errors in the startup packet then run the PROXY protocoldecoding then return a real error if it doesn't work. As our bouncers are all behind a pool of HAProxy, and if weconsider PgBouncer as a trusted extension of PostgreSQL, maybe implementing it in PgBouncer first will be easier. Thanks for your insightful comments. Julien
On 19.05.2019 18:36, Julien Riou wrote: > Hello, > > Nowadays, PostgreSQL is often used behind proxies. Some are PostgreSQL > protocol aware (Pgpool, PgBouncer), some are pure TCP (HAProxy). From > the database instance point of view, all clients come from the proxy. > > There are two major problems with this topology: > > * It neutralizes the host based authentication. Every client shares > the same source. Either we allow this source or not but we cannot allow > clients on a more fine-grained basis, or not by the IP address. > > * It makes debugging harder. If we have a DDL or a slow query logged, we > cannot use the source to identify who is responsible. > > On one hand, we can move the authentication and logging mechanisms to > PostgreSQL based proxies but they will never be as complete as > PostgreSQL itself. And they don't have features like HTTP health checks > to redirect trafic to nodes (health, role, whatever behind the URL). On > the other hand, those features are not implemented at all because they > don't know the PostgreSQL protocol, they simply forward requests. > > In the HTTP reverse proxies world, there's a "dirty hack" to identify > the source IP address: add an HTTP header "X-Forwared-For" to the > request. It's the destination duty to do whatever they want with this > information. With this feature in mind, someone from HAProxy has > implemented this mechanism at the protocol level. It's called the PROXY > protocol. > > With this piece of logic at the beginning of the protocol, we could > implement a totally transparent proxy and benefit from the great > features of PostgreSQL regarding clients. Note that MariaDB support the > PROXY protocol in MaxScale (proxy) and MariaDB Server in recent > versions. > > My question is, what do you think of this feature? Is it worth to spend > time implementing it in PostgreSQL or not? > > Links: > - http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt > - https://mariadb.com/kb/en/library/proxy-protocol-support/ > > Thanks, > Julien > > PS: I've already sent this message to a wrong mailing list. Stephen > Frost said it's implemented in pgbouncer but all I can find is an open > issue: https://github.com/pgbouncer/pgbouncer/issues/241. > > Hi, From my point of view it will be better to support embedded connection pooler in Postgres. In this case all mentioned problems can be more or less straightforwardly solved without inventing new protocol. There is my prototype implementation of built-in connection pooler on commit-fest: https://commitfest.postgresql.org/23/2067/ -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
+1 on this one...
MySQL and derivatives support it very well.. it is a standard that can be used with either haproxy or better, ProxySQL.
Would be nice to have it in core.
It is a show stopper for us to use proxying because of compliance and tracability reasons.
Le dim. 19 mai 2019 11:36 AM, Julien Riou <julien@riou.xyz> a écrit :
Hello,
Nowadays, PostgreSQL is often used behind proxies. Some are PostgreSQL
protocol aware (Pgpool, PgBouncer), some are pure TCP (HAProxy). From
the database instance point of view, all clients come from the proxy.
There are two major problems with this topology:
* It neutralizes the host based authentication. Every client shares
the same source. Either we allow this source or not but we cannot allow
clients on a more fine-grained basis, or not by the IP address.
* It makes debugging harder. If we have a DDL or a slow query logged, we
cannot use the source to identify who is responsible.
On one hand, we can move the authentication and logging mechanisms to
PostgreSQL based proxies but they will never be as complete as
PostgreSQL itself. And they don't have features like HTTP health checks
to redirect trafic to nodes (health, role, whatever behind the URL). On
the other hand, those features are not implemented at all because they
don't know the PostgreSQL protocol, they simply forward requests.
In the HTTP reverse proxies world, there's a "dirty hack" to identify
the source IP address: add an HTTP header "X-Forwared-For" to the
request. It's the destination duty to do whatever they want with this
information. With this feature in mind, someone from HAProxy has
implemented this mechanism at the protocol level. It's called the PROXY
protocol.
With this piece of logic at the beginning of the protocol, we could
implement a totally transparent proxy and benefit from the great
features of PostgreSQL regarding clients. Note that MariaDB support the
PROXY protocol in MaxScale (proxy) and MariaDB Server in recent
versions.
My question is, what do you think of this feature? Is it worth to spend
time implementing it in PostgreSQL or not?
Links:
- http://www.haproxy.org/download/1.8/doc/proxy-protocol.txt
- https://mariadb.com/kb/en/library/proxy-protocol-support/
Thanks,
Julien
PS: I've already sent this message to a wrong mailing list. Stephen
Frost said it's implemented in pgbouncer but all I can find is an open
issue: https://github.com/pgbouncer/pgbouncer/issues/241.
On Tue, Mar 9, 2021 at 11:25 AM Magnus Hagander <magnus@hagander.net> wrote: > > On Sat, Mar 6, 2021 at 5:30 PM Magnus Hagander <magnus@hagander.net> wrote: > > > > On Sat, Mar 6, 2021 at 4:17 PM Magnus Hagander <magnus@hagander.net> wrote: > > > > > > On Fri, Mar 5, 2021 at 8:11 PM Jacob Champion <pchampion@vmware.com> wrote: > > > > > > > > On Fri, 2021-03-05 at 10:22 +0100, Magnus Hagander wrote: > > > > > On Fri, Mar 5, 2021 at 12:21 AM Jacob Champion <pchampion@vmware.com> wrote: > > > > > > The original-host logging isn't working for me: > > > > > > > > > > > > [...] > > > > > > > > > > That's interesting -- it works perfectly fine here. What platform are > > > > > you testing on? > > > > > > > > Ubuntu 20.04. > > > > > > Curious. It doesn't show up on my debian. > > > > > > But either way -- it was clearly wrong :) > > > > > > > > > > > (I sent for sizeof(SockAddr) to make it > > > > > easier to read without having to look things up, but the net result is > > > > > the same) > > > > > > > > Cool. Did you mean to attach a patch? > > > > > > I didn't, I had some other hacks that were broken :) I've attached one > > > now which includes those changes. > > > > > > > > > > == More Notes == > > > > > > > > (Stop me if I'm digging too far into a proof of concept patch.) > > > > > > Definitely not -- much appreciated, and just what was needed to take > > > it from poc to a proper one! > > > > > > > > > > > + proxyaddrlen = pg_ntoh16(proxyheader.len); > > > > > + > > > > > + if (proxyaddrlen > sizeof(proxyaddr)) > > > > > + { > > > > > + ereport(COMMERROR, > > > > > + (errcode(ERRCODE_PROTOCOL_VIOLATION), > > > > > + errmsg("oversized proxy packet"))); > > > > > + return STATUS_ERROR; > > > > > + } > > > > > > > > I think this is not quite right -- if there's additional data beyond > > > > the IPv6 header size, that just means there are TLVs tacked onto the > > > > header that we should ignore. (Or, eventually, use.) > > > > > > Yeah, you're right. Fallout of too much moving around. I think inthe > > > end that code should just be removed, in favor of the discard path as > > > you mentinoed below. > > > > > > > > > > Additionally, we need to check for underflow as well. A misbehaving > > > > proxy might not send enough data to fill up the address block for the > > > > address family in use. > > > > > > I used to have that check. I seem to have lost it in restructuring. Added back! > > > > > > > > > > > + /* If there is any more header data present, skip past it */ > > > > > + if (proxyaddrlen > sizeof(proxyaddr)) > > > > > + pq_discardbytes(proxyaddrlen - sizeof(proxyaddr)); > > > > > > > > This looks like dead code, given that we'll error out for the same > > > > check above -- but once it's no longer dead code, the return value of > > > > pq_discardbytes should be checked for EOF. > > > > > > Yup. > > > > > > > > > > > + else if (proxyheader.fam == 0x11) > > > > > + { > > > > > + /* TCPv4 */ > > > > > + port->raddr.addr.ss_family = AF_INET; > > > > > + port->raddr.salen = sizeof(struct sockaddr_in); > > > > > + ((struct sockaddr_in *) &port->raddr.addr)->sin_addr.s_addr = proxyaddr.ip4.src_addr; > > > > > + ((struct sockaddr_in *) &port->raddr.addr)->sin_port = proxyaddr.ip4.src_port; > > > > > + } > > > > > > > > I'm trying to reason through the fallout of setting raddr and not > > > > laddr. I understand why we're not setting laddr -- several places in > > > > the code rely on the laddr to actually refer to a machine-local address > > > > -- but the fact that there is no actual connection from raddr to laddr > > > > could cause shenanigans. For example, the ident auth protocol will just > > > > break (and it might be nice to explicitly disable it for PROXY > > > > connections). Are there any other situations where a "faked" raddr > > > > could throw off Postgres internals? > > > > > > That's a good point to discuss. I thought about it initially and > > > figured it'd be even worse to actually copy over laddr since that > > > woudl then suddenly have the IP address belonging to a different > > > machine.. And then I forgot to enumerate the other cases. > > > > > > For ident, disabling the method seems reasonable. > > > > > > Another thing that shows up with added support for running the proxy > > > protocol over Unix sockets, is that PostgreSQL refuses to do SSL over > > > Unix sockets. So that check has to be updated to allow it over proxy > > > connections. Same for GSSAPI. > > > > > > An interesting thing is what to do about > > > inet_server_addr/inet_server_port. That sort of loops back up to the > > > original question of where/how to expose the information about the > > > proxy in general (since right now it just logs). Right now you can > > > actually use inet_server_port() to see if the connection was proxied > > > (as long as it was over tcp). > > > > > > Attached is an updated, which covers your comments, as well as adds > > > unix socket support (per your question and Alvaros confirmed usecase). > > > It allows proxy connections over unix sockets, but I saw no need to > > > get into unix sockets over the proxy protocol (dealing with paths > > > between machines etc). > > > > > > I changed the additional ListenSocket array to instead declare > > > ListenSocket as an array of structs holding two fields. Seems cleaner, > > > and especially should there be further extensions needed in the > > > future. > > > > > > I've also added some trivial tests (man that took an ungodly amount of > > > fighting perl -- it's clearly been a long time since I used perl > > > properly). They probably need some more love but it's a start. > > > > > > And of course rebased. > > > > Pfft, I was hoping for cfbot to pick it up and test it on a different > > platform. Of course, for it to do that, I need to include the test > > directory in the Makefile. Here's a new one which adds that, no other > > changes. > > So cfbot didn't like thato ne one bit. Turns out that it's not a great > idea to hardcode the username "mha" in the tests :) > > And also changed to only use unix sockets for the tests on linux, and > tcp only on windows. Because that's how our tests are supposed to be. PFA a rebase to make cfbot happy. There's another set or review notes from Jacob on March 11, that I will also address, but it's not included in this version. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Attachment
On Thu, Mar 11, 2021 at 12:05 AM Jacob Champion <pchampion@vmware.com> wrote: > > On Tue, 2021-03-09 at 11:25 +0100, Magnus Hagander wrote: > > I've also added some trivial tests (man that took an ungodly amount of > > fighting perl -- it's clearly been a long time since I used perl > > properly). > > Yeah. The tests I'm writing for this and NSS have been the same way; > it's a real problem. I'm basically writing supplemental tests in Python > as the "daily driver", then trying to port whatever is easiest (not > much) into Perl, when I get time. > > == More Notes == > > Some additional spec-compliance stuff: > > > /* Lower 4 bits hold type of connection */ > > if (proxyheader.fam == 0) > > { > > /* LOCAL connection, so we ignore the address included */ > > } > > (fam == 0) is the UNSPEC case, which isn't necessarily LOCAL. We have > to do something different for the LOCAL case: Oh ugh. yeah, and the comment is wrong too -- it got the "command" confused with "connection family". Too many copy/paste I think. > > - \x0 : LOCAL : [...] The receiver must accept this connection as > > valid and must use the real connection endpoints and discard the > > protocol block including the family which is ignored. > > So we should ignore the entire "protocol block" (by which I believe > they mean the protocol-and-address-family byte) in the case of LOCAL, > and just accept it with the original address info intact. That seems to > match the sample code in the back of the spec. The current behavior in > the patch will apply the PROXY behavior incorrectly if the sender sends > a LOCAL header with something other than UNSPEC -- which is strange > behavior but not explicitly prohibited as far as I can see. Yeah, I think we do the right thing in the "right usecase". > We also need to reject all connections that aren't either LOCAL or > PROXY commands: Indeed. > > - other values are unassigned and must not be emitted by senders. > > Receivers must drop connections presenting unexpected values here. > > ...and naturally it'd be Nice (tm) if the tests covered those corner > cases. I think that's covered in the attached update. > Over on the struct side: > > > + struct > > + { /* for TCP/UDP over IPv4, len = 12 */ > > + uint32 src_addr; > > + uint32 dst_addr; > > + uint16 src_port; > > + uint16 dst_port; > > + } ip4; > > ... snip ... > > + /* TCPv4 */ > > + if (proxyaddrlen < 12) > > + { > > Given the importance of these hardcoded lengths matching reality, is it > possible to add some static assertions to make sure that sizeof(<ipv4 > block>) == 12 and so on? That would also save any poor souls who are > using compilers with nonstandard struct-packing behavior. Yeah, probably makes sense. Added. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Attachment
Hi Magnus, I'm only just starting to page this back into my head, so this is by no means a full review of the v7 changes -- just stuff I've noticed over the last day or so of poking around. On Tue, 2021-06-29 at 11:48 +0200, Magnus Hagander wrote: > On Thu, Mar 11, 2021 at 12:05 AM Jacob Champion <pchampion@vmware.com> wrote: > > On Tue, 2021-03-09 at 11:25 +0100, Magnus Hagander wrote: > > > - \x0 : LOCAL : [...] The receiver must accept this connection as > > > valid and must use the real connection endpoints and discard the > > > protocol block including the family which is ignored. > > > > So we should ignore the entire "protocol block" (by which I believe > > they mean the protocol-and-address-family byte) in the case of LOCAL, > > and just accept it with the original address info intact. That seems to > > match the sample code in the back of the spec. The current behavior in > > the patch will apply the PROXY behavior incorrectly if the sender sends > > a LOCAL header with something other than UNSPEC -- which is strange > > behavior but not explicitly prohibited as far as I can see. > > Yeah, I think we do the right thing in the "right usecase". The current implementation is, I think, stricter than the spec asks for. We're supposed to ignore the family for LOCAL cases, and it's not clear to me whether we're supposed to also ignore the entire "fam" family-and-protocol byte (the phrase "protocol block" is not actually defined in the spec). It's probably not a big deal in practice, but it could mess with interoperability for lazier proxy implementations. I think I'll ask the HAProxy folks for some clarification tomorrow. > + <term><varname>proxy_servers</varname> (<type>string</type>) > + <indexterm> > + <primary><varname>proxy_servers</varname> configuration parameter</primary> > + </indexterm> > + </term> > + <listitem> > + <para> > + A comma separated list of one or more host names, cidr specifications or the > + literal <literal>unix</literal>, indicating which proxy servers to trust when > + connecting on the port specified in <xref linkend="guc-proxy-port" />. The documentation mentions that host names are valid in proxy_servers, but check_proxy_servers() uses the AI_NUMERICHOST hint with getaddrinfo(), so host names get rejected. > + GUC_check_errdetail("Invalid IP addrress %s", tok); s/addrress/address/ I've been thinking more about your earlier comment: > An interesting thing is what to do about > inet_server_addr/inet_server_port. That sort of loops back up to the > original question of where/how to expose the information about the > proxy in general (since right now it just logs). Right now you can > actually use inet_server_port() to see if the connection was proxied > (as long as it was over tcp). IMO these should return the "virtual" dst_addr/port, instead of exposing the physical connection information to the client. That way, if you intend for your proxy to be transparent, you're not advertising your network internals to connected clients. It also means that clients can reasonably expect to be able to reconnect to the addr:port that we give them, and prevents confusion if the proxy is using an address family that the client doesn't even support (e.g. the client is IPv4- only but the proxy connects via IPv6). --Jacob
On Fri, Jul 9, 2021 at 1:42 AM Jacob Champion <pchampion@vmware.com> wrote: > > Hi Magnus, > > I'm only just starting to page this back into my head, so this is by no > means a full review of the v7 changes -- just stuff I've noticed over > the last day or so of poking around. > > On Tue, 2021-06-29 at 11:48 +0200, Magnus Hagander wrote: > > On Thu, Mar 11, 2021 at 12:05 AM Jacob Champion <pchampion@vmware.com> wrote: > > > On Tue, 2021-03-09 at 11:25 +0100, Magnus Hagander wrote: > > > > - \x0 : LOCAL : [...] The receiver must accept this connection as > > > > valid and must use the real connection endpoints and discard the > > > > protocol block including the family which is ignored. > > > > > > So we should ignore the entire "protocol block" (by which I believe > > > they mean the protocol-and-address-family byte) in the case of LOCAL, > > > and just accept it with the original address info intact. That seems to > > > match the sample code in the back of the spec. The current behavior in > > > the patch will apply the PROXY behavior incorrectly if the sender sends > > > a LOCAL header with something other than UNSPEC -- which is strange > > > behavior but not explicitly prohibited as far as I can see. > > > > Yeah, I think we do the right thing in the "right usecase". > > The current implementation is, I think, stricter than the spec asks > for. We're supposed to ignore the family for LOCAL cases, and it's not > clear to me whether we're supposed to also ignore the entire "fam" > family-and-protocol byte (the phrase "protocol block" is not actually > defined in the spec). > > It's probably not a big deal in practice, but it could mess with > interoperability for lazier proxy implementations. I think I'll ask the > HAProxy folks for some clarification tomorrow. Thanks! Yeah, I have no problem being stricter than necessary, unless that actually causes any interop problems. It's a lot worse to not be strict enough.. > > + <term><varname>proxy_servers</varname> (<type>string</type>) > > + <indexterm> > > + <primary><varname>proxy_servers</varname> configuration parameter</primary> > > + </indexterm> > > + </term> > > + <listitem> > > + <para> > > + A comma separated list of one or more host names, cidr specifications or the > > + literal <literal>unix</literal>, indicating which proxy servers to trust when > > + connecting on the port specified in <xref linkend="guc-proxy-port" />. > > The documentation mentions that host names are valid in proxy_servers, > but check_proxy_servers() uses the AI_NUMERICHOST hint with > getaddrinfo(), so host names get rejected. Ah, good point. Should say "ip addresses". > > > + GUC_check_errdetail("Invalid IP addrress %s", tok); > > s/addrress/address/ Oops. > I've been thinking more about your earlier comment: > > > An interesting thing is what to do about > > inet_server_addr/inet_server_port. That sort of loops back up to the > > original question of where/how to expose the information about the > > proxy in general (since right now it just logs). Right now you can > > actually use inet_server_port() to see if the connection was proxied > > (as long as it was over tcp). > > IMO these should return the "virtual" dst_addr/port, instead of > exposing the physical connection information to the client. That way, > if you intend for your proxy to be transparent, you're not advertising > your network internals to connected clients. It also means that clients > can reasonably expect to be able to reconnect to the addr:port that we > give them, and prevents confusion if the proxy is using an address > family that the client doesn't even support (e.g. the client is IPv4- > only but the proxy connects via IPv6). That reasoning I think makes a lot of sense, especially with the comment about being able to connect back to it. The question at that point extends to, would we also add extra functions to get the data on the proxy connection itself? Maybe add a inet_proxy_addr()/inet_proxy_port()? Better names? PFA a patch that fixes the above errors, and changes inet_server_addr()/inet_server_port(). Does not yet add anything to receive the actual local port in this case. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Attachment
On Mon, 2021-07-12 at 18:28 +0200, Magnus Hagander wrote: > Yeah, I have no problem being stricter than necessary, unless that > actually causes any interop problems. It's a lot worse to not be > strict enough.. Agreed. Haven't heard back from the HAProxy mailing list yet, so staying strict seems reasonable in the meantime. That could always be rolled back later. > > I've been thinking more about your earlier comment: > > > > > An interesting thing is what to do about > > > inet_server_addr/inet_server_port. That sort of loops back up to the > > > original question of where/how to expose the information about the > > > proxy in general (since right now it just logs). Right now you can > > > actually use inet_server_port() to see if the connection was proxied > > > (as long as it was over tcp). > > > > IMO these should return the "virtual" dst_addr/port, instead of > > exposing the physical connection information to the client. That way, > > if you intend for your proxy to be transparent, you're not advertising > > your network internals to connected clients. It also means that clients > > can reasonably expect to be able to reconnect to the addr:port that we > > give them, and prevents confusion if the proxy is using an address > > family that the client doesn't even support (e.g. the client is IPv4- > > only but the proxy connects via IPv6). > > That reasoning I think makes a lot of sense, especially with the > comment about being able to connect back to it. > > The question at that point extends to, would we also add extra > functions to get the data on the proxy connection itself? Maybe add a > inet_proxy_addr()/inet_proxy_port()? Better names? What's the intended use case? I have trouble viewing those as anything but information disclosure vectors, but I'm jaded. :) If the goal is to give a last-ditch debugging tool to someone whose proxy isn't behaving properly -- though I'd hope the proxy in question has its own ways to debug its behavior -- maybe they could be locked behind one of the pg_monitor roles, so that they're only available to someone who could get that information anyway? > PFA a patch that fixes the above errors, and changes > inet_server_addr()/inet_server_port(). Does not yet add anything to > receive the actual local port in this case. Looking good in local testing. I'm going to reread the spec with fresh eyes and do a full review pass, but this is shaping up nicely IMO. Something that I haven't thought about very hard yet is proxy authentication, but I think the simple IP authentication will be enough for a first version. For the Unix socket case, it looks like anyone currently relying on peer auth will need to switch to a unix_socket_group/_permissions model. For now, that sounds like a reasonable v1 restriction, though I think not being able to set the proxy socket's permissions separately from the "normal" one might lead to some complications in more advanced setups. --Jacob
On Wed, Jul 14, 2021 at 8:24 PM Jacob Champion <pchampion@vmware.com> wrote: > > On Mon, 2021-07-12 at 18:28 +0200, Magnus Hagander wrote: > > Yeah, I have no problem being stricter than necessary, unless that > > actually causes any interop problems. It's a lot worse to not be > > strict enough.. > > Agreed. Haven't heard back from the HAProxy mailing list yet, so > staying strict seems reasonable in the meantime. That could always be > rolled back later. Any further feedback from them now, two months later? :) (Sorry, I was out on vacation for the end of the last CF, so didn't get around to this one, but it seemed there'd be plenty of time in this CF) > > > I've been thinking more about your earlier comment: > > > > > > > An interesting thing is what to do about > > > > inet_server_addr/inet_server_port. That sort of loops back up to the > > > > original question of where/how to expose the information about the > > > > proxy in general (since right now it just logs). Right now you can > > > > actually use inet_server_port() to see if the connection was proxied > > > > (as long as it was over tcp). > > > > > > IMO these should return the "virtual" dst_addr/port, instead of > > > exposing the physical connection information to the client. That way, > > > if you intend for your proxy to be transparent, you're not advertising > > > your network internals to connected clients. It also means that clients > > > can reasonably expect to be able to reconnect to the addr:port that we > > > give them, and prevents confusion if the proxy is using an address > > > family that the client doesn't even support (e.g. the client is IPv4- > > > only but the proxy connects via IPv6). > > > > That reasoning I think makes a lot of sense, especially with the > > comment about being able to connect back to it. > > > > The question at that point extends to, would we also add extra > > functions to get the data on the proxy connection itself? Maybe add a > > inet_proxy_addr()/inet_proxy_port()? Better names? > > What's the intended use case? I have trouble viewing those as anything > but information disclosure vectors, but I'm jaded. :) "Covering all the bases"? I'm not entirely sure what the point is of the *existing* functions for that though, so I'm definitely not wedded to including it. > If the goal is to give a last-ditch debugging tool to someone whose > proxy isn't behaving properly -- though I'd hope the proxy in question > has its own ways to debug its behavior -- maybe they could be locked > behind one of the pg_monitor roles, so that they're only available to > someone who could get that information anyway? Yeah, agreed. > > PFA a patch that fixes the above errors, and changes > > inet_server_addr()/inet_server_port(). Does not yet add anything to > > receive the actual local port in this case. > > Looking good in local testing. I'm going to reread the spec with fresh > eyes and do a full review pass, but this is shaping up nicely IMO. Thanks! > Something that I haven't thought about very hard yet is proxy > authentication, but I think the simple IP authentication will be enough > for a first version. For the Unix socket case, it looks like anyone > currently relying on peer auth will need to switch to a > unix_socket_group/_permissions model. For now, that sounds like a > reasonable v1 restriction, though I think not being able to set the > proxy socket's permissions separately from the "normal" one might lead > to some complications in more advanced setups. Agreed in principle, but I think those are some quite uncommon usecases, so definitely something we don't need to cover in a first feature. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
On Tue, 2021-09-07 at 12:24 +0200, Magnus Hagander wrote: > On Wed, Jul 14, 2021 at 8:24 PM Jacob Champion <pchampion@vmware.com> wrote: > > On Mon, 2021-07-12 at 18:28 +0200, Magnus Hagander wrote: > > > Yeah, I have no problem being stricter than necessary, unless that > > > actually causes any interop problems. It's a lot worse to not be > > > strict enough.. > > > > Agreed. Haven't heard back from the HAProxy mailing list yet, so > > staying strict seems reasonable in the meantime. That could always be > > rolled back later. > > Any further feedback from them now, two months later? :) Not yet :( I've bumped the thread; in the meantime I still think the stricter operation is fine, since in the worst case you just make it less strict in the future. > (Sorry, I was out on vacation for the end of the last CF, so didn't > get around to this one, but it seemed there'd be plenty of time in > this CF) No worries! > > > The question at that point extends to, would we also add extra > > > functions to get the data on the proxy connection itself? Maybe add a > > > inet_proxy_addr()/inet_proxy_port()? Better names? > > > > What's the intended use case? I have trouble viewing those as anything > > but information disclosure vectors, but I'm jaded. :) > > "Covering all the bases"? > > I'm not entirely sure what the point is of the *existing* functions > for that though, so I'm definitely not wedded to including it. I guess I'm in the same boat. I'm probably not the right person to weigh in. > > Looking good in local testing. I'm going to reread the spec with fresh > > eyes and do a full review pass, but this is shaping up nicely IMO. > > Thanks! I still owe you that overall review. Hoping to get to it this week. > > Something that I haven't thought about very hard yet is proxy > > authentication, but I think the simple IP authentication will be enough > > for a first version. For the Unix socket case, it looks like anyone > > currently relying on peer auth will need to switch to a > > unix_socket_group/_permissions model. For now, that sounds like a > > reasonable v1 restriction, though I think not being able to set the > > proxy socket's permissions separately from the "normal" one might lead > > to some complications in more advanced setups. > > Agreed in principle, but I think those are some quite uncommon > usecases, so definitely something we don't need to cover in a first > feature. Hm. I guess I'm overly optimistic that "properly securing your database" is not such an uncommon case, but... :) --Jacob
On Wed, 2021-09-08 at 18:51 +0000, Jacob Champion wrote: > I still owe you that overall review. Hoping to get to it this week. And here it is. I focused on things other than UnwrapProxyConnection() for this round, since I think that piece is looking solid. > + if (port->isProxy) > + { > + ereport(LOG, > + (errcode_for_socket_access(), > + errmsg("Ident authentication cannot be used over PROXY connections"))); What are the rules on COMMERROR vs LOG when dealing with authentication code? I always thought COMMERROR was required, but I see now that LOG (among others) is suppressed to the client during authentication. > #ifdef USE_SSL > /* No SSL when disabled or on Unix sockets */ > - if (!LoadedSSL || IS_AF_UNIX(port->laddr.addr.ss_family)) > + if (!LoadedSSL || (IS_AF_UNIX(port->laddr.addr.ss_family) && !port->isProxy)) > SSLok = 'N'; > else > SSLok = 'S'; /* Support for SSL */ > @@ -2087,7 +2414,7 @@ retry1: > > #ifdef ENABLE_GSS > /* No GSSAPI encryption when on Unix socket */ > - if (!IS_AF_UNIX(port->laddr.addr.ss_family)) > + if (!IS_AF_UNIX(port->laddr.addr.ss_family) || port->isProxy) > GSSok = 'G'; Now that we have port->daddr, could these checks be simplified to just IS_AF_UNIX(port->daddr...)? Or is there a corner case I'm missing for the port->isProxy case? > + * Note: AuthenticationTimeout is applied here while waiting for the > + * startup packet, and then again in InitPostgres for the duration of any > + * authentication operations. So a hostile client could tie up the > + * process for nearly twice AuthenticationTimeout before we kick him off. This comment needs to be adjusted after the move; waiting for the startup packet comes later, and it looks like we can now tie up 3x the timeout for the proxy case. > + /* Check if this is a proxy connection and if so unwrap the proxying */ > + if (port->isProxy) > + { > + enable_timeout_after(STARTUP_PACKET_TIMEOUT, AuthenticationTimeout * 1000); > + if (UnwrapProxyConnection(port) != STATUS_OK) > + proc_exit(0); I think the timeout here could comfortably be substantially less than the overall authentication timeout, since the proxy should send its header immediately even if the client takes its time with the startup packet. The spec suggests allowing 3 seconds minimum to cover a retransmission. Maybe something to tune in the future? > + /* Also listen to the PROXY port on this address, if configured */ > + if (ProxyPortNumber) > + { > + if (strcmp(curhost, "*") == 0) > + socket = StreamServerPort(AF_UNSPEC, NULL, > + (unsigned short) ProxyPortNumber, > + NULL, > + ListenSocket, MAXLISTEN); Sorry if you already talked about this upthread somewhere, but it looks like another downside of treating "proxy mode" as a server-wide on/off switch is that it cuts the effective MAXLISTEN in half, from 64 to 32, since we're opening two sockets for every address. If I've understood that correctly, it might be worth mentioning in the docs. > - if (!success && elemlist != NIL) > + if (socket == NULL && elemlist != NIL) > ereport(FATAL, > (errmsg("could not create any TCP/IP sockets"))); With this change in PostmasterMain, it looks like `success` is no longer a useful variable. But I'm not convinced that this is the correct logic -- this is just checking to see if the last socket creation succeeded, as opposed to seeing if any of them succeeded. Is that what you intended? > +plan tests => 25; > + > +my $node = get_new_node('node'); The TAP test will need to be rebased over the changes in 201a76183e. --Jacob
On Fri, Sep 10, 2021 at 1:44 AM Jacob Champion <pchampion@vmware.com> wrote: > > On Wed, 2021-09-08 at 18:51 +0000, Jacob Champion wrote: > > I still owe you that overall review. Hoping to get to it this week. > > And here it is. I focused on things other than UnwrapProxyConnection() > for this round, since I think that piece is looking solid. Thanks! > > + if (port->isProxy) > > + { > > + ereport(LOG, > > + (errcode_for_socket_access(), > > + errmsg("Ident authentication cannot be used over PROXY connections"))); > > What are the rules on COMMERROR vs LOG when dealing with authentication > code? I always thought COMMERROR was required, but I see now that LOG > (among others) is suppressed to the client during authentication. I honestly don't know :) In this case, LOG is what's used for all the other message in errors in ident_inet(), so I picked it for consistency. > > #ifdef USE_SSL > > /* No SSL when disabled or on Unix sockets */ > > - if (!LoadedSSL || IS_AF_UNIX(port->laddr.addr.ss_family)) > > + if (!LoadedSSL || (IS_AF_UNIX(port->laddr.addr.ss_family) && !port->isProxy)) > > SSLok = 'N'; > > else > > SSLok = 'S'; /* Support for SSL */ > > @@ -2087,7 +2414,7 @@ retry1: > > > > #ifdef ENABLE_GSS > > /* No GSSAPI encryption when on Unix socket */ > > - if (!IS_AF_UNIX(port->laddr.addr.ss_family)) > > + if (!IS_AF_UNIX(port->laddr.addr.ss_family) || port->isProxy) > > GSSok = 'G'; > > Now that we have port->daddr, could these checks be simplified to just > IS_AF_UNIX(port->daddr...)? Or is there a corner case I'm missing for > the port->isProxy case? Yeah, I think they could. > > + * Note: AuthenticationTimeout is applied here while waiting for the > > + * startup packet, and then again in InitPostgres for the duration of any > > + * authentication operations. So a hostile client could tie up the > > + * process for nearly twice AuthenticationTimeout before we kick him off. > > This comment needs to be adjusted after the move; waiting for the > startup packet comes later, and it looks like we can now tie up 3x the > timeout for the proxy case. Good point. > > + /* Check if this is a proxy connection and if so unwrap the proxying */ > > + if (port->isProxy) > > + { > > + enable_timeout_after(STARTUP_PACKET_TIMEOUT, AuthenticationTimeout * 1000); > > + if (UnwrapProxyConnection(port) != STATUS_OK) > > + proc_exit(0); > > I think the timeout here could comfortably be substantially less than > the overall authentication timeout, since the proxy should send its > header immediately even if the client takes its time with the startup > packet. The spec suggests allowing 3 seconds minimum to cover a > retransmission. Maybe something to tune in the future? Maybe. I'll leave it with a new comment for now about us diong it, and that we may want to consider igt in the future. > > > + /* Also listen to the PROXY port on this address, if configured */ > > + if (ProxyPortNumber) > > + { > > + if (strcmp(curhost, "*") == 0) > > + socket = StreamServerPort(AF_UNSPEC, NULL, > > + (unsigned short) ProxyPortNumber, > > + NULL, > > + ListenSocket, MAXLISTEN); > > Sorry if you already talked about this upthread somewhere, but it looks > like another downside of treating "proxy mode" as a server-wide on/off > switch is that it cuts the effective MAXLISTEN in half, from 64 to 32, > since we're opening two sockets for every address. If I've understood > that correctly, it might be worth mentioning in the docs. Correct. I don't see a way to avoid that without complicating things (as long as we want the ports to be separate), but I also don't see it as something that's reality to be an issue in reality. I would agree with documenting it, but I can't actually find us documenting the MAXLISTEN value anywhere. Do we? > > - if (!success && elemlist != NIL) > > + if (socket == NULL && elemlist != NIL) > > ereport(FATAL, > > (errmsg("could not create any TCP/IP sockets"))); > > With this change in PostmasterMain, it looks like `success` is no > longer a useful variable. But I'm not convinced that this is the > correct logic -- this is just checking to see if the last socket > creation succeeded, as opposed to seeing if any of them succeeded. Is > that what you intended? Eh, no, that's clearly a code-moving-bug. I think the reasonable thing is to succeed if we create either a regular socket *or* a proxy one, but FATAL out if you configured either of them but we failed co create any. > > +plan tests => 25; > > + > > +my $node = get_new_node('node'); > > The TAP test will need to be rebased over the changes in 201a76183e. Done, and adjustments above according to your comments, along with a small docs fix "a proxy connection is done" -> "a proxy connection is made". -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Attachment
> On 28 Sep 2021, at 15:23, Magnus Hagander <magnus@hagander.net> wrote: > On Fri, Sep 10, 2021 at 1:44 AM Jacob Champion <pchampion@vmware.com> wrote: >> The TAP test will need to be rebased over the changes in 201a76183e. > > Done And now the TAP test will need to be rebased over the changes in b3b4d8e68ae83f432f43f035c7eb481ef93e1583. -- Daniel Gustafsson https://vmware.com/
On Wed, Nov 3, 2021 at 2:36 PM Daniel Gustafsson <daniel@yesql.se> wrote:
> On 28 Sep 2021, at 15:23, Magnus Hagander <magnus@hagander.net> wrote:
> On Fri, Sep 10, 2021 at 1:44 AM Jacob Champion <pchampion@vmware.com> wrote:
>> The TAP test will need to be rebased over the changes in 201a76183e.
>
> Done
And now the TAP test will need to be rebased over the changes in
b3b4d8e68ae83f432f43f035c7eb481ef93e1583.
Thanks for the pointer, PFA a rebase.
Attachment
On Thu, 2021-11-04 at 12:03 +0100, Magnus Hagander wrote: > Thanks for the pointer, PFA a rebase. I think the Unix socket handling needs the same "success" fix that you applied to the TCP socket handling above it: > @@ -1328,9 +1364,23 @@ PostmasterMain(int argc, char *argv[]) > ereport(WARNING, > (errmsg("could not create Unix-domain socket in directory \"%s\"", > socketdir))); > + > + if (ProxyPortNumber) > + { > + socket = StreamServerPort(AF_UNIX, NULL, > + (unsigned short) ProxyPortNumber, > + socketdir, > + ListenSocket, MAXLISTEN); > + if (socket) > + socket->isProxy = true; > + else > + ereport(WARNING, > + (errmsg("could not create Unix-domain PROXY socket for \"%s\"", > + socketdir))); > + } > } > > - if (!success && elemlist != NIL) > + if (socket == NULL && elemlist != NIL) > ereport(FATAL, > (errmsg("could not create any Unix-domain sockets"))); Other than that, I can find nothing else to improve, and I think this is ready for more eyes than mine. :) -- To tie off some loose ends from upthread: I didn't find any MAXLISTEN documentation either, so I guess it's only a documentation issue if someone runs into it, heh. I was not able to find any other cases (besides ident) where using daddr instead of laddr would break things. I am going a bit snow-blind on the patch, though, and there's a lot of auth code. I never did hear back from the PROXY spec maintainer on how strict to be with LOCAL; another contributor did chime in but only to add that they didn't know the answer. That conversation is at [1], in case someone picks it up in the future. A summary of possible improvements talked about upthread, for a future v2: - SQL functions to get the laddr info (scoped to superusers, somehow), if there's a use case for them - Setting up PROXY Unix socket permissions separately from the "main" socket - Allowing PROXY-only communication (disable the "main" port) Thanks, --Jacob [1] https://www.mail-archive.com/haproxy@formilux.org/msg40899.html
On Tue, Nov 16, 2021 at 12:03 AM Jacob Champion <pchampion@vmware.com> wrote: > > On Thu, 2021-11-04 at 12:03 +0100, Magnus Hagander wrote: > > Thanks for the pointer, PFA a rebase. > > I think the Unix socket handling needs the same "success" fix that you > applied to the TCP socket handling above it: > > > @@ -1328,9 +1364,23 @@ PostmasterMain(int argc, char *argv[]) > > ereport(WARNING, > > (errmsg("could not create Unix-domain socket in directory \"%s\"", > > socketdir))); > > + > > + if (ProxyPortNumber) > > + { > > + socket = StreamServerPort(AF_UNIX, NULL, > > + (unsigned short) ProxyPortNumber, > > + socketdir, > > + ListenSocket, MAXLISTEN); > > + if (socket) > > + socket->isProxy = true; > > + else > > + ereport(WARNING, > > + (errmsg("could not create Unix-domain PROXY socket for \"%s\"", > > + socketdir))); > > + } > > } > > > > - if (!success && elemlist != NIL) > > + if (socket == NULL && elemlist != NIL) > > ereport(FATAL, > > (errmsg("could not create any Unix-domain sockets"))); > > Other than that, I can find nothing else to improve, and I think this > is ready for more eyes than mine. :) Here's another rebase on top of the AF_UNIX patch. > To tie off some loose ends from upthread: > > I didn't find any MAXLISTEN documentation either, so I guess it's only > a documentation issue if someone runs into it, heh. > > I was not able to find any other cases (besides ident) where using > daddr instead of laddr would break things. I am going a bit snow-blind > on the patch, though, and there's a lot of auth code. Yeah, that's definitely a good reason for more eyes on it. > A summary of possible improvements talked about upthread, for a future > v2: > > - SQL functions to get the laddr info (scoped to superusers, somehow), > if there's a use case for them > > - Setting up PROXY Unix socket permissions separately from the "main" > socket > > - Allowing PROXY-only communication (disable the "main" port) These all seem useful, but I'm liking the idea of putting them in a v2, to avoid expanding the scope too much. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Attachment
A general question on this feature: AFAICT, you can only send the proxy header once at the beginning of the connection. So this wouldn't be of use for PostgreSQL-protocol connection poolers (pgbouncer, pgpool), where the same server connection can be used for clients from different source addresses. Do I understand that correctly?
On Wed, Mar 9, 2022 at 5:23 PM Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > A general question on this feature: AFAICT, you can only send the proxy > header once at the beginning of the connection. So this wouldn't be of > use for PostgreSQL-protocol connection poolers (pgbouncer, pgpool), > where the same server connection can be used for clients from different > source addresses. Do I understand that correctly? Correct. It's only sent at connection startup, so if you're re-using the connection it would also re-use the IP address of the first one. For reusing the connection for multiple clients, you'd want something different, like a "priviliged mode in the tunnel" that the pooler can handle. -- Magnus Hagander Me: https://www.hagander.net/ Work: https://www.redpill-linpro.com/
Hi, I've been able to test the patch. Here is a recap of the experimentation. # Setup All tests have been done witch 3 VMs (PostgreSQL, HAproxy, psql client) on Debian 11 communicating over private network. * PostgreSQL have been built with proxy_protocol_11.patch applied on master branch (465ab24296). * psql client is from postgresql-client-13 from Debian 11 repository. * HAproxy version used is 2.5.5-1~bpo11+1 installed from https://haproxy.debian.net # Configuration PostgresSQL has been configured to listen only on its private IP. To enable proxy protocol support `proxy_port` has been configured to `5431` and `proxy_servers` to `10.0.0.0/24`. `log_connections` has been turned on to make sure the correct IP address is logged. `log_min_duration_statement` has been configured to 0 to log all queries. Finally `log_destination` has been configured to `csvlog`. pg_hba.conf is like this: local all all trust host all all 127.0.0.1/32 trust host all all ::1/128 trust local replication all trust host replication all 127.0.0.1/32 trust host replication all ::1/128 trust host all all 10.0.0.208/32 md5 Where 10.0.0.208 is the IP host the psql client's VM. HAproxy has two frontends, one for proxy protocol (port 5431) and one for regular TCP traffic. The configuration looks like this: listen postgresql bind 10.0.0.222:5432 server pg 10.0.0.253:5432 check listen postgresql_proxy bind 10.0.0.222:5431 server pg 10.0.0.253:5431 send-proxy-v2 Where 10.0.0.222 is the IP of HAproxy's VM and 10.0.0.253 is the IP of PostgreSQL's VM. # Tests * from psql's vm to haproxy on port 5432 (no proxy protocol) --> connection denied by pg_hba.conf, as expected * from psql's vm to postgresql's VM on port 5432 (no proxy protocol) --> connection success with psql's vm ip in logfile and pg_stat_activity * from psql's vm to postgresql's VM on port 5431 (proxy protocol) --> unable to open a connection, as expected * from psql's vm to haproxy on port 5431 (proxy protocol) --> connection success with psql's vm ip in logfile and pg_stat_activity I've also tested without proxy protocol enable (and pg_hba.conf updated accordingly), PostgreSQL behave as expected. # Conclusion From my point of view the documentation is clear enough and the feature works as expected.
On Sat, Apr 2, 2022 at 12:17 AM wilfried roset <wilfried.roset@gmail.com> wrote:
Hi,
I've been able to test the patch. Here is a recap of the experimentation.
# Setup
All tests have been done witch 3 VMs (PostgreSQL, HAproxy, psql client) on
Debian 11 communicating over private network.
* PostgreSQL have been built with proxy_protocol_11.patch applied on master branch (465ab24296).
* psql client is from postgresql-client-13 from Debian 11 repository.
* HAproxy version used is 2.5.5-1~bpo11+1 installed from https://haproxy.debian.net
# Configuration
PostgresSQL has been configured to listen only on its private IP. To enable
proxy protocol support `proxy_port` has been configured to `5431` and
`proxy_servers` to `10.0.0.0/24`. `log_connections` has been turned on to make
sure the correct IP address is logged. `log_min_duration_statement` has been
configured to 0 to log all queries. Finally `log_destination` has been
configured to `csvlog`.
pg_hba.conf is like this:
local all all trust
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
local replication all trust
host replication all 127.0.0.1/32 trust
host replication all ::1/128 trust
host all all 10.0.0.208/32 md5
Where 10.0.0.208 is the IP host the psql client's VM.
HAproxy has two frontends, one for proxy protocol (port 5431) and one for
regular TCP traffic. The configuration looks like this:
listen postgresql
bind 10.0.0.222:5432
server pg 10.0.0.253:5432 check
listen postgresql_proxy
bind 10.0.0.222:5431
server pg 10.0.0.253:5431 send-proxy-v2
Where 10.0.0.222 is the IP of HAproxy's VM and 10.0.0.253 is the IP of
PostgreSQL's VM.
# Tests
* from psql's vm to haproxy on port 5432 (no proxy protocol)
--> connection denied by pg_hba.conf, as expected
* from psql's vm to postgresql's VM on port 5432 (no proxy protocol)
--> connection success with psql's vm ip in logfile and pg_stat_activity
* from psql's vm to postgresql's VM on port 5431 (proxy protocol)
--> unable to open a connection, as expected
* from psql's vm to haproxy on port 5431 (proxy protocol)
--> connection success with psql's vm ip in logfile and pg_stat_activity
I've also tested without proxy protocol enable (and pg_hba.conf updated
accordingly), PostgreSQL behave as expected.
# Conclusion
From my point of view the documentation is clear enough and the feature works
as expected.
Hi!
Thanks for this review and testing!
I think it could do with at least noe more look-over at the source code level as well at this point though since it's been sitting around for a while, so it won't make it in for this deadline. But hopefully I can get it in early in the next cycle!
This needs a rebase, but after that I expect it to be RfC. --Jacob The new status of this patch is: Waiting on Author
On 7/28/22 22:05, Jacob Champion wrote: > This needs a rebase, but after that I expect it to be RfC. > > --Jacob > > The new status of this patch is: Waiting on Author Hello folks, Thank you all for this awesome work! I'm looking for this feature for years now. Last year, I've tried to rebase the patch without success. Unfortunately, this is out of my league. Magnus, please let me know if I can help. Have a nice day, Julien