Re: PROXY protocol support - Mailing list pgsql-hackers
From | Julien Riou |
---|---|
Subject | Re: PROXY protocol support |
Date | |
Msg-id | 45EE9893-5885-476E-85B6-F8F8E6AE9FA3@riou.xyz Whole thread Raw |
In response to | Re: PROXY protocol support (Stephen Frost <sfrost@snowman.net>) |
List | pgsql-hackers |
On May 19, 2019 5:59:04 PM GMT+02:00, Stephen Frost <sfrost@snowman.net> wrote: >Greetings, > >* Julien Riou (julien@riou.xyz) wrote: >> Nowadays, PostgreSQL is often used behind proxies. Some are >PostgreSQL >> protocol aware (Pgpool, PgBouncer), some are pure TCP (HAProxy). From >> the database instance point of view, all clients come from the proxy. >> >> There are two major problems with this topology: >> >> * It neutralizes the host based authentication. Every client shares >> the same source. Either we allow this source or not but we cannot >allow >> clients on a more fine-grained basis, or not by the IP address. > >You can instead have the IP-based checking done at the pooler. > >> * It makes debugging harder. If we have a DDL or a slow query logged, >we >> cannot use the source to identify who is responsible. > >Protocol-level poolers are able to do this, and pgbouncer does (see >application_name_add_host). > >> On one hand, we can move the authentication and logging mechanisms to >> PostgreSQL based proxies but they will never be as complete as >> PostgreSQL itself. And they don't have features like HTTP health >checks >> to redirect trafic to nodes (health, role, whatever behind the URL). >On >> the other hand, those features are not implemented at all because >they >> don't know the PostgreSQL protocol, they simply forward requests. >> >> In the HTTP reverse proxies world, there's a "dirty hack" to identify >> the source IP address: add an HTTP header "X-Forwared-For" to the >> request. It's the destination duty to do whatever they want with this >> information. With this feature in mind, someone from HAProxy has >> implemented this mechanism at the protocol level. It's called the >PROXY >> protocol. > >Someone from HAProxy could certainly implement something similar by >having HAProxy understand PostgreSQL's protocol. > >> With this piece of logic at the beginning of the protocol, we could >> implement a totally transparent proxy and benefit from the great >> features of PostgreSQL regarding clients. Note that MariaDB support >the >> PROXY protocol in MaxScale (proxy) and MariaDB Server in recent >> versions. > >pgbouncer is already a transparent proxy that understands the PG >protocol, and, even better, it has support for transaction-level >pooling >(as well as connection-level), which is really critical for larger PG >deployments as PG backend startup is (relatively) expensive. > >> PS: I've already sent this message to a wrong mailing list. Stephen >> Frost said it's implemented in pgbouncer but all I can find is an >open >> issue: https://github.com/pgbouncer/pgbouncer/issues/241. > >That would be some *other* proxy system (Amazon's ELB) that apparently >also doesn't understand the PG protocol and therefore doesn't have a >feature similar to pgbouncer's application_name_add_host. > >I haven't looked very closely at if it'd be possible to interpret the >PROXY protocol thing that Amazon's ELB can do without confusing it with >a regular PG authentication startup and I'm not sure if we'd really >want >to wed ourselves to something like that. Certainly, what pgbouncer >does >works quite well and is about as transparent to clients as possible. > >You'd almost certainly want something like pgbouncer after the ELB >anyway to avoid having tons of connections to PG and avoid spinning up >new backends constantly. > >Thanks, > >Stephen It could be proprietary Amazon load balancers I don't have experience with, or simple HAProxy coupled with a Patroni HTTPAPI to tell if a backend is healthy or not. The PgBouncer approach is interesting. I'm already using the application name as a workaround to identify containerized applicationsbut didn't used it for setting the source IP. If we take a look at the MariaDB implementation, they check for errors in the startup packet then run the PROXY protocoldecoding then return a real error if it doesn't work. As our bouncers are all behind a pool of HAProxy, and if weconsider PgBouncer as a trusted extension of PostgreSQL, maybe implementing it in PgBouncer first will be easier. Thanks for your insightful comments. Julien
pgsql-hackers by date: