Re: Built-in connection pooler - Mailing list pgsql-hackers

From: Antonin Houska
Subject: Re: Built-in connection pooler
Date:
Msg-id: 1915.1619604896@antos
In response to: Re: Built-in connection pooler (Konstantin Knizhnik <knizhnik@garret.ru>)
List: pgsql-hackers
Konstantin Knizhnik <knizhnik@garret.ru> wrote:

> People asked me to resubmit built-in connection pooler patch to commitfest.
> Rebased version of connection pooler is attached.

I've reviewed the patch but haven't read the entire thread thoroughly, so I
hope I don't duplicate many comments posted earlier by others.

(Please note that the patch does not apply to the current master; I had to
reset the head of my repository to the appropriate commit.)


Documentation / user interface
------------------------------

* session_pool_size (config.sgml)

I wonder if

    "The default value is 10, so up to 10 backends will serve each database,"

should rather be

    "The default value is 10, so up to 10 backends will serve each database/user combination."

However, when I read the code, it seems that each proxy counts the size of its
pool separately, so the total number of backends used for a particular
database/user combination appears to be

    session_pool_size * connection_proxies

Since the feature uses shared memory statistics anyway, shouldn't it only
count the total number of backends per database/user? It would need some
locking, but the actual pools (hash tables) could still be local to the proxy
processes.
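
To illustrate, a minimal sketch of the shared counter I have in mind; the
type and function names are made up, only the pg_atomic_* API exists:

    #include "postgres.h"
    #include "port/atomics.h"

    /* hypothetical shared memory entry, one per database/user combination */
    typedef struct BackendPoolCounter
    {
        Oid         dboid;
        Oid         roleoid;
        pg_atomic_uint32 nbackends; /* backends serving this combination */
    } BackendPoolCounter;

    /*
     * A proxy would call this before starting a new backend.  Returns true
     * if the cluster-wide (not per-proxy) limit has not been reached yet.
     */
    static bool
    pool_counter_try_inc(BackendPoolCounter *counter, uint32 limit)
    {
        uint32      old = pg_atomic_read_u32(&counter->nbackends);

        while (old < limit)
        {
            if (pg_atomic_compare_exchange_u32(&counter->nbackends,
                                               &old, old + 1))
                return true;    /* slot reserved, ok to fork the backend */
            /* the CAS failed and updated 'old', retry */
        }
        return false;           /* limit reached, reuse an existing backend */
    }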

* connection_proxies

(I've noticed that Ryan Lambert questioned this variable upthread.)

I think this variable makes the configuration less straightforward from the
user's perspective. Couldn't the server launch additional proxies dynamically
as needed, e.g. based on the shared memory statistics that the patch
introduces? I see that postmaster would have to send the sockets in a
different way. How about adding a "proxy launcher" process that would take
care of scheduling and launching new proxies?
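
A rough sketch of such a launcher's main loop, just to show what I mean
(the ProxyStatistics type, proxy_stats_snapshot() and PMSIGNAL_START_PROXY
are all made up; WaitLatch() and SendPostmasterSignal() exist):

    for (;;)
    {
        ProxyStatistics stats;

        /* read the shared memory statistics the patch already maintains */
        proxy_stats_snapshot(&stats);

        /* ask postmaster for another proxy once the existing ones saturate */
        if (stats.n_pending_clients > stats.n_proxies * clients_per_proxy)
            SendPostmasterSignal(PMSIGNAL_START_PROXY);

        WaitLatch(MyLatch, WL_LATCH_SET | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
                  1000L, PG_WAIT_EXTENSION);
        ResetLatch(MyLatch);
    }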

* multitenant_proxy

I thought the purpose of this setting was to reduce the number of backends
needed, but I could not find evidence of that in the code. In particular,
client_attach() always retrieves the backend from the appropriate pool, and
backend_reschedule() does so as well. Thus the roles of the client and the
backend should always match. What piece of information am I missing?
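
For the record, this is roughly what I expected to find somewhere in
backend_reschedule() - a hypothetical sketch, not code from the patch:

    /*
     * With multitenant_proxy enabled, the pool would be keyed by database
     * only, and a backend serving a client of another user would switch
     * its role first.
     */
    if (multitenant_proxy && backend->roleoid != client->roleoid)
    {
        /* e.g. by sending the equivalent of SET ROLE to the backend */
        backend_switch_role(backend, client->roleoid);  /* made up */
        backend->roleoid = client->roleoid;
    }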

* typo (2 occurrences in config.sgml)

  "stanalone" -> "standalone"


Design / coding
---------------

* proxy.c:backend_start() does not change the value of the "host" parameter
  to the socket directory, so I assume the proxy connects to the backend over
  TCP. I think the unix socket should be preferred for this connection if the
  platform has it (see the sketch after the next item); however:

* is libpq necessary for the proxy to connect to backend at all? Maybe
  postgres.c:ReadCommand() can be adjusted so that the backend can communicate
  with the proxy just via the plain socket.

  I don't like the idea of server components communicating via libpq as such
  (do we need anything from the libpq connection other than the socket?), but
  these includes in proxy.c look especially weird:

  #include "../interfaces/libpq/libpq-fe.h"
  #include "../interfaces/libpq/libpq-int.h"

* How does the proxy recognize connections to the walsender? I haven't tested
  that, but it's obvious that these connections should not be proxied.
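
  A hypothetical check while the proxy scans the startup packet parameters
  (the walsender announces itself via the "replication" parameter); the
  pass_through field and the surrounding loop are made up:

    for (int i = 0; i < nparams; i++)
    {
        if (strcmp(params[i].name, "replication") == 0)
        {
            /* value parsing omitted; "off" would still allow pooling */
            chan->pass_through = true;  /* hand the socket over 1:1 */
            break;
        }
    }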

* ConnectionProxyState is in shared memory, so access to its fields should be
  synchronized.
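
  One option is to declare the counters as atomics; the field names below
  are made up, the pg_atomic_* API exists:

    typedef struct ConnectionProxyState
    {
        pg_atomic_uint32 n_clients;     /* instead of a plain int */
        pg_atomic_uint32 n_backends;
        /* ... the remaining fields ... */
    } ConnectionProxyState;

    /* writer side (the proxy process) */
    pg_atomic_fetch_add_u32(&state->n_clients, 1);

    /* reader side, e.g. a monitoring function in another backend */
    uint32      n_clients = pg_atomic_read_u32(&state->n_clients);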

* StartConnectionProxies() is only called from PostmasterMain(), so I'm not
  sure the proxies get restarted after a crash. Perhaps
  PostmasterStateMachine() needs to call it too, after calling
  StartupDataBase().
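
  That is, something like this next to the existing crash-recovery code
  (a sketch only, I haven't verified that this is the right state
  transition):

    StartupPID = StartupDataBase();
    Assert(StartupPID != 0);
    StartupStatus = STARTUP_RUNNING;
    pmState = PM_STARTUP;
    /* the proxies died with the other children, so restart them too */
    StartConnectionProxies();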

* Why do you need the Channel.magic integer field? Wouldn't a boolean field
  "active" be sufficient? (See the sketch after the next item.)

  ** In proxy_loop(), I've noticed tests (chan->magic == ACTIVE_CHANNEL_MAGIC)
     inside the branch

     else if (chan->magic == ACTIVE_CHANNEL_MAGIC)

     Since neither channel_write() nor channel_read() seems to change the
     value, I think those tests are not necessary.
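
  Regarding the boolean field, I mean simply:

    typedef struct Channel
    {
        bool        active;     /* replaces the magic number */
        /* ... the remaining fields stay unchanged ... */
    } Channel;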

* Comment lines are often too long.

* pgindent should be applied to the patch at some point.


I can spend more time reviewing the patch during the next CF.

--
Antonin Houska
Web: https://www.cybertec-postgresql.com


