Re: Built-in connection pooling - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: Built-in connection pooling
Msg-id: 7bc1e831-54e6-b2db-f083-2d26e9ae0dd7@2ndquadrant.com
In response to: Re: Built-in connection pooling (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
List: pgsql-hackers
On 01/19/2018 05:17 PM, Konstantin Knizhnik wrote:
>
> On 19.01.2018 18:53, Tomas Vondra wrote:
>>
>> ...
>>
>> The questions I'm asking myself are mostly these:
>>
>> 1) When assigning a backend, we first try to get one from a pool, which
>> happens right at the beginning of BackendStartup. If we find a usable
>> backend, we send the info to the backend (pg_send_sock/pg_recv_sock).
>>
>> But AFAICS this only happens at connection time, right? But in your
>> initial message you say "Rescheduling is done at transaction level,"
>> which in my understanding means "transaction pooling". So, how does
>> that part work?
>
> Here it is:
>
> ChooseSession:
> ...
>

OK, thanks.

>>
>> 2) How does this deal with backends for different databases? I
>> don't see any checks that the requested database matches the
>> backend database (nor any code switching the backend from one db to
>> another - which would be very tricky, I think).
>
> As I wrote in the initial mail, this problem is not handled now.
> It is expected that all clients are connected to the same database
> using the same user.
> I only check and report an error if this assumption is violated.
> Definitely it should be fixed. And it is one of the main challenges
> with this approach! And I want to receive some advice from the
> community about the best ways of solving it.
> The problem is that we get information about database/user in the
> ProcessStartupPackage function in the backend, when the session is
> already assigned to a particular backend.
> We either have to somehow redirect the session to some other backend
> (somehow notify the postmaster that we are not able to handle it),
> or obtain the database/user name in the postmaster. But that means
> ProcessStartupPackage would have to be called in the postmaster, and
> the postmaster would have to read from the client's socket.
> I'm afraid the postmaster could become a bottleneck in this case.

Hmmm, that's unfortunate.
I guess you'll have to process the startup packet in the main process,
before it gets forked. At least partially.

> The problem can be solved much more easily with the pthread version
> of Postgres. In that case reassigning a session to another executor
> (thread) can be done much more easily.
> And there is no need to use the unportable trick with passing a file
> descriptor to another process.
> And in the future I am going to combine them. The problem is that the
> pthread version of Postgres is still in a very raw state.

Yeah. Unfortunately, we're using processes now, and switching to threads
will take time (assuming it happens at all).

>> 3) Is there any sort of shrinking the pools? I mean, if the backend
>> is idle for a certain period of time (or when we need backends for
>> other databases), does it get closed automatically?
>
> When a client is disconnected, the client session is closed. But the
> backend is not terminated, even if there are no more sessions at this
> backend.
> It was done intentionally, to avoid permanent spawning of new
> processes when there are one or a few clients which frequently
> connect/disconnect to the database.

Sure, but it means a short peak will exhaust the backends indefinitely.
That's acceptable for a PoC, but I think it needs to be fixed eventually.

>>
>> Furthermore, I'm rather confused about the meaning of
>> session_pool_size. I mean, that GUC determines the number of backends
>> in the pool; it has nothing to do with sessions per se, right? Which
>> would mean it's a bit misleading to name it "session_..."
>> (particularly if the pooling happens at transaction level, not
>> session level - which is question #1).
>>
> Yeah, it is not the right name. It means the maximal number of
> backends which should be used to serve clients' sessions.
> But "max backends" is already used and has a completely different
> meaning.
>
>> When I've been thinking about adding a built-in connection pool, my
>> rough plan was mostly "bgworker doing something like pgbouncer" (that
>> is, listening on a separate port and proxying everything to regular
>> backends). Obviously, that has pros and cons, and probably would not
>> serve the threading use case well.
>
> And we will get the same problem as with pgbouncer: one process will
> not be able to handle all connections...
> Certainly it is possible to start several such scheduling bgworkers...
> But in any case it is more efficient to multiplex sessions in the
> backends themselves.

Well, I haven't said it has to be single-threaded like pgbouncer. I
don't see why the bgworker could not use multiple threads internally (of
course, it'd need to be careful not to mess with the stuff that is not
thread-safe).

regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services