Re: Built-in connection pooler - Mailing list pgsql-hackers
From:            Tomas Vondra
Subject:         Re: Built-in connection pooler
Date:
Msg-id:          20190726202437.2svtvcisiujqaoky@development
In response to:  Re: Built-in connection pooler (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses:       Re: Built-in connection pooler
List:            pgsql-hackers
Hi Konstantin,

I've started reviewing this patch and experimenting with it, so let me
share some initial thoughts.


1) not handling session state (yet)

I understand handling session state would mean additional complexity, so
I'm OK with not having it in v1. That being said, I think this is the
primary issue with connection pooling on PostgreSQL - configuring and
running a separate pool is not free, of course, but when people complain
to us, it's usually because they can't actually use a connection pool at
all due to this limitation.

So what are your plans regarding this feature? I think you mentioned you
already have the code in another product. Do you plan to submit it in
the pg13 cycle, or what's the plan? I'm willing to put some effort into
reviewing and testing that.

FWIW it'd be nice to expose it as some sort of interface, so that other
connection pools can leverage it too. There are use cases that don't
work with a built-in connection pool (say, PAUSE/RESUME in pgbouncer
allows restarting the database), so projects like pgbouncer or odyssey
are unlikely to disappear anytime soon.

I also wonder if we could make it more permissive even in v1, without
implementing dump/restore of session state. Consider for example
patterns like this:

  BEGIN;
  SET LOCAL enable_nestloop = off;
  ...
  COMMIT;

or

  PREPARE x(int) AS SELECT ...;
  EXECUTE x(1);
  EXECUTE x(2);
  ...
  EXECUTE x(100000);
  DEALLOCATE x;

or perhaps even

  CREATE FUNCTION f() AS $$ ... $$ LANGUAGE sql
  SET enable_nestloop = off;

In all those cases (and I'm sure there are other similar examples) the
connection pool marks the session as 'tainted', and we never reset that
flag. So even when an application tries to play nice, it can't use
pooling.

Would it be possible to track this in more detail (number of prepared
statements, ignore SET LOCAL, ...)? That should allow us to do pooling
even without full support for restoring session state.


2) configuration

I think we need to rethink how the pool is configured.
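To illustrate why the current knobs are hard to reason about, here is the
arithmetic for the pgbench scenario discussed in this section, as plain
Python. The per-proxy, per-user/db pool behavior reflects my reading of
the patch; the numbers come from that example:

```python
# Back-of-the-envelope arithmetic for the worst case implied by the
# current knobs (my reading of the patch; numbers from the pgbench
# scenario in this section).

max_connections = 100
connection_proxies = 4
session_pool_size = 10

# Four distinct user/database pairs (user1/test1 ... user4/test4), and
# each proxy keeps a separate pool per pair, so the caps multiply:
pools_per_proxy = 4
cap = session_pool_size * connection_proxies * pools_per_proxy
print(cap)       # 160 -- comfortably above max_connections = 100

# What actually gets opened: each "pgbench -c 24" spreads its 24 clients
# across the 4 proxies, so each per-pair pool ends up holding 24 / 4 = 6
# backends:
opened = connection_proxies * pools_per_proxy * (24 // 4)
print(opened)    # 96 backends in total -- the "game over" scenario
```

The point being that session_pool_size caps each of the 16 per-proxy
pools independently, so nothing here acts as a global limit.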
The options available at the moment are more a consequence of the
implementation and are rather cumbersome to use in some cases.

For example, we have session_pool_size, which is (essentially) the
number of backends kept in the pool. Which seems fine at first, because
it seems like you might say

  max_connections = 100
  session_pool_size = 50

to say the connection pool will only ever use 50 connections, leaving
the rest for "direct" connections. But that does not work at all,
because the number of backends the pool can open is

  session_pool_size * connection_proxies * databases * roles

which pretty much means there's no limit, because while we can specify
the number of proxies, the number of databases and roles is arbitrary.
And there's no way to restrict which dbs/roles can use the pool.

So you can happily do this

  max_connections = 100
  connection_proxies = 4
  session_pool_size = 10

  pgbench -c 24 -U user1 test1
  pgbench -c 24 -U user2 test2
  pgbench -c 24 -U user3 test3
  pgbench -c 24 -U user4 test4

at which point it's pretty much game over, because each proxy has 4
pools, each with ~6 backends, i.e. 96 backends in total. And because
non-tainted connections are never closed, no other users/dbs can use the
pool (they will just wait indefinitely).

To allow practical configurations, I think we need to be able to define:

  * which users/dbs can use the connection pool
  * minimum/maximum pool size per user, per db and per user/db
  * maximum number of backend connections

We also need to be able to close connections when needed (i.e. when a
connection is not assigned and we need it for someone else). Plus those
limits need to be global, not "per proxy" - it's just strange that
increasing connection_proxies bumps up the effective pool size.

I don't know what's the best way to specify this configuration - whether
to store it in a separate file, in some system catalog, or something
else.


3) monitoring

I think we need much better monitoring capabilities.
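To make "better monitoring" concrete, here is an entirely hypothetical
sketch of the per-session detail I have in mind, roughly what
pgbouncer's SHOW CLIENTS / SHOW SERVERS provide. None of these names
exist in the patch; this only illustrates the fields:

```python
# Hypothetical per-session entry for a monitoring view (all names and
# fields are made up for illustration, not part of the patch).

from dataclasses import dataclass
from typing import Optional

@dataclass
class PooledSession:
    proxy_pid: int              # which proxy process owns this entry
    client_addr: str            # client host:port
    database: str
    role: str
    backend_pid: Optional[int]  # None while waiting for a free backend
    state: str                  # e.g. 'active', 'idle', 'waiting'
    tainted: bool               # session-lifetime state (PREPARE, SET, ...)

# If these entries lived in shared memory, a single SRF could expose all
# proxies at once, e.g.:
sessions = [
    PooledSession(31251, "127.0.0.1:50412", "test1", "user1",
                  31296, "active", False),
    PooledSession(31251, "127.0.0.1:50413", "test2", "user2",
                  None, "waiting", False),
]
stuck = [s for s in sessions if s.backend_pid is None]
print(len(stuck))  # 1 -- clients stuck waiting would be directly visible
```

With this kind of listing, the "clients waiting indefinitely" problem
from the previous section would at least be observable.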
At this point we have a single system catalog (well, an SRF) giving us a
proxy-level summary. But I think we need a much more detailed overview -
probably something like pgbouncer has - a listing of client/backend
sessions, with various details.

Of course, that's difficult to do when those lists are stored in the
private memory of each proxy process - I think we need to move this to
shared memory, which would also help address some of the issues I
mentioned in the previous section (particularly that the limits need to
be global, not per proxy).


4) restart_pooler_on_reload

I find it quite strange that restart_pooler_on_reload binds restart of
the connection pool to reload of the configuration file. That seems like
rather surprising behavior, and I don't see why you would ever want
that. Currently it seems to be the only way to force the proxies to
close their connections (the docs mention DROP DATABASE), but why
shouldn't we have separate functions to do that? In particular, why
would you want to close connections for all databases and not just for
the one you're trying to drop?


5) session_schedule

It's nice that we support different strategies to assign connections to
worker processes, but how do you tune it? How do you pick the right
option for your workload? We either need to provide metrics to allow an
informed decision, or just not provide the option.

And "load average" may be a bit of a misleading term (as used in the
section about the load-balancing option). It kinda suggests we're
measuring how busy the different proxies were recently (that's what load
average in Unix does) - by counting active processes, CPU usage or
whatever. But AFAICS that's not what's happening at all - it just counts
the connections, with SSL connections counted as more expensive.


6) issues during testing

While testing, I've seen a couple of issues. Firstly, after specifying a
database that does not exist:

  psql -h localhost -p 6543 xyz

just hangs and waits forever.
In the server log I see this:

  2019-07-25 23:16:50.229 CEST [31296] FATAL:  database "xyz" does not exist
  2019-07-25 23:16:50.258 CEST [31251] WARNING:  could not setup local connect to server
  2019-07-25 23:16:50.258 CEST [31251] DETAIL:  FATAL:  database "xyz" does not exist

But the client somehow does not get the message and keeps waiting.

Secondly, when trying this

  pgbench -p 5432 -U x -i -s 1 test
  pgbench -p 6543 -U x -c 24 -C -T 10 test

it very quickly locks up, with plenty of non-granted locks in pg_locks,
but I don't see any interventions by the deadlock detector, so I presume
the issue is somewhere else. I don't see any such issues when running
without the connection pool or without the -C option:

  pgbench -p 5432 -U x -c 24 -C -T 10 test
  pgbench -p 6543 -U x -c 24 -T 10 test

This is with the default postgresql.conf, except for

  connection_proxies = 4


regards

--
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services