Thread: Built-in connection pooling
Hi hackers,
My recent experiments with a pthread version of Postgres show that although pthreads offer some performance advantages compared with processes for a large number of connections, they still cannot eliminate the need for connection pooling. Even a large number of inactive connections causes significant degradation of Postgres performance.
So we need connection pooling. Most enterprise systems working with Postgres use pgbouncer or similar tools.
But pgbouncer has the following drawbacks:
1. It is an extra entity which complicates system installation and administration.
2. Pgbouncer itself can be a bottleneck and a point of failure. For example, with SSL enabled, the single-threaded model of pgbouncer becomes a limiting factor when many clients try to reestablish connections simultaneously. This is why some companies build hierarchies of pgbouncers.
3. Using a pool_mode other than "session" makes it impossible to use prepared statements and session variables.
The lack of prepared statements alone can slow down simple queries by up to a factor of two.
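For illustration, using the pgbench schema (the statement name here is just an example), this is what session affinity preserves: a client prepares once and then executes cheaply many times, while with transaction-level pooling the EXECUTE may land on a backend that never saw the PREPARE:

    -- parsing and planning happen once, at PREPARE time
    PREPARE get_abalance (int) AS
        SELECT abalance FROM pgbench_accounts WHERE aid = $1;

    -- subsequent executions skip parsing and planning
    EXECUTE get_abalance(42);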
So I thought about built-in connection pooling for Postgres. Ideally it should be integrated with pthreads, because then scheduling of sessions can be done more flexibly and easily.
But I decided to start with a patch to vanilla Postgres.
The idea is the following:
1. We start some number of normal backends (which form a backend pool for serving client sessions).
2. When the number of connections exceeds the number of backends, then instead of spawning a new backend we choose one of the existing backends and redirect the connection to it.
There is a more or less portable way in Unix to pass socket descriptors between processes using Unix domain sockets:
for example https://stackoverflow.com/questions/28003921/sending-file-descriptor-by-linux-socket/
(this is one of the places where a pthreads Postgres would win); a minimal sketch of this mechanism is shown after the list below. So a session is bound to a backend. Backends are chosen using a round-robin policy, which should guarantee a more or less uniform distribution of sessions between backends if the number of sessions is much larger than the number of backends. But certainly skews in client application access patterns can violate this assumption.
3. Rescheduling is done at transaction level, so it is enough to have one entry in the procarray per backend to correctly handle locks. Transaction-level pooling also eliminates the problem of false deadlocks (caused by a lack of free executors in the pool), and it minimizes the changes needed in the Postgres core to maintain correct session context:
there is no need to suspend/resume transaction state, static variables, etc.
4. In the main Postgres query loop in PostgresMain, we determine the moment when the backend is not in a transaction state, perform a select over the sockets of all active sessions, and choose one of them.
5. When a client disconnects, we close the session but do not terminate the backend.
6. To support prepared statements, we append the session identifier to the name of the statement, so prepared statements of different sessions do not interfere with each other. Since a session is bound to a backend, it is possible to use prepared statements.
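For illustration, the renaming in point 6 could be as simple as the following sketch (the function and variable names are hypothetical, not taken from the patch):

    #include <stdio.h>

    /* Hypothetical sketch: make a prepared-statement name unique per session.
     * The mangling scheme is illustrative, not the patch's actual code. */
    static void
    pooled_stmt_name(char *dst, size_t dstlen, const char *stmt, int session_id)
    {
        snprintf(dst, dstlen, "%s_%d", stmt, session_id);
        /* e.g. "get_abalance" prepared by session 7 becomes "get_abalance_7" */
    }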
This is the minimal plan for embedded session pooling that I decided to implement as a prototype.
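For reference, the descriptor passing mentioned in point 2 above is usually done with SCM_RIGHTS ancillary messages. A minimal self-contained sketch (illustration only; not the patch's actual pg_send_sock/pg_recv_sock code):

    #include <string.h>
    #include <sys/socket.h>
    #include <sys/uio.h>

    /* Send an open descriptor to a peer over a Unix domain socket. */
    static int
    send_fd(int chan, int fd)
    {
        struct msghdr   msg = {0};
        struct iovec    iov;
        struct cmsghdr *cmsg;
        char            dummy = 'S';
        char            buf[CMSG_SPACE(sizeof(int))];

        iov.iov_base = &dummy;          /* at least one byte of payload is required */
        iov.iov_len = 1;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = buf;
        msg.msg_controllen = sizeof(buf);

        cmsg = CMSG_FIRSTHDR(&msg);
        cmsg->cmsg_level = SOL_SOCKET;
        cmsg->cmsg_type = SCM_RIGHTS;   /* kernel duplicates fd into the peer process */
        cmsg->cmsg_len = CMSG_LEN(sizeof(int));
        memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

        return sendmsg(chan, &msg, 0) < 0 ? -1 : 0;
    }

    /* Receive a descriptor; returns a valid fd in this process, or -1. */
    static int
    recv_fd(int chan)
    {
        struct msghdr   msg = {0};
        struct iovec    iov;
        struct cmsghdr *cmsg;
        char            dummy;
        char            buf[CMSG_SPACE(sizeof(int))];
        int             fd;

        iov.iov_base = &dummy;
        iov.iov_len = 1;
        msg.msg_iov = &iov;
        msg.msg_iovlen = 1;
        msg.msg_control = buf;
        msg.msg_controllen = sizeof(buf);

        if (recvmsg(chan, &msg, 0) < 0)
            return -1;
        cmsg = CMSG_FIRSTHDR(&msg);
        if (cmsg == NULL || cmsg->cmsg_type != SCM_RIGHTS)
            return -1;
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
        return fd;
    }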
Several things are not addressed now:
1. Temporary tables. In principle they can be handled in the same way as prepared statements: by appending the session identifier to the name of the table.
But that requires adjusting references to the table in all queries, which is much more complicated than in the case of prepared statements.
2. Session-level GUCs. In principle it is not difficult to remember the GUCs modified by a session and save/restore them on session switch.
But it is just not implemented yet.
3. Support for multiple users/databases/... This is the most critical drawback. Right now my prototype implementation assumes that all clients are connected to the same database under the same user with the same connection options. This is a challenge on which I want to hear the opinion of the community. The name of the database and the user are retrieved from the client connection by the ProcessStartupPacket function. In vanilla Postgres this function is executed by the spawned backend, so I do not know which database a client is going to access before calling this function and reading data from the client's socket. Now I just choose a random backend and assign the connection to it. It can happen that this backend is working with a different database/user; currently I just return an error in this case. Certainly it is possible to call ProcessStartupPacket in the postmaster and then select a proper backend working with the specified database/user.
But I am afraid that the postmaster can become a bottleneck in this case, especially when SSL is used. Also, a larger number of databases/users can significantly reduce the efficiency of pooling if each backend is responsible for only one database/user combination. Maybe a backend should be bound only to a database, with the concrete role set on session switch; but that can require flushing backend caches, which devalues the idea of embedded session pooling. This problem could be easily solved with a multithreaded Postgres, where a session can easily be reassigned to another thread.
Now the results shown by my prototype. I used pgbench with scale factor 100 in read-only mode (the -S option).
The precise pgbench command is "pgbench -S -c N -M prepared -T 100 -P 1 -n". Results in the table below are in kTPS:
Connections | Vanilla Postgres | Postgres with session pool size=10 |
10 | 186 | 181 |
100 | 118 | 224 |
1000 | 59 | 191 |
As you can see, instead of performance degrading with an increasing number of connections, Postgres with a session pool shows stable performance.
Moreover, for vanilla Postgres the best results on my system are obtained with 10 connections, but Postgres with a session pool shows better performance with 100 connections and the same number of spawned backends.
My patch to Postgres is attached to this mail.
To switch on session pooling, set session_pool_size to some non-zero value. Another GUC variable I have added is "max_sessions", which specifies the maximal number of sessions handled by one backend. So the total number of handled client connections is session_pool_size * max_sessions.
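For example, a postgresql.conf fragment for this prototype might look like the following (the GUC names are the patch's; the values are only illustrative):

    session_pool_size = 10   # backends serving pooled sessions; 0 disables pooling
    max_sessions = 100       # max client sessions multiplexed by one backend
    # up to session_pool_size * max_sessions = 1000 client connections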
Certainly it is just a prototype, far from practical use.
In addition to the challenges mentioned above, there are also some other issues which should be considered:
1. A long-living transaction in a client application blocks all other sessions in its backend and so can suspend the work of Postgres.
So Uber-style programming, where a database transaction is started when the car door opens and finished at the end of the trip, is completely incompatible with this approach.
2. Fatal errors cause the disconnect not only of the one client that caused the problem, but of the whole bunch of client sessions scheduled to this backend.
3. It is possible to use PL APIs, such as plpython, but session-level variables may not be used.
4. There may be some memory leaks caused by allocating memory using malloc, or in the top memory context, which is expected to be freed on backend exit.
Such memory is not deallocated at session close, so a large number of handled sessions can cause memory exhaustion.
5. Some applications that handle multiple connections inside a single thread and multiplex them at statement level (rather than at transaction level) may not work correctly.
It seems to be quite an exotic use case, but pgbench actually behaves this way! This is why an attempt to run pgbench with multi-statement transactions (-N) will fail if the number of threads (-j) is smaller than the number of connections (-c).
6. The approach of passing socket descriptors between processes was implemented only for Unix and tested only on Linux, although it is expected to work on macOS and other Unix dialects as well. Windows is not supported for now.
I will be glad to receive any feedback and suggestions concerning the prospects of embedded connection pooling.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
+1 to the concept... A lot of users could benefit if we did this in a good way.
On 17.01.2018 19:09, Konstantin Knizhnik wrote:
Attached please find a new version of the patch with a few fixes.
And more results, from a NUMA system with 144 cores and 3TB of RAM.
Read-only pgbench (-S):
#Connections\kTPS | Vanilla Postgres | Session pool size 256 |
1k | 1300 | 1505 |
10k | 633 | 1519 |
100k | - | 1425 |
Read-write contention test: access to a small number of records, with 1% of updates.
#Clients\TPS | Vanilla Postgres | Session pool size 256 |
100 | 557232 | 573319 |
200 | 520395 | 551670 |
300 | 511423 | 533773 |
400 | 468562 | 523091 |
500 | 442268 | 514056 |
600 | 401860 | 526704 |
700 | 363912 | 530317 |
800 | 325148 | 512238 |
900 | 301310 | 512844 |
1000 | 278829 | 554516 |
So, as you can see, there is no degradation of performance with an increased number of connections when using session pooling.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, Jan 18, 2018 at 11:48 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> So, as you can see, there is no degradation of performance with an
> increased number of connections when using session pooling.
TBH, the tests you should be running are comparisons with a similar pool size managed by pgbouncer, not just vanilla unlimited postgres.
Of course a limited pool size will beat thousands of concurrent queries by a large margin. The real question is whether a pthread-based approach beats the pgbouncer approach.
Hi Konstantin,

On 01/18/2018 03:48 PM, Konstantin Knizhnik wrote:
> Attached please find a new version of the patch with a few fixes.

I haven't looked at the code yet, but after reading your message I have a simple question - how is this going to work with SSL? If you're only passing a file descriptor, that does not seem to be sufficient for the backends to do crypto (that requires the SSL stuff from Port).

Maybe I'm missing something and it already works, though ...

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 18.01.2018 18:02, Tomas Vondra wrote:
> I haven't looked at the code yet, but after reading your message I have
> a simple question - how is this going to work with SSL? If you're only
> passing a file descriptor, that does not seem to be sufficient for the
> backends to do crypto (that requires the SSL stuff from Port).

Oops, I missed this aspect with SSL. Thank you.
A new version of the patch, which correctly maintains session context, is attached.
Now each session has its own allocator, which should be used instead of TopMemoryAllocator. SSL connections work now.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 18.01.2018 18:00, Claudio Freire wrote:
> TBH, the tests you should be running are comparisons with a similar
> pool size managed by pgbouncer, not just vanilla unlimited postgres.
> Of course a limited pool size will beat thousands of concurrent
> queries by a large margin. The real question is whether a
> pthread-based approach beats the pgbouncer approach.
Below are results with pgbouncer:
#Connections\kTPS | Vanilla Postgres | Built-in session pool size 256 | Postgres + pgbouncer with transaction pooling mode and pool size 256 | Postgres + 10 pgbouncers with pool size 20 |
1k | 1300 | 1505 | 105 | 751 |
10k | 633 | 1519 | 94 | 664 |
100k | - | 1425 | - | - |
(A "-" here means that I failed to start that number of connections, because of "resource temporarily unavailable" and similar errors.)
So a single pgbouncer is 10 times slower than a direct connection to Postgres.
No surprise here: pgbouncer is single-threaded, and its CPU usage is almost 100%.
So we have to launch several instances of pgbouncer and somehow distribute load between them.
On Linux it is possible to use SO_REUSEPORT (https://lwn.net/Articles/542629/) to perform load balancing between several pgbouncer instances.
But you would have to edit the pgbouncer code: it doesn't support such a mode. So I started several instances of pgbouncer on different ports and explicitly distributed several pgbench instances between them.
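For reference, the kernel-side load balancing would look roughly like this in each worker process (a sketch assuming Linux >= 3.9; pgbouncer contains no such code, and error handling is omitted):

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>

    /* Each process binds its own listening socket to the same port;
     * the kernel then load-balances incoming connections between them. */
    static int
    make_reuseport_listener(unsigned short port)
    {
        int one = 1;
        int sock = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = {0};

        setsockopt(sock, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one));

        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(port);
        bind(sock, (struct sockaddr *) &addr, sizeof(addr));
        listen(sock, SOMAXCONN);
        return sock;
    }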
But even in this case performance is half that of a direct connection with built-in session pooling.
That is because of the lack of prepared statements, which I cannot use with pgbouncer in statement/transaction pooling mode.
Also please notice that with session pooling performance is better than with vanilla Postgres.
It is because with session pooling we can open more connections without launching more backends.
It is especially noticeable on my local desktop with 4 cores: for normal Postgres the optimal number of connections is about 10, but with session pooling 100 connections show about a 30% better result.
So, summarizing all of the above:
1. pgbouncer doesn't allow the use of prepared statements, and that alone causes up to a 2x performance penalty.
2. pgbouncer is single-threaded and cannot efficiently handle more than about 1k connections.
3. pgbouncer can never provide better performance than an application connected directly to Postgres with the optimal number of connections. In contrast, session pooling can provide better performance than vanilla Postgres with the optimal number of connections.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 01/19/2018 10:52 AM, Konstantin Knizhnik wrote:
> A new version of the patch, which correctly maintains session context,
> is attached.

OK. I've looked at the code, but I have a rather hard time understanding it, because there are pretty much no comments explaining the intent of the added code :-( I strongly suggest improving that, to help reviewers.

The questions I'm asking myself are mostly these:

1) When assigning a backend, we first try to get one from a pool, which happens right at the beginning of BackendStartup. If we find a usable backend, we send the info to the backend (pg_send_sock/pg_recv_sock).

But AFAICS this only happens at connection time, right? But in your initial message you say "Rescheduling is done at transaction level," which in my understanding means "transaction pooling". So, how does that part work?

2) How does this deal with backends for different databases? I don't see any checks that the requested database matches the backend database (nor any code switching the backend from one db to another - which would be very tricky, I think).

3) Is there any sort of shrinking the pools? I mean, if a backend is idle for a certain period of time (or when we need backends for other databases), does it get closed automatically?

Furthermore, I'm rather confused about the meaning of session_pool_size. I mean, that GUC determines the number of backends in the pool; it has nothing to do with sessions per se, right? Which would mean it's a bit misleading to name it "session_..." (particularly if the pooling happens at transaction level, not session level - which is question #1).

When I've been thinking about adding a built-in connection pool, my rough plan was mostly "bgworker doing something like pgbouncer" (that is, listening on a separate port and proxying everything to regular backends). Obviously, that has pros and cons, and probably would not serve the threading use case well. But it would have some features that I find valuable - for example, it's trivial to decide which connection requests may or may not be served from a pool (by connecting to the main port or the pool port).

That is not to say the bgworker approach is better than what you're proposing, but I wonder if that would be possible with your approach.

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 19.01.2018 18:53, Tomas Vondra wrote:
> OK. I've looked at the code, but I have a rather hard time
> understanding it, because there are pretty much no comments explaining
> the intent of the added code :-( I strongly suggest improving that, to
> help reviewers.

Sorry, sorry, sorry... There are some comments, and I will add more.

> 1) When assigning a backend, we first try to get one from a pool ...
> So, how does that part work?

Here it is:

    ChooseSession:
        DoingCommandRead = true;
        /* Select which client session is ready to send a new query */
        if (WaitEventSetWait(SessionPool, -1, &ready_client, 1, PG_WAIT_CLIENT) != 1)
            ...
        if (ready_client.fd == SessionPoolSock)
        {
            /* Here we handle the case of attaching a new session */
            ...
        }
        else /* and here we handle the case when there is a query (new transaction) from some client */
        {
            elog(DEBUG2, "Switch to session %d in backend %d", ready_client.fd, MyProcPid);
            CurrentSession = (SessionContext *) ready_client.user_data;
            MyProcPort = CurrentSession->port;
        }

> 2) How does this deal with backends for different databases?

As I wrote in the initial mail, this problem is not handled now. It is expected that all clients are connected to the same database using the same user. I only check and report an error if this assumption is violated. Definitely it should be fixed, and it is one of the main challenges with this approach! I want to receive some advice from the community about the best ways of solving it.

The problem is that we get the information about database/user in the ProcessStartupPacket function in the backend, when the session is already assigned to a particular backend. We either have to somehow redirect the session to some other backend (somehow notify the postmaster that we are not able to handle it), or obtain the database/user name in the postmaster. But that means ProcessStartupPacket would have to be called in the postmaster, and the postmaster would have to read from the client's socket. I am afraid that the postmaster can be a bottleneck in this case.

The problem could be solved much more easily with a pthread version of Postgres. In that case reassigning a session to another executor (thread) can be done much more easily, and there is no need to use the unportable trick of passing a file descriptor to another process. In the future I am going to combine them. The problem is that the pthread version of Postgres is still in a very raw state.

> 3) Is there any sort of shrinking the pools?

When a client disconnects, the client session is closed, but the backend is not terminated even if there are no more sessions at this backend. This was done intentionally, to avoid permanent spawning of new processes when there are one or a few clients which frequently connect/disconnect to the database.

> Furthermore, I'm rather confused about the meaning of
> session_pool_size.

Yeah, it is not the right name. It means the maximal number of backends which should be used to serve client sessions. But "max backends" is already used and has a completely different meaning.

> When I've been thinking about adding a built-in connection pool, my
> rough plan was mostly "bgworker doing something like pgbouncer" ...

And we will get the same problem as with pgbouncer: one process will not be able to handle all connections... Certainly it is possible to start several such scheduling bgworkers... But in any case it is more efficient to multiplex sessions in the backends themselves.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> And we will get the same problem as with pgbouncer: one process will
> not be able to handle all connections... Certainly it is possible to
> start several such scheduling bgworkers... But in any case it is more
> efficient to multiplex sessions in the backends themselves.

pgbouncer holds the client connection the whole time. When we implement the listeners, all the work can be done by the worker processes, not by the listeners.
Regards
Pavel
On 19.01.2018 19:28, Pavel Stehule wrote:
> pgbouncer holds the client connection the whole time. When we
> implement the listeners, all the work can be done by the worker
> processes, not by the listeners.
Sorry, I do not understand your point.
In my case pgbench establishes connections to pgbouncer only once, at the beginning of the test.
And pgbouncer spends all its time in context switches (CPU usage is 100%, and it is mostly in kernel space: the top of the profile consists of kernel functions).
The same picture will appear if, instead of pgbouncer, you do such scheduling in one bgworker.
Modern systems are not able to perform more than several hundred thousand context switches per second.
So with a single multiplexing thread or process you cannot get more than about 100 kTPS, while on a powerful NUMA system it is possible to achieve millions of TPS.
It is illustrated by the results I sent in the previous mail: by spawning 10 instances of pgbouncer I was able to achieve 7 times higher speed.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 01/19/2018 05:17 PM, Konstantin Knizhnik wrote:
> Here it is:
> ChooseSession: ...

OK, thanks.

> As I wrote in the initial mail, this problem is not handled now. ...
> I am afraid that the postmaster can be a bottleneck in this case.

Hmmm, that's unfortunate. I guess you'll have to process the startup packet in the main process, before it gets forked. At least partially.

> The problem could be solved much more easily with a pthread version of
> Postgres. ...

Yeah. Unfortunately, we're using processes now, and switching to threads will take time (assuming it happens at all).

> When a client disconnects, the client session is closed, but the
> backend is not terminated even if there are no more sessions at this
> backend.

Sure, but it means a short peak will exhaust the backends indefinitely. That's acceptable for a PoC, but I think it needs to be fixed eventually.

> And we will get the same problem as with pgbouncer: one process will
> not be able to handle all connections...

Well, I haven't said it has to be single-threaded like pgbouncer. I don't see why the bgworker could not use multiple threads internally (of course, it'd need to be careful not to mess with the stuff that is not thread-safe).

regards

--
Tomas Vondra
http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
2018-01-19 17:53 GMT+01:00 Konstantin Knizhnik <k.knizhnik@postgrespro.ru>:
On 19.01.2018 19:28, Pavel Stehule wrote:
When I've been thinking about adding a built-in connection pool, my
rough plan was mostly "bgworker doing something like pgbouncer" (that
is, listening on a separate port and proxying everything to regular
backends). Obviously, that has pros and cons, and probably would not
work serve the threading use case well.
And we will get the same problem as with pgbouncer: one process will not be able to handle all connections...
Certainly it is possible to start several such scheduling bgworkers... But in any case it is more efficient to multiplex session in backend themselves.pgbouncer hold all time client connect. When we implement the listeners, then all work can be done by worker processes not by listeners.
Sorry, I do not understand your point.
In my case pgbench establish connection to the pgbouncer only once at the beginning of the test.
And pgbouncer spends all time in context switches (CPU usage is 100% and it is mostly in kernel space: top of profile are kernel functions).
The same picture will be if instead of pgbouncer you will do such scheduling in one bgworker.
For the modern systems are not able to perform more than several hundreds of connection switches per second.
So with single multiplexing thread or process you can not get speed more than 100k, while at powerful NUMA system it is possible to achieve millions of TPS.
It is illustrated by the results I have sent in the previous mail: by spawning 10 instances of pgbouncer I was able to receive 7 times bigger speed.
pgbouncer is proxy software. I don't think a native pooler should be a proxy too. So comparing pgbouncer with a hypothetical native pooler is not fair, because pgbouncer passes through all communication.
Regards
Pavel
On 01/19/2018 05:53 PM, Konstantin Knizhnik wrote:
> ...
> It is illustrated by the results I have sent in the previous mail: by
> spawning 10 instances of pgbouncer I was able to get 7 times higher
> throughput.

AFAICS making pgbouncer multi-threaded would not be hugely complicated. A simple solution would be a fixed number of worker threads, and client connections randomly assigned to them. But this generally is not a common bottleneck in practical workloads (of course, YMMV).

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
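[Editor's note: Tomas's "fixed number of worker threads" suggestion could look roughly like the sketch below: an acceptor thread distributes client sockets to workers over per-worker pipes, and each worker multiplexes its own clients. The worker-side I/O handling is stubbed out, and all names are illustrative.]

#include <pthread.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>

#define N_WORKERS 8

typedef struct { int notify_pipe[2]; } Worker;
static Worker workers[N_WORKERS];

static void
add_client(Worker *w, int client)
{
    /* stub: a real worker would add the socket to its poll set */
    printf("worker %p got client %d\n", (void *) w, client);
}

static void *
worker_main(void *arg)
{
    Worker *w = (Worker *) arg;
    int     client;

    /* a real worker would poll() both this pipe and its client sockets */
    while (read(w->notify_pipe[0], &client, sizeof(client)) == sizeof(client))
        add_client(w, client);
    return NULL;
}

static void
acceptor_loop(int listen_sock)
{
    int next = 0;

    for (int i = 0; i < N_WORKERS; i++)
    {
        pthread_t tid;

        pipe(workers[i].notify_pipe);
        pthread_create(&tid, NULL, worker_main, &workers[i]);
    }
    for (;;)
    {
        int client = accept(listen_sock, NULL, NULL);

        if (client < 0)
            continue;
        /* round-robin assignment; random would work equally well */
        write(workers[next].notify_pipe[1], &client, sizeof(client));
        next = (next + 1) % N_WORKERS;
    }
}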
On Fri, Jan 19, 2018 at 1:53 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> And pgbouncer spends all its time in context switches (CPU usage is 100%
> and it is mostly in kernel space: the top of the profile are kernel
> functions).
> It is illustrated by the results I have sent in the previous mail: by
> spawning 10 instances of pgbouncer I was able to get 7 times higher
> throughput.

I'm sure pgbouncer can be improved. I've seen async code handle millions of packets per second (zmq), pgbouncer shouldn't be radically different.
On 01/19/2018 06:03 PM, Claudio Freire wrote:
> I'm sure pgbouncer can be improved. I've seen async code handle millions
> of packets per second (zmq), pgbouncer shouldn't be radically different.

The trouble is pgbouncer is not handling individual packets. It needs to do additional processing to assemble the messages, understand the state of the connection (e.g. to do transaction pooling) etc. Or handle SSL.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 19.01.2018 19:59, Tomas Vondra wrote:
> Yeah. Unfortunately, we're using processes now, and switching to threads
> will take time (assuming it happens at all).

I have to agree with you.

>>> 3) Is there any sort of shrinking the pools? I mean, if the backend is
>>> idle for a certain period of time (or when we need backends for other
>>> databases), does it get closed automatically?
>
> Sure, but it means a short peak will exhaust the backends indefinitely.
> That's acceptable for a PoC, but I think it needs to be fixed eventually.

Sorry, I do not understand it.
You specify the size of the backend pool which will serve client sessions. The size of this pool is chosen to provide the best performance on the particular system and workload. So the number of backends will never exceed this optimal value, even in case of a "short peak".
From my point of view, terminating backends when there are no active sessions is the wrong idea in any case; it was not a temporary decision just for the PoC.

> Well, I haven't said it has to be single-threaded like pgbouncer. I
> don't see why the bgworker could not use multiple threads internally (of
> course, it'd need to be careful not to mess with the stuff that is not
> thread-safe).

Certainly an architecture with N scheduling bgworkers and M executors (backends) may be more flexible than a solution where scheduling is done in the executor itself. But we will have to pay extra cost for the redirection. I am not sure that it will ultimately allow better performance to be reached. A more flexible solution in many cases doesn't mean a more efficient solution.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Fri, Jan 19, 2018 at 2:06 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> The trouble is pgbouncer is not handling individual packets. It needs to
> do additional processing to assemble the messages, understand the state
> of the connection (e.g. to do transaction pooling) etc. Or handle SSL.
I understand. But zmq also has to process framing very similar to the fe protocol, so I'm still hopeful.
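[Editor's note: for context, the frontend/backend protocol framing Claudio refers to is simple: every regular v3 message is a one-byte type tag followed by a four-byte big-endian length that counts itself (the startup packet is the one exception, having no tag). A minimal reader sketch follows; read_n() is an assumed helper that loops until n bytes arrive.]

#include <stdint.h>
#include <stdlib.h>
#include <arpa/inet.h>

typedef struct
{
    char     type;      /* e.g. 'Q' = simple query, 'P' = Parse, 'X' = terminate */
    uint32_t len;       /* payload length, excluding the type byte */
    char    *payload;
} PgMessage;

extern int read_n(int sock, void *buf, size_t n);   /* assumed helper */

static int
read_message(int sock, PgMessage *msg)
{
    uint32_t netlen;

    if (read_n(sock, &msg->type, 1) < 0 ||
        read_n(sock, &netlen, 4) < 0)
        return -1;
    msg->len = ntohl(netlen) - 4;   /* the length field counts itself */
    msg->payload = malloc(msg->len);
    return read_n(sock, msg->payload, msg->len);
}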
On 19.01.2018 20:01, Pavel Stehule wrote:
> pgbouncer is proxy software. I don't think a native pooler should be a
> proxy too. So comparing pgbouncer with a hypothetical native pooler is
> not fair, because pgbouncer passes through all communication.

If we have separate scheduling bgworker(s) as Tomas proposed, then in any case we will have to do some kind of redirection.
It can be done in a more efficient way than using Unix sockets (as in the case of a locally installed pgbouncer), but even if we use a shared memory queue, performance will be comparable and limited by the number of context switches. It is possible to increase it by combining several requests into one parcel, but that further complicates the communication protocol between clients, scheduling proxies and executors.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Fri, Jan 19, 2018 at 2:07 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
Well, I haven't said it has to be single-threaded like pgbouncer. I
don't see why the bgworker could not use multiple threads internally (of
course, it'd need to be careful not to mess with the stuff that is not thread-safe).
Certainly an architecture with N scheduling bgworkers and M executors (backends) may be more flexible
than a solution where scheduling is done in the executor itself. But we will have to pay extra cost for the redirection.
I am not sure that it will ultimately allow better performance to be reached.
A more flexible solution in many cases doesn't mean a more efficient solution.
I think you can take the best of both worlds.
You can take your approach of passing around fds, and build a "load balancing protocol" in a bgworker.
The postmaster sends the socket to the bgworker, the bgworker waits for a command as pgbouncer does, but instead of proxying everything, when commands arrive, it passes the socket to a backend to handle.
That way, the bgworker can do what pgbouncer does, handle different pooling modes, match backends to databases, etc, but it doesn't have to proxy all data, it just delegates handling of a command to a backend, and forgets about that socket.
Sounds like it could work.
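[Editor's note: a rough sketch of that delegation loop follows, under the assumption that descriptors are passed with the SCM_RIGHTS trick shown earlier. pick_free_backend() and the backend-side path that returns the descriptor are assumed helpers, left out here.]

#include <poll.h>

extern int send_fd(int chan, int fd);   /* SCM_RIGHTS helper, as sketched above */
extern int pick_free_backend(void);     /* assumed: returns an idle backend's channel */

static void
scheduler_loop(struct pollfd *clients, int *nclients)
{
    for (;;)
    {
        if (poll(clients, *nclients, -1) <= 0)
            continue;
        for (int i = 0; i < *nclients; i++)
        {
            if (clients[i].revents & POLLIN)
            {
                /* Delegate: the backend reads and executes the command(s),
                 * then gives the descriptor back when the client goes idle. */
                send_fd(pick_free_backend(), clients[i].fd);
                clients[i] = clients[--(*nclients)];    /* forget this socket */
                i--;    /* re-examine the slot that was swapped in */
            }
        }
    }
}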
On 19.01.2018 20:03, Claudio Freire wrote:
> I'm sure pgbouncer can be improved. I've seen async code handle millions
> of packets per second (zmq), pgbouncer shouldn't be radically different.

With pgbouncer you will never be able to use prepared statements, which slows down simple queries almost twice (unless my patch with autoprepared statements is committed).

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 01/19/2018 06:07 PM, Konstantin Knizhnik wrote:
> Sorry, I do not understand it.
> You specify the size of the backend pool which will serve client
> sessions. The size of this pool is chosen to provide the best
> performance on the particular system and workload.
> So the number of backends will never exceed this optimal value, even in
> case of a "short peak".
> From my point of view, terminating backends when there are no active
> sessions is the wrong idea in any case; it was not a temporary decision
> just for the PoC.

That is probably true when there is just a single pool (for one database/user). But when there are multiple such pools, it forces you to keep the sum(pool_size) below max_connections. Which seems strange.

I do think the ability to evict backends after some timeout, or when there is pressure in other pools (different user/database), is rather useful.

> Certainly an architecture with N scheduling bgworkers and M executors
> (backends) may be more flexible than a solution where scheduling is done
> in the executor itself. But we will have to pay extra cost for the
> redirection.
> I am not sure that it will ultimately allow better performance to be
> reached. A more flexible solution in many cases doesn't mean a more
> efficient solution.

Sure, I wasn't really suggesting it's a clear win. I was responding to your argument that pgbouncer in some cases reaches 100% CPU utilization - that can be mitigated to a large extent by adding threads. Of course, the cost for the extra level of indirection is not zero.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/19/2018 06:13 PM, Claudio Freire wrote:
> That way, the bgworker can do what pgbouncer does, handle different
> pooling modes, match backends to databases, etc, but it doesn't have to
> proxy all data, it just delegates handling of a command to a backend,
> and forgets about that socket.
>
> Sounds like it could work.

How could it do all that without actually processing all the data? For example, how could it determine the statement/transaction boundaries?

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 01/19/2018 06:19 PM, Konstantin Knizhnik wrote:
> With pgbouncer you will never be able to use prepared statements, which
> slows down simple queries almost twice (unless my patch with
> autoprepared statements is committed).

I don't see why that wouldn't be possible? Perhaps not for prepared statements with the simple protocol, but I'm pretty sure it's doable for the extended protocol (which seems like a reasonable limitation).

That being said, I think it's a mistake to turn this thread into a pgbouncer vs. the world battle. I could name things that are possible only with a standalone connection pool - e.g. pausing connections and restarting the database without interrupting the clients. But that does not mean a built-in connection pool is not useful.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Jan 19, 2018 at 2:22 PM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> How could it do all that without actually processing all the data? For
> example, how could it determine the statement/transaction boundaries?
It only needs to determine statement/transaction start.
After that, it hands off the connection to a backend, and the backend determines when to give it back.
So instead of processing all the data, it only processes a tiny part of it.
On 01/19/2018 07:35 PM, Claudio Freire wrote:
> It only needs to determine statement/transaction start.
>
> After that, it hands off the connection to a backend, and the backend
> determines when to give it back.
>
> So instead of processing all the data, it only processes a tiny part of it.

How exactly would the backend "give back" the connection? The only way for the backend and pgbouncer to communicate is by embedding information in the data stream. Which means pgbouncer still has to parse it.

Furthermore, those are not the only bits of information pgbouncer may need. For example, if pgbouncer gets improved to handle prepared statements (which is likely) it'd need to handle PARSE/BIND/EXECUTE. And it already needs to handle SET parameters. And so on.

In any case, this discussion is somewhat off topic in this thread, so let's not hijack it.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 19.01.2018 20:28, Tomas Vondra wrote:
> I don't see why that wouldn't be possible? Perhaps not for prepared
> statements with the simple protocol, but I'm pretty sure it's doable for
> the extended protocol (which seems like a reasonable limitation).
>
> But that does not mean a built-in connection pool is not useful.

Sorry, I do not understand how the extended protocol can help to handle prepared statements without a shared prepared statement cache or built-in connection pooling.

The problem is that now in Postgres most of the caches, including the catalog cache, relation cache and prepared statement cache, are private to a backend. There is certainly one big advantage of such an approach: no need to synchronize access to the cache. But it seems to be the only advantage. And there are a lot of drawbacks: inefficient use of memory, a complex invalidation mechanism, incompatibility with connection pooling...

So there are three possible ways (maybe more, but I know only three):
1. Implement built-in connection pooling which will be aware of proper use of local caches. This is what I have implemented with the proposed approach.
2. Implicit autoprepare. Clients will not be able to use the standard Postgres prepare mechanism, but the executor will try to generate a generic plan for ordinary queries. My implementation of this approach is at the commitfest.
3. Global caches. This seems to be the best solution, but the most difficult to implement.

Actually I think that the discussion about the value of built-in connection pooling is very important.

Yes, external connection pooling is more flexible. It allows pooling to be performed either at the client side or at the server side (or even a combination of the two approaches). Also, external connection pooling for PostgreSQL is not limited to pgbouncer/pgpool. There are many frameworks maintaining their own connection pools, for example J2EE, jboss, hibernate, ... I have a feeling that about 70% of enterprise systems working with databases are written in Java and do connection pooling in their own way. So maybe embedded connection pooling is not needed for such applications... But what I have heard from many people is that Postgres' poor connection pooling is one of the main drawbacks of Postgres, complicating its usage in enterprise environments.

In any case, please find an updated patch with some code cleanup and more comments added.

-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 01/22/2018 05:05 PM, Konstantin Knizhnik wrote:
> Sorry, I do not understand how the extended protocol can help to handle
> prepared statements without a shared prepared statement cache or
> built-in connection pooling.

The extended protocol makes it easy for pgbouncer (or any other proxy) to identify prepared statements, so that it can track (a) which prepared statements a client defined, and (b) which prepared statements are defined on a connection. And then do something when a client gets assigned a connection missing some of those. I do not claim doing this would be trivial, but I don't see why it would be impossible. Of course, the built-in pool can handle this in different ways, as it has access to the internal caches.

> The problem is that now in Postgres most of the caches, including the
> catalog cache, relation cache and prepared statement cache, are private
> to a backend.

True. I wouldn't say it's a "problem" but it's certainly a challenge for certain features.

> There is certainly one big advantage of such an approach: no need to
> synchronize access to the cache. But it seems to be the only advantage.
> And there are a lot of drawbacks: inefficient use of memory, a complex
> invalidation mechanism, incompatibility with connection pooling...

Perhaps. I personally see the minimal synchronization as a quite valuable feature.

> Actually I think that the discussion about the value of built-in
> connection pooling is very important.

I agree, and I wasn't speaking against built-in connection pooling.

> I have a feeling that about 70% of enterprise systems working with
> databases are written in Java and do connection pooling in their own way.

True, but that does not really mean we don't need "our" connection pooling (built-in or not). The connection pools are usually built into the application servers, so each application server has its own independent pool. With larger deployments (a couple of application servers) that quickly causes problems with max_connections.

> But what I have heard from many people is that Postgres' poor connection
> pooling is one of the main drawbacks of Postgres, complicating its usage
> in enterprise environments.

Maybe. I'm sure there's room for improvement. That being said, when enterprise developers tell me PostgreSQL is missing some feature, 99% of the time it turns out they're doing something quite stupid.

> In any case, please find an updated patch with some code cleanup and
> more comments added.

OK, will look.

regards

-- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
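[Editor's note: a sketch of the proxy-side bookkeeping Tomas describes follows; all structures and helpers here are hypothetical illustrations, not pgbouncer code. The proxy remembers the raw Parse ('P') messages each client issued, and replays the missing ones when the client is handed a pooled server connection.]

#include <stdbool.h>
#include <stddef.h>

typedef struct Stmt
{
    char        *name;          /* statement name from the Parse message */
    char        *parse_msg;     /* raw Parse message, kept for replay */
    int          parse_len;
    struct Stmt *next;
} Stmt;

typedef struct { Stmt *stmts; } Client;             /* statements the client defined */
typedef struct { Stmt *stmts; int sock; } ServerConn; /* statements present on the connection */

extern bool has_stmt(Stmt *list, const char *name);     /* assumed helper */
extern void remember_stmt(Stmt **list, Stmt *s);        /* assumed helper */
extern int  send_all(int sock, const char *buf, int n); /* assumed helper */

/* Called when a client gets assigned a pooled server connection. */
static void
sync_prepared_statements(Client *c, ServerConn *s)
{
    for (Stmt *st = c->stmts; st != NULL; st = st->next)
    {
        if (!has_stmt(s->stmts, st->name))
        {
            /* replay the original Parse message on this connection */
            send_all(s->sock, st->parse_msg, st->parse_len);
            remember_stmt(&s->stmts, st);
        }
    }
}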
On Mon, Jan 22, 2018 at 06:51:08PM +0100, Tomas Vondra wrote:
> True, but that does not really mean we don't need "our" connection
> pooling (built-in or not). The connection pools are usually built into
> the application servers, so each application server has its own
> independent pool. With larger deployments (a couple of application
> servers) that quickly causes problems with max_connections.

I found this thread and the pthread thread very interesting. Konstantin, thank you for writing prototypes and giving us very useful benchmarks for ideas I thought I might never see.

As much as I would like to move forward with coding, I would like to back up and understand where we need to go with these ideas.

First, it looks like pthreads and a builtin pooler help mostly with 1000+ connections. It seems like you found that pthreads wasn't sufficient and the builtin pooler was better. Is that correct?

Is there anything we can do differently about allowing long-idle connections to reduce their resource usage, e.g. free their caches? Remove from PGPROC? Could we do it conditionally, e.g. only sessions that don't have open transactions or cursors?

It feels like user and db mismatches are always going to cause pooling problems. Could we actually exit and restart connections that have default session state?

Right now, if you hit max_connections, we start rejecting new connections. Would it make sense to allow an option to exit idle connections when this happens so new users can connect?

I know we have relied on external connection poolers to solve all the high connection problems but it seems there might be simple things we can do to improve matters. FYI, I did write a blog entry comparing external and internal connection poolers:

https://momjian.us/main/blogs/pgblog/2017.html#April_21_2017

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
On Sat, Jan 27, 2018 at 4:40 PM, Bruce Momjian <bruce@momjian.us> wrote:
On Mon, Jan 22, 2018 at 06:51:08PM +0100, Tomas Vondra wrote:
Right now, if you hit max_connections, we start rejecting new
connections. Would it make sense to allow an option to exit idle
connections when this happens so new users can connect?
A lot of users have bash scripts to check the system periodically and cancel idle connections to prevent other users from getting rejected by max connections. They do this on a timer, like if the session appears to be idle more than 10 minutes.
I know we have relied on external connection poolers to solve all the
high connection problems but it seems there might be simple things we
can do to improve matters. FYI, I did write a blog entry comparing
external and internal connection poolers:
Yes, that would be great.
The simplest thing sounds like a GUC that will automatically end a connection idle for X seconds.
Another option could be, as you suggested, Bruce: if a user would have been rejected because max_connections was already reached, then terminate the connection that has been idle the longest and allow the new connection in.
These would greatly improve the user experience, as most folks have to automate all of this themselves anyway.
Cheers,
Ivan
On Sun, Jan 28, 2018 at 02:01:07PM -0800, Ivan Novick wrote:
> The simplest thing sounds like a GUC that will automatically end a
> connection idle for X seconds.

Uh, we already have idle_in_transaction_session_timeout so we would just need a simpler version.

> Another option could be, as you suggested, Bruce: if a user would have
> been rejected because max_connections was already reached, then
> terminate the connection that has been idle the longest and allow the
> new connection in.
>
> These would greatly improve the user experience, as most folks have to
> automate all of this themselves anyway.

Plus the ability to auto-free resources like cached system tables if the backend is idle for a specified duration.

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
> The simplest thing sounds like a GUC that will automatically end a connection
> idle for X seconds.
Uh, we already have idle_in_transaction_session_timeout so we would just
need a simpler version.
Oh, I see it's in 9.6, AWESOME!
Cheers
On Sun, Jan 28, 2018 at 03:11:25PM -0800, Ivan Novick wrote:
> Oh, I see it's in 9.6, AWESOME!

In summary, the good news is that adding an idle-session-timeout GUC, a GUC to cancel idle connections when the max_connections limit is hit, and a GUC for idle connections to reduce their resource usage shouldn't be too hard to implement and will provide useful benefits.

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
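[Editor's note: for illustration, an idle-session-timeout GUC could be wired into the backend the same way idle_in_transaction_session_timeout is, via the existing timeout framework. Everything named below (the GUC, the pending flag, the hook points, the error code choice) is an assumption sketched against that framework, not an existing implementation.]

/* Hypothetical idle_session_timeout GUC, modeled on how
 * idle_in_transaction_session_timeout is handled in postgres.c. */
#include "postgres.h"
#include "miscadmin.h"
#include "storage/latch.h"
#include "utils/timeout.h"

int     IdleSessionTimeout = 0;                 /* assumed GUC, in ms */
static volatile sig_atomic_t IdleSessionTimeoutPending = false;
static TimeoutId idle_session_timeout_id;

static void
IdleSessionTimeoutHandler(void)
{
    /* runs in signal context: only set flags and poke the latch */
    IdleSessionTimeoutPending = true;
    InterruptPending = true;
    SetLatch(MyLatch);
}

/* During backend startup (assumed hook point):
 *     idle_session_timeout_id = RegisterTimeout(USER_TIMEOUT,
 *                                               IdleSessionTimeoutHandler);
 *
 * In the main loop, just before waiting for the next client command and
 * only while not inside a transaction (assumed placement):
 *     if (IdleSessionTimeout > 0)
 *         enable_timeout_after(idle_session_timeout_id, IdleSessionTimeout);
 *
 * And in the interrupt-processing path:
 *     if (IdleSessionTimeoutPending)
 *         ereport(FATAL,
 *                 (errcode(ERRCODE_ADMIN_SHUTDOWN),   -- illustrative errcode
 *                  errmsg("terminating connection due to idle-session timeout")));
 */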
On 28.01.2018 03:40, Bruce Momjian wrote:
> First, it looks like pthreads and a builtin pooler help mostly with
> 1000+ connections. It seems like you found that pthreads wasn't
> sufficient and the builtin pooler was better. Is that correct?
Brief answer is yes.
Pthreads allow minimizing per-connection overhead and make it possible to obtain better results for a large number of connections.
But there is a fundamental problem: a Postgres connection is a "heavyweight" object: each connection maintains its own private cache of the catalog, relations, temporary
table pages, prepared statements, ... So even though pthreads allow minimizing per-connection memory usage, that saving is negligible compared with all these connection-private
memory resources. It means that we still need to use connection pooling.
Pthreads provide two main advantages:
1. Simplified interaction between different workers: no need to use shared memory with its fixed size limitation and
the impossibility of using normal pointers for dynamic shared memory. Also no need to implement a specialized memory allocator for shared memory.
This makes implementation of parallel query execution and built-in connection pooling much easier.
2. Optimized virtual-to-physical address translation. There is no need to maintain a separate address space for each backend, so the TLB (translation lookaside buffer) becomes more efficient.
So it is not completely correct to consider session pooling as an alternative to pthreads.
Ideally these two approaches should be combined.
I think that the best approach is to switch to global (shared) caches for execution plans, catalog, ...

> Is there anything we can do differently about allowing long-idle
> connections to reduce their resource usage, e.g. free their caches?
> Remove from PGPROC? Could we do it conditionally, e.g. only sessions
> that don't have open transactions or cursors?
Most of the time these metadata caches are identical for all clients, so it is just a waste of memory and time to maintain them separately in each backend.
Certainly a shared cache requires some synchronization, which can be a point of contention and cause significant performance degradation.
But taking into account that metadata is updated much more rarely than data, I hope that using copy-on-write and atomic operations can help to solve these problems.
And it can give a lot of other advantages. For example, it will be possible to spend more time in the optimizer detecting the optimal execution plan, and to manually store plans for future use.
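[Editor's note: the copy-on-write idea can be illustrated with a tiny sketch: readers take the current immutable cache snapshot with a single atomic load, and a writer builds a modified copy and publishes it with a compare-and-swap, so readers never block. This is a generic C11-atomics sketch, not patch code; safe reclamation of superseded snapshots (reference counting, epochs) is deliberately omitted.]

#include <stdatomic.h>

typedef struct CatalogSnapshot CatalogSnapshot;     /* opaque, immutable */

static _Atomic(CatalogSnapshot *) current_snapshot;

CatalogSnapshot *
get_snapshot(void)
{
    /* lock-free read path: just an atomic pointer load */
    return atomic_load_explicit(&current_snapshot, memory_order_acquire);
}

void
publish_update(CatalogSnapshot *(*apply)(const CatalogSnapshot *))
{
    CatalogSnapshot *old, *new;

    do
    {
        old = atomic_load_explicit(&current_snapshot, memory_order_acquire);
        new = apply(old);       /* copy-on-write: build a modified copy */
        /* on CAS failure the discarded copy should also be freed; omitted */
    } while (!atomic_compare_exchange_weak_explicit(&current_snapshot,
                                                    &old, new,
                                                    memory_order_release,
                                                    memory_order_acquire));
}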
> It feels like user and db mismatches are always going to cause pooling
> problems. Could we actually exit and restart connections that have
> default session state?
Well, combining multiuser access and connection pooling is really a challenging problem.
I do not know the best solution for it now. It would be much simpler to find a solution with the pthreads model...
Most enterprise systems use pgbouncer or a similar connection pooler. In pgbouncer's statement/transaction pooling mode, access to the database is performed under the same user, so many existing systems are built on the assumption that the database is accessed in this manner.
Concerning "default session state": one of the main drawbacks of pgbouncer and other external poolers is that they do not allow the use of prepared statements.
And this leads to up to a two-times performance penalty on typical OLTP queries. One of the main ideas of built-in session pooling was to eliminate this limitation.
> Right now, if you hit max_connections, we start rejecting new
> connections. Would it make sense to allow an option to exit idle
> connections when this happens so new users can connect?

It will require changes in client applications, won't it? They should be ready for the connection to be dropped by the server at any moment.
I do not know whether it is possible to drop an idle connection and hide this fact from the client. In my implementation each session keeps the minimal information required for interaction with the client (session context). It includes the socket, struct Port and a session memory context which should be used instead of TopMemoryContext for session-specific data.
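[Editor's note: the session context Konstantin describes might look roughly like the struct below. The field names are illustrative; only Port and MemoryContext are existing Postgres types.]

#include "postgres.h"
#include "libpq/libpq-be.h"     /* struct Port */

typedef struct SessionContext
{
    pgsocket        sock;       /* client socket, kept open while idle */
    Port           *port;       /* authentication/connection info */
    MemoryContext   memctx;     /* used instead of TopMemoryContext for
                                 * session-specific allocations */
    char           *id;         /* e.g. appended to prepared statement names */
} SessionContext;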
> I know we have relied on external connection poolers to solve all the
> high connection problems but it seems there might be simple things we
> can do to improve matters. FYI, I did write a blog entry comparing
> external and internal connection poolers:
> https://momjian.us/main/blogs/pgblog/2017.html#April_21_2017
I completely agree with your arguments in this post.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Mon, Jan 29, 2018 at 11:57:36AM +0300, Konstantin Knizhnik wrote:
> It will require changes in client applications, won't it? They should be
> ready for the connection to be dropped by the server at any moment.
> I do not know whether it is possible to drop an idle connection and hide
> this fact from the client. In my implementation each session keeps the
> minimal information required for interaction with the client (session
> context). It includes the socket, struct Port and a session memory
> context which should be used instead of TopMemoryContext for
> session-specific data.

Yes, it would impact applications and you are right most applications could not handle that cleanly. It is probably better to look into freeing resources for idle connections instead and keep the socket open.

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce> Yes, it would impact applications and you are right most applications
Bruce> could not handle that cleanly.
I would disagree here.
We are discussing applications that produce "lots of idle" connections, aren't we? That typically comes from an application-level connection pool.
Most of the connection pools have a setting that would "validate" a connection in case it was not used for a certain period of time.
That plays nicely in case the server drops "idle, not in a transaction" connections.
Of course, there are cases when an application just grabs a connection from a pool and uses it in a non-transacted way (e.g. does some action once an hour and commits immediately). However, that kind of application would already face firewalls, etc. I mean the application should already be prepared to handle "network issues".
Bruce> It is probably better to look into
Bruce> freeing resources for idle connections instead and keep the socket open.

The application might expect the session-specific data to be present, so it might be even worse if the database deallocates everything but the TCP connection.
For instance: the application might expect the server-prepared statements to be there. Would you deallocate server-prepared statements for those "idle" connections? The app would just break. There's no way (currently) for the application to know that a statement expired unexpectedly.
Vladimir
On Mon, Jan 29, 2018 at 04:02:22PM +0000, Vladimir Sitnikov wrote:
> Most of the connection pools have a setting that would "validate" a
> connection in case it was not used for a certain period of time.
>
> That plays nicely in case the server drops "idle, not in a transaction"
> connections.

Well, we could have the connection pooler disconnect those, right?

> For instance: the application might expect the server-prepared
> statements to be there. Would you deallocate server-prepared statements
> for those "idle" connections? The app would just break. There's no way
> (currently) for the application to know that a statement expired
> unexpectedly.

I don't know what we would deallocate yet.

-- Bruce Momjian <bruce@momjian.us> http://momjian.us EnterpriseDB http://enterprisedb.com + As you are, so once was I. As I am, so you will be. + + Ancient Roman grave inscription +
Bruce>Well, we could have the connection pooler disconnect those, right?
I agree. Do you think we could rely on all the applications being configured in a sane way?
A fallback configuration at DB level could still be useful to ensure the DB keeps running in case multiple applications access it. It might be non-trivial to ensure proper configurations across all the apps.
What I do like is that the behaviour of dropping connections should already be accounted for by existing applications, so it should fit naturally with existing apps.
An alternative approach might be to dump the relevant resources of inactive sessions to disk, so a session could be recreated in case the connection is used again after a long pause (e.g. re-prepare all the statements); however, that sounds scary.
Vladimir
I have obtained more results with YCSB benchmark and built-in connection pooling.
Explanation of the benchmark and all results for vanilla Postgres and Mongo are available in Oleg Bartunov's presentation about JSON (at the end of the presentation):
http://www.sai.msu.su/~megera/postgres/talks/sqljson-pgconf.eu-2017.pdf
As you can see, Postgres shows a significant slowdown with an increasing number of connections in the case of conflicting updates.
Built-in connection pooling can somehow eliminate this problem:
Workload-B (5% of updates) ops/sec:
Session pool size / clients |    250 |    500 |    750 |   1000
                          0 | 151511 |  78078 |  48742 |  30186
                         32 | 522347 | 543863 | 546971 | 540462
                         64 | 736323 | 770010 | 763649 | 763358
                        128 | 245167 | 241377 | 243322 | 232482
                        256 | 144964 | 146723 | 149317 | 141049
Here the maximum is obtained near 70 backends, which corresponds to the number of physical cores on the target system.
But for workload A (50% of updates), optimum is achieved at much smaller number of backends, after which we get very fast performance degradation:
Session pool size | kops/sec
               16 |      220
               30 |      353
               32 |      362
               40 |      120
               70 |       53
              256 |       20
Here the maximum is reached at 32 backends, and with 70 backends performance is 6 times worse.
It means that it is difficult to find the optimal size of the session pool if we have a varying workload.
If we set it too large, then we get high contention on conflicting update queries; if it is too small, then we do not utilize all system resources on read-only or non-conflicting queries.
It looks like we have to do something with the Postgres locking mechanism, and maybe implement some contention-aware scheduler as described here:
http://www.vldb.org/pvldb/vol11/p648-tian.pdf
But this is a different story, not related to built-in connection pooling.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin>I have obtained more results with YCSB benchmark and built-in connection pooling
Could you provide more information on the benchmark setup you have used?
For instance: benchmark library versions, PostgreSQL client version, additional/default benchmark parameters.
Konstantin>Built-in connection pooling can somehow eliminate this problem
Can you please clarify how connection pooling eliminates slow down?
Is the case as follows?
1) The application updates multiple rows in a single transaction
2) There are multiple concurrent threads
3) The threads update the same rows at the same time
If that is the case, then the actual workload is different each time you vary connection pool size.
For instance, if you use 1 thread, then the writes become uncontended.
Of course, you might just use it as a "black box" workload, however I wonder if that kind of workload ever appears in real-life applications. I would expect applications to update the same row multiple times, however I would expect the app to be doing subsequent updates, not concurrent ones.
On the other hand, as you vary the pool size, the workload varies as well (the resulting database contents are different), so it looks like comparing apples to oranges.
Vladimir
On 01.02.2018 15:21, Vladimir Sitnikov wrote:
> Konstantin>I have obtained more results with YCSB benchmark and built-in connection pooling
> Could you provide more information on the benchmark setup you have used?
> For instance: benchmark library versions, PostgreSQL client version, additional/default benchmark parameters.
I am using the latest Postgres sources with the connection pooling patch applied.
I have not built YCSB myself; I used an existing installation.
To launch the tests I used the following YCSB command lines:
To load data:
YCSB_MAXRUNTIME=60 YCSB_OPS=1000000000 YCSB_DBS="pgjsonb-local" YCSB_CFG="bt" YCSB_CLIENTS="250" YCSB_WORKLOADS="load_a" ./ycsb.sh
To run test:
YCSB_MAXRUNTIME=60 YCSB_OPS=1000000000 YCSB_DBS="pgjsonb-local" YCSB_CFG="bt" YCSB_CLIENTS="250 500 750 1000" YCSB_WORKLOADS="run_a" ./ycsb.sh
$ cat config/pgjsonb-local.dat
db.driver=org.postgresql.Driver
db.url=jdbc:postgresql://localhost:5432/ycsb
db.user=ycsb
db.passwd=ycsb
db.batchsize=100
jdbc.batchupdateapi=true
table=usertable
Sorry, I am not sure that I completely understand your question.
> Konstantin>Postgres shows significant slow down with increasing number of connections in case of conflicting updates.
> Konstantin>Built-in connection pooling can somehow eliminate this problem
> Can you please clarify how connection pooling eliminates slow down?
The YCSB (Yahoo! Cloud Serving Benchmark) framework is essentially a multi-client benchmark which assumes a large number of concurrent requests to the database.
The requests themselves tend to be very simple (the benchmark emulates a key-value store).
In my tests I perform measurements for 250, 500, 750 and 1000 connections.
One of the main problems of Postgres is the significant degradation of performance in case of concurrent write access by multiple transactions to the same rows.
This is why the performance of pgbench and the YCSB benchmark degrades more than linearly with an increasing number of client connections, especially in the case of a Zipfian distribution
(which significantly increases the probability of conflicts).
Connection pooling makes it possible to fix the number of backends and serve almost any number of connections with that fixed set of backends.
So the results are almost the same for 250, 500, 750 and 1000 connections.
The problem is choosing the optimal number of backends.
For read-only pgbench the best results are achieved with 300 backends, for YCSB with 5% of updates with 70 backends, and for YCSB with 50% of updates with 30 backends.
So something definitely needs to be changed in the Postgres locking mechanism.
Connection pooling minimizes contention on resources and the performance degradation caused by such contention.
But unfortunately it is not a silver bullet that fixes all Postgres scalability problems.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin>I have not built YCSB myself; I used an existing installation.
Which pgjdbc version was in use?
Konstantin>One of the main problems of Postgres is the significant degradation of performance in case of concurrent write access by multiple transactions to the same rows.
I would consider that a workload "problem" rather than a PostgreSQL problem.
That is, if an application (e.g. YCSB) is trying to update the same rows in multiple transactions concurrently, then the outcome of such updates is likely to be unpredictable. Does it make sense?
At least, I do not see why Mongo would degrade in a different way there. Oleg's charts suggest that Mongo does not degrade there, so I wonder if we compare apples to apples in the first place.
Vladimir
On 01.02.2018 16:33, Vladimir Sitnikov wrote:
> Konstantin>I have not built YCSB myself; I used an existing installation.
> Which pgjdbc version was in use?
postgresql-9.4.1212.jar
> Konstantin>One of the main problems of Postgres is the significant degradation of performance in case of concurrent write access by multiple transactions to the same rows.
> I would consider that a workload "problem" rather than a PostgreSQL problem. That is, if an application (e.g. YCSB) is trying to update the same rows in multiple transactions concurrently, then the outcome of such updates is likely to be unpredictable. Does it make sense?
I can't agree with you.
Yes, there are workloads where updates are more or less local: clients mostly update their own private data.
But there are many systems with "shared" resources which are concurrently accessed by different users. They may just increment an access count or perform a deposit/withdrawal...
Just a simple example: consider that you have something like an AppStore, and there is some popular application which is bought by a lot of users.
From the DBMS point of view, a lot of clients perform concurrent updates of the same record.
So performance on such a workload is also very important. And unfortunately here Postgres loses in comparison with MySQL and most other DBMSes.
> At least, I do not see why Mongo would degrade in a different way there. Oleg's charts suggest that Mongo does not degrade there, so I wonder if we compare apples to apples in the first place.
Postgres locks tuples in a very inefficient way in case of high contention.
It first locks the buffer and checks whether the tuple is already locked by some other backend.
Then it tries to take a heavyweight lock on the tuple's ctid. If several processes are trying to update this tuple, all of them will be queued on this heavyweight tuple lock.
After getting the tuple lock, the backend waits on the XID of the transaction which updated the tuple.
Once the transaction that updated this tuple completes, Postgres unblocks the backends waiting for this transaction. It checks the status of the tuple, releases the tuple lock and wakes up one of the waiting clients.
Since Postgres uses MVCC, it creates a new version of the tuple on each update.
So the tuple all the clients were waiting for is no longer the last version of the tuple.
Depending on the isolation level, they either need to report an error (in case of repeatable read) or update the snapshot and repeat the search with the new snapshot...
and perform all the checks and locks mentioned above once again.
I hope it is clear from this brief and not very precise explanation that Postgres has to do a lot of redundant work if several clients are competing for the same tuple.
There is a well known rule that pessimistic locking is more efficient than optimistic locking in case of high contention.
So Postgres could provide better performance on this workload if it were more pessimistic:
take the lock not on the ctid (the identifier of a particular tuple version) but on the tuple's PK (primary key), and hold it till the end of the transaction (because until the transaction completes nobody else will be
able to update this tuple). This trick with locking the PK really helps to improve performance on this workload, but unfortunately it can not reverse the trend of performance degrading as the number of competing transactions grows. A minimal SQL sketch of this idea follows.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
> config/pgjsonb-local.dat
Do you use standard "workload" configuration values? (e.g. recordcount=1000, maxscanlength=100)
Could you share ycsb output (e.g. for workload a)?
I mean lines like
[TOTAL_GC_TIME], Time(ms), xxx
[TOTAL_GC_TIME_%], Time(%), xxx
>postgresql-9.4.1212.jar
Ok, you have relevant performance fixes then.
Konstantin>From the DBMS point of view, a lot of clients perform concurrent updates of the same record.
I thought YCSB updated *multiple rows* per transaction. It turns out all the default YCSB workloads update just one row per transaction. There is no batching, etc. Batch-related parameters are used at "DB initial load" time only.
Konstantin>Postgres locks tuples in a very inefficient way in case of high contention
Thank you for the explanation.
Vladimir
On 01.02.2018 23:28, Vladimir Sitnikov wrote:
> > config/pgjsonb-local.dat
> Do you use standard "workload" configuration values? (e.g. recordcount=1000, maxscanlength=100)
Yes, I used the default values for the workload. For example, workload A has the following settings:
# Yahoo! Cloud System Benchmark
# Workload A: Update heavy workload
# Application example: Session store recording recent actions
#
# Read/update ratio: 50/50
# Default data size: 1 KB records (10 fields, 100 bytes each, plus key)
# Request distribution: zipfian
recordcount=1000
operationcount=1000
workload=com.yahoo.ycsb.workloads.CoreWorkload
readallfields=true
readproportion=0.5
updateproportion=0.5
scanproportion=0
insertproportion=0
requestdistribution=zipfian
> Could you share ycsb output (e.g. for workload a)?
> I mean lines like
> [TOTAL_GC_TIME], Time(ms), xxx
> [TOTAL_GC_TIME_%], Time(%), xxx
$ cat results/last/run_pgjsonb-local_workloada_70_bt.out
[OVERALL], RunTime(ms), 60099.0
[OVERALL], Throughput(ops/sec), 50444.83269272367
[TOTAL_GCS_PS_Scavenge], Count, 6.0
[TOTAL_GC_TIME_PS_Scavenge], Time(ms), 70.0
[TOTAL_GC_TIME_%_PS_Scavenge], Time(%), 0.11647448376844872
[TOTAL_GCS_PS_MarkSweep], Count, 0.0
[TOTAL_GC_TIME_PS_MarkSweep], Time(ms), 0.0
[TOTAL_GC_TIME_%_PS_MarkSweep], Time(%), 0.0
[TOTAL_GCs], Count, 6.0
[TOTAL_GC_TIME], Time(ms), 70.0
[TOTAL_GC_TIME_%], Time(%), 0.11647448376844872
[READ], Operations, 1516174.0
[READ], AverageLatency(us), 135.802076146933
[READ], MinLatency(us), 57.0
[READ], MaxLatency(us), 23327.0
[READ], 95thPercentileLatency(us), 382.0
[READ], 99thPercentileLatency(us), 828.0
[READ], Return=OK, 1516174
[CLEANUP], Operations, 70.0
[CLEANUP], AverageLatency(us), 134.21428571428572
[CLEANUP], MinLatency(us), 55.0
[CLEANUP], MaxLatency(us), 753.0
[CLEANUP], 95thPercentileLatency(us), 728.0
[CLEANUP], 99thPercentileLatency(us), 750.0
[UPDATE], Operations, 1515510.0
[UPDATE], AverageLatency(us), 2622.6653258639008
[UPDATE], MinLatency(us), 86.0
[UPDATE], MaxLatency(us), 1059839.0
[UPDATE], 95thPercentileLatency(us), 1261.0
[UPDATE], 99thPercentileLatency(us), 87039.0
[UPDATE], Return=OK, 1515510
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Nikita Glukhov has added results of the YCSB benchmark with connection pooling to the common diagram (attached). As you can see, connection pooling provides stable results for any number of backends, except for Workload-E. I do not have an explanation for the performance degradation in this particular workload. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Konstantin>I do not have an explanation for the performance degradation in this particular workload.
A) Mongo Java Client uses a connection pool of 100 connections by default. That is, it does not follow "connection per client" (in YCSB terms), but is capped at 100 connections. I think it can be adjusted by adding ?maxPoolSize=100500 or ?maxpoolsize=100500 to the Mongo URL
I wonder if you could try to vary that parameter and see if it changes Mongo results.
B) There's a bug in JDBC client of YCSB (it might affect PostgreSQL results, however I'm not sure if the impact would be noticeable). The default configuration is readallfields=true, however Jdbc client just discards the results instead of accessing the columns. I've filed https://github.com/brianfrankcooper/YCSB/issues/1087 for that.
C) I might be missing something, however my local (Macbook) benchmarks show that PostgreSQL 9.6 somehow uses Limit->Sort->BitmapScan kind of plans.
I have picked a "bad" userid value via auto_explain.
Jdbc client uses prepared statements, so a single bind might spoil the whole thing causing bad plans for all the values afterwards.
Does it make sense to disable bitmap scan somehow?
For instance:
explain (analyze, buffers) select * From usertable where YCSB_KEY>='user884845140610037639' order by YCSB_KEY limit 100;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=320.99..321.24 rows=100 width=1033) (actual time=1.408..1.429 rows=100 loops=1)
Buffers: shared hit=140
-> Sort (cost=320.99..321.33 rows=135 width=1033) (actual time=1.407..1.419 rows=100 loops=1)
Sort Key: ycsb_key
Sort Method: quicksort Memory: 361kB
Buffers: shared hit=140
-> Bitmap Heap Scan on usertable (cost=9.33..316.22 rows=135 width=1033) (actual time=0.186..0.285 rows=167 loops=1)
Recheck Cond: ((ycsb_key)::text >= 'user884845140610037639'::text)
Heap Blocks: exact=137
Buffers: shared hit=140
-> Bitmap Index Scan on usertable_pkey (cost=0.00..9.29 rows=135 width=0) (actual time=0.172..0.172 rows=167 loops=1)
Index Cond: ((ycsb_key)::text >= 'user884845140610037639'::text)
Buffers: shared hit=3
Planning time: 0.099 ms
Execution time: 1.460 ms
vs
explain (analyze, buffers) select * From usertable where YCSB_KEY>='user184845140610037639' order by YCSB_KEY limit 100;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.28..89.12 rows=100 width=1033) (actual time=0.174..0.257 rows=100 loops=1)
Buffers: shared hit=102
-> Index Scan using usertable_pkey on usertable (cost=0.28..2154.59 rows=2425 width=1033) (actual time=0.173..0.246 rows=100 loops=1)
Index Cond: ((ycsb_key)::text >= 'user184845140610037639'::text)
Buffers: shared hit=102
Planning time: 0.105 ms
Execution time: 0.277 ms
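For what it's worth, a minimal way to test this hypothesis at the session level is the standard planner GUC enable_bitmapscan (it only discourages bitmap plans, it does not forbid them):

set enable_bitmapscan = off;
explain (analyze, buffers) select * From usertable where YCSB_KEY>='user884845140610037639' order by YCSB_KEY limit 100;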
Vladimir
Attached please find a new version of the built-in connection pooling patch, supporting temporary tables and session GUCs. Win32 support was also added. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
I'm a bit late to this thread, sorry if I'm slightly rehashing things. I'd like to go back to basics on this.
Unless I'm mistaken, at least in the Java and .NET world, clients are almost always expected to have their own connection pooling, either implemented inside the driver (ADO.NET model) or as a separate modular component (JDBC). This approach has a few performance advantages:
1. "Opening" a new pooled connection is virtually free - no TCP connection needs to be opened, no I/O, no startup packet, nothing (only a tiny bit of synchronization).
2. Important client state can be associated to physical connections. For example, prepared statements can be tracked on the physical connection, and persisted when the connection is returned to the pool. The next time the physical connection is returned from the pool, if the user tries to server-prepare a statement, we can check on the connection if it has already been prepared in a "previous lifetime", and if so, no need to prepare again. This is vital for scenarios with short-lived (pooled) connections, such as web. Npgsql does this.
Regarding the problem of idle connections being kept open by clients, I'd argue it's a client-side problem. If the client is using a connection pool, the pool should be configurable to close idle connections after a certain time (I think this is relatively standard behavior). If the client isn't using a pool, it seems to be the application's responsibility to release connections when they're no longer needed.
The one drawback is that the pooling is application-specific, so it can't be shared by multiple applications/hosts. So in some scenarios it may make sense to use both client pooling and proxy/server pooling.
To sum it up, I would argue that connection pooling should first and foremost be considered as a client feature, rather than a proxy feature (pgpool) or server feature (the PostgreSQL pooling being discussed here). This isn't to say server-side pooling has no value though.
On Fri, Feb 9, 2018 at 4:14 PM, Shay Rojansky <roji@roji.org> wrote:
Recently, I did a large amount of parallel data processing where the results were stored in PG. I had about 1000 workers each with their own PG connection. As you pointed out, application pooling doesn't make sense in this scenario. I tried pgpool and pgbouncer, and both ended up as the bottleneck. Overall throughput was not great but it was highest without a pooler. That aligns with Konstantin's benchmarks too. As far as I know, server pooling is the only solution to increase throughput, without upgrading hardware, for this use case.
I hope this PR gets accepted!
Attached please find a new version of the patch with several bug fixes, plus support for more than one session pool associated with different ports. Now it is possible to make the postmaster listen on several ports for accepting pooled connections, while leaving the main Postgres port for dedicated backends. Each session pool is intended to be used for a particular database/user combination. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 06.04.2018 20:00, Konstantin Knizhnik wrote: > Attached please find a new version of the patch with several bug fixes, > plus support for more than one session pool associated with different ports. Sorry, wrong patch was attached. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Development in built-in connection pooling will be continued in https://github.com/postgrespro/postgresql.builtin_pool.git
I am not going to send new patches to the hackers mailing list anymore.
The last added feature is support for idle_in_transaction_session_timeout, which is especially critical for the built-in pool with transaction-level scheduling, because a long transaction can block other sessions scheduled on the same backend.
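For illustration, the timeout can be set per session or in postgresql.conf; the value here is arbitrary:

-- terminate any session that sits idle inside an open transaction for
-- more than 10 seconds, freeing the pooled backend it occupies:
SET idle_in_transaction_session_timeout = '10s';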
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Fri, Apr 13, 2018 at 2:59 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> Development in built-in connection pooling will be continued in https://github.com/postgrespro/postgresql.builtin_pool.git
> I am not going to send new patches to the hackers mailing list anymore.
Why?
On 13.04.2018 19:07, Nikolay Samokhvalov wrote:
> On Fri, Apr 13, 2018 at 2:59 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> > Development in built-in connection pooling will be continued in https://github.com/postgrespro/postgresql.builtin_pool.git
> > I am not going to send new patches to the hackers mailing list anymore.
> Why?
I just do not want to spam hackers with a lot of patches.
Also, since I have received little feedback in this thread, I assume this topic is not of much interest to the community.
Please note that the built-in connection pool is in the conn_pool branch.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Understood.
One more question. Have you considered creating the pooling tool as a separate, not built-in, tool shipped with Postgres, like psql is shipped in packages usually called "postgresql-client-XX", which makes psql the default tool to work in a terminal? I constantly hear the opinion from various users that Postgres needs a "default"/official pooling tool.
On 17.04.2018 20:09, Nikolay Samokhvalov wrote:
There were a lot of discussions in hackers and in other mailing lists/forums concerning PostgreSQL and connection pooling.
From the point of view of many PostgreSQL users I know, the lack of standard (built-in?) connection pooling is one of the main drawbacks of PostgreSQL.
Right now we have pgbouncer, which is small, fast and reliable, but:
- Doesn't allow you to use prepared statements, temporary tables and session variables.
- Is single-threaded, so it becomes a bottleneck for a large (>100) number of active connections.
- Cannot be used for load balancing across hot standby replicas.
So if you have a lot of active connections, you will have to set up a pool of pgbouncers.
There is also pgpool, which supports load balancing but doesn't perform session pooling, so it has to be used together with pgbouncer.
So to be able to use Postgres in an enterprise system, you will have to set up a very complex pipeline of different tools.
Definitely we need some standard solution for it. As far as I know, Yandex is now working on their own external connection pooler which can eliminate the single-threaded limitation of pgbouncer. Unfortunately their presentation was not accepted for pgconf (as well as my presentation about built-in connection pooling).
An external connection pooler definitely provides more flexibility than a built-in one. It can be installed at the client side, at the server side, or somewhere between them.
Also it is more reliable, because it changes nothing in the Postgres architecture.
But there are still use cases which can not be covered by an external connection pooler.
The 1C company (a Russian SAP) mentioned in a presentation at PgConf.ru 2018 that the lack of internal pooling is the main limiting factor for replacing MS SQL with Postgres.
They have a lot of clients which never close connections. And they need persistent sessions because of their wide use of temporary tables.
This is why 1C can not use pgbouncer. We are now trying to provide them with a prototype version of Postgres with a built-in connection pool.
If the results of these experiments are successful, we will propose this connection pooler to the community (it is available right now, so anybody who wants can test it).
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 18/04/18 06:10, Konstantin Knizhnik wrote: > But there are still use cases which can not be covered by an external > connection pooler. Can you name some? I understand that the existing external connection poolers all have their limitations. But are there some fundamental issues that can *only* be addressed by a built-in implementation? For the record, I think an internal connection pool might be a good idea. It would presumably be simpler to set up than an external one, for example. But it depends a lot on the implementation. If we had an internal connection pool, I would expect it to be very transparent to the user, be simple to set up, and not have annoying limitations with prepared statements, temporary tables, etc. that the existing external ones have. However, I suspect that dealing with *all* of the issues is going to be hard and tedious. And if there are any significant gaps, things that don't work correctly with the pooler, the patch will almost certainly be rejected. I'd recommend that you put your effort in improving the existing external connection poolers. Which one is closest to suit your needs? What's missing? There are probably things we could do in the server, to help external connection poolers. For example, some kind of a proxy authentication, where the connection pooler could ask the backend to do authentication on its behalf, so that you wouldn't need to re-implement the server-side authentication code in the external pooler. Things like that. - Heikki
On 18.04.2018 13:36, Heikki Linnakangas wrote:
> On 18/04/18 06:10, Konstantin Knizhnik wrote:
> > But there are still use cases which can not be covered by an external connection pooler.
> Can you name some? I understand that the existing external connection poolers all have their limitations. But are there some fundamental issues that can *only* be addressed by a built-in implementation?
Well, maybe I missed something, but I do not know how to efficiently support
1. Temporary tables
2. Prepared statements
3. Session GUCs
with any external connection pooler (with a pooling level other than session).
The problem with GUCs seems to be the easiest of these three: we can just keep a list of GUC assignments and prepend it to each statement. But this is not very efficient and can cause problems (for example, some statements can not be executed in a multi-statement context). A sketch of the idea is shown below.
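A minimal sketch of that replay, with invented names: the pool remembers the session's SET history and replays it in front of the client's next statement when the session migrates to another backend:

SET application_name = 'client_42';  -- replayed session GUC (value is hypothetical)
SET search_path = app, public;       -- the 'app' schema is invented for illustration
SELECT * FROM usertable WHERE ycsb_key = 'user123';  -- the actual client statement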
The prepared statement problem can be fixed either by implementing a shared plan cache or by autoprepare (I have proposed a patch for it).
But concerning temporary tables, I do not know of any acceptable solution.
> For the record, I think an internal connection pool might be a good idea. It would presumably be simpler to set up than an external one, for example. But it depends a lot on the implementation. If we had an internal connection pool, I would expect it to be very transparent to the user, be simple to set up, and not have annoying limitations with prepared statements, temporary tables, etc. that the existing external ones have.
> However, I suspect that dealing with *all* of the issues is going to be hard and tedious. And if there are any significant gaps, things that don't work correctly with the pooler, the patch will almost certainly be rejected.
> I'd recommend that you put your effort in improving the existing external connection poolers. Which one is closest to suit your needs? What's missing?
The Yandex team is following this approach with their Odysseus (a multithreaded version of pgbouncer with many of pgbouncer's issues fixed).
But it will not work for 1C, which needs to keep sessions (with temporary tables, etc.) for a large number of clients that never close their connections.
> There are probably things we could do in the server, to help external connection poolers. For example, some kind of a proxy authentication, where the connection pooler could ask the backend to do authentication on its behalf, so that you wouldn't need to re-implement the server-side authentication code in the external pooler. Things like that.
As far as I know, most DBMSes have some kind of internal connection pooling.
In Oracle, for example, you can create dedicated and non-dedicated backends.
I wonder why we do not want to have something similar in Postgres.
Any external connection pooler will be less convenient for users than an internal pooler.
It may be more flexible, more protected against errors, more scalable, .... But it is still an extra entity which adds extra overhead and can also become a bottleneck or a SPoF.
-- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 18 April 2018 at 19:52, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote: > As far as I know, most DBMSes have some kind of internal connection > pooling. > In Oracle, for example, you can create dedicated and non-dedicated backends. > I wonder why we do not want to have something similar in Postgres. I want to, and I know many others do. But the entire PostgreSQL architecture makes it hard to do well, and means it requires heavy changes to do it in a way that will be maintainable and reliable. Making it work, and making something maintainable and mergeable, are two different things. Something I continue to struggle with myself. -- Craig Ringer http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Apr 18, 2018 at 02:52:39PM +0300, Konstantin Knizhnik wrote: > The Yandex team is following this approach with their Odysseus > (a multithreaded version of pgbouncer with many of pgbouncer's issues fixed). Have they opened the source to Odysseus? If not, do they have plans to? Best, David. -- David Fetter <david(at)fetter(dot)org> http://fetter.org/ Phone: +1 415 235 3778 Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On 18.04.2018 16:09, Craig Ringer wrote: > But the entire PostgreSQL architecture makes it hard to do well, and > means it requires heavy changes to do it in a way that will be > maintainable and reliable. > > Making it work, and making something maintainable and mergeable, are > two different things. Something I continue to struggle with myself. Here I completely agree with you. Now my prototype "works": it is able to correctly handle errors, transaction rollbacks, long-lived transactions,... but I am quite sure that there are a lot of untested cases where it will work incorrectly. But still, I do not think that making built-in connection pooling really reliable is unreachable. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 18.04.2018 16:24, David Fetter wrote: > Have they opened the source to Odysseus? If not, do they have plans to? It is better to ask Vladimir Borodin (Yandex) about it. But as far as I know, the answer is yes. The Yandex policy is to make their products available to the community. I just wonder why the community was not interested in the details of this project... -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 18/04/18 07:52, Konstantin Knizhnik wrote: > Well, maybe I missed something, but I do not know how to efficiently > support > 1. Temporary tables > 2. Prepared statements > 3. Session GUCs > with any external connection pooler (with a pooling level other than session). Me neither. What makes it easier to do these things in an internal connection pooler? What could the backend do differently, to make these easier to implement in an external pooler? - Heikki
On 18.04.2018 16:41, Heikki Linnakangas wrote: > Me neither. What makes it easier to do these things in an internal > connection pooler? What could the backend do differently, to make > these easier to implement in an external pooler? All these things are addressed now in my built-in connection pool implementation: 1. Temporary tables are maintained by creating a private temporary namespace for each session. 2. Prepared statements are supported by adding a unique session prefix to each prepared statement name (so there is a single prepared statement cache in the backend, but each session has its own prepared statements; a sketch follows below). 3. Each session maintains a list of updated GUCs, and they are saved/restored on reschedule. It was not so difficult to implement all this stuff (the main troubles I had were with GUCs), but it looks like none of this is possible for an external connection pooler. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
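To illustrate point 2 above: a minimal sketch of the session-prefix scheme (the statement names and prefix format are invented here, not taken from the patch). Both pooled sessions believe they prepared a statement named "sel", but the backend registers session-qualified names, so they do not clash:

PREPARE s7_sel (text) AS SELECT * FROM usertable WHERE ycsb_key = $1;
PREPARE s8_sel (text) AS SELECT * FROM usertable WHERE ycsb_key = $1;
EXECUTE s7_sel('user123');  -- session 7's "EXECUTE sel" is routed to its prefixed name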
On 18 Apr 2018, at 16:24, David Fetter <david@fetter.org> wrote:
> On Wed, Apr 18, 2018 at 02:52:39PM +0300, Konstantin Knizhnik wrote:
>> The Yandex team is following this approach with their Odysseus
>> (a multithreaded version of pgbouncer with many of pgbouncer's issues
>> fixed).
> Have they opened the source to Odysseus? If not, do they have plans to?
No, we haven't yet. Yep, we plan to do it by the end of May.
From: Konstantin Knizhnik [mailto:k.knizhnik@postgrespro.ru] > In Oracle, for example, you can create dedicated and non-dedicated backends. > I wonder why we do not want to have something similar in Postgres. Yes, I want it, too. In addition to dedicated and shared server processes, Oracle provides Database Resident Connection Pooling (DRCP). I guessed you were inspired by this. https://docs.oracle.com/cd/B28359_01/server.111/b28310/manproc002.htm#ADMIN12348 BTW, you are doing various great work -- autoprepare, multithreaded Postgres, built-in connection pooling, etc. etc., aren't you? Are you doing all of these alone? Regards Takayuki Tsunakawa
On 19.04.2018 07:46, Tsunakawa, Takayuki wrote: > Yes, I want it, too. In addition to dedicated and shared server processes, Oracle provides Database Resident Connection Pooling (DRCP). I guessed you were inspired by this. > https://docs.oracle.com/cd/B28359_01/server.111/b28310/manproc002.htm#ADMIN12348 It seems that my connection pooling is closer to DRCP than to shared servers. It is not clear from this article what the 35KB per client connection are used for... It seems to be something similar to the session context used to suspend/resume a session. In my prototype I also maintain some per-session context to keep values of session-specific GUCs, the temporary namespace, ... Definitely the pooled session memory footprint depends on the size of the catalog, prepared statements, updated GUCs,... but 10-100kb seems to be a reasonable estimation. > BTW, you are doing various great work -- autoprepare, multithreaded Postgres, built-in connection pooling, etc. etc., aren't you? Are you doing all of these alone? Yes, but there is a huge distance from a prototype to a product-ready solution. And I definitely need some help here. This is why I have to suspend further development of the multithreaded version of Postgres (it looks like the community does not consider it a realistic project). But with built-in connection pooling the situation is better, and I am going to test it with some of our clients who are interested in this feature. -- Konstantin Knizhnik Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On Thu, Apr 19, 2018, 9:24 AM Konstantin Knizhnik, <k.knizhnik@postgrespro.ru> wrote:
It would be useful to test with the JDBC driver
We run into issues with many pool implementations due to our opinionated nature
Thanks
Dave
On 2018-04-18 06:36:38 -0400, Heikki Linnakangas wrote: > I'd recommend that you put your effort in improving the existing external > connection poolers. Which one is closest to suit your needs? What's missing? > > There are probably things we could do in the server, to help external > connection poolers. For example, some kind of a proxy authentication, where > the connection pooler could ask the backend to do authentication on its > behalf, so that you wouldn't need to re-implement the server-side > authentication code in the external pooler. Things like that. FWIW, I think that's not the right course. We should work towards an in-core pooler. There's very few postgres installations that don't need one, and there's a lot of things that are very hard to do without closer integration. Greetings, Andres Freund
Greetings, * Andres Freund (andres@anarazel.de) wrote: > FWIW, I think that's not the right course. We should work towards an > in-core pooler. There's very few postgres installations that don't need > one, and there's a lot of things that are very hard to do without closer > integration. I tend to agree with this and things like trying to proxy authentication are really not ideal, since it involves necessarily trusting another system. Perhaps it'd be nice to be able to proxy auth cleanly, and in some cases it may be required to have another system involved (I've certainly seen cases of multi-layered pgbouncer), but I'd rather only do that when we need to instead of almost immediately... Thanks! Stephen
Stephen Frost <sfrost@snowman.net> writes: > * Andres Freund (andres@anarazel.de) wrote: >> FWIW, I think that's not the right course. We should work towards an >> in-core pooler. There's very few postgres installations that don't need >> one, and there's a lot of things that are very hard to do without closer >> integration. > I tend to agree with this and things like trying to proxy authentication > are really not ideal, since it involves necessarily trusting another > system. FWIW, I concur with Heikki's position that we're going to have very high standards for the transparency of any in-core pooler. Before trying to propose a patch, it'd be a good idea to try to fix the perceived shortcomings of some existing external pooler. Only after you can say "there's nothing wrong with this that isn't directly connected to its not being in-core" does it make sense to try to push the logic into core. regards, tom lane
On Thu, 19 Apr 2018 at 10:27, Dave Cramer <davecramer@gmail.com> wrote: > It would be useful to test with the JDBC driver > We run into issues with many pool implementations due to our opinionated nature Absolutely. And Java developers frequently have a further opinionated nature on this... A bunch of Java frameworks include connection pools... 1. BoneCP, which claims to be the "tuned to be fast" one http://jolbox.com/ 2. Apache Commons DBCP, which is the "we're Apache, we're everywhere!" one http://commons.apache.org/dbcp/ 3. c3p0 - easy-to-add Connection Pool http://www.mchange.com/projects/c3p0 One of the things that they find likable is that by having the connection pool live in the framework alongside the application is that this makes it easy to attach hooks so that the pool can do intelligent things based on application-aware logic. When we're in "DB server centric" mindset, we'll have some particular ideas as to what a pool ought to be able to do; if that doesn't include their ideas at all, it'll lead to the Java guys thinking that what we have is quaint and uninteresting. I suspect that this disconnect and somewhat "great divide" is an extra reason why proposals to bring connection pooling into core don't get too far. -- When confronted by a difficult problem, solve it by reducing it to the question, "How would the Lone Ranger handle this?"
On 2018-04-19 15:01:24 -0400, Tom Lane wrote: > Only after you can say "there's nothing wrong with this that isn't > directly connected to its not being in-core" does it make sense to try > to push the logic into core. I think there's plenty things that don't really make sense solving outside of postgres: - additional added hop / context switches due to external pooler - temporary tables - prepared statements - GUCs and other session state I think there's at least one thing that we should attempt to make easier for external pooler: - proxy authorization I think in an "ideal world" there's two kinds of poolers: Dumb ones further out from the database (for short lived processes, keeping the total number of connections sane, etc) and then more intelligent one closer to the database. Greetings, Andres Freund
> I think there's plenty things that don't really make sense solving > outside of postgres: > - additional added hop / context switches due to external pooler This only applies to external-process-type poolers (like Pgpool-II). > - temporary tables > - prepared statements > - GUCs and other session state These only apply to "non session based" poolers, i.e. ones sharing a database connection among multiple client connections. "Session based" connection poolers like Pgpool-II do not have these shortcomings. One thing that neither a built-in nor an application-library pooler (like JDBC) can do is handle multiple PostgreSQL servers. > I think there's at least one thing that we should attempt to make > easier for external pooler: > - proxy authorization Yeah. Since SCRAM auth is implemented, some connection poolers including Pgpool-II are struggling to adopt it. Another thing PostgreSQL could do to make external poolers' lives easier is to enhance the frontend/backend protocol so that the reply messages of prepare etc. include portal/statement info. But apparently this needs protocol changes. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
On Fri, Apr 20, 2018 at 07:58:00AM +0900, Tatsuo Ishii wrote: > Yeah. Since SCRAM auth is implemented, some connection poolers > including Pgpool-II are struggling to adopt it. Er, well. pgpool is also taking advantage of MD5 weaknesses... While SCRAM fixes this class of problems, and channel binding actually makes this harder for poolers to deal with. -- Michael
> On Fri, Apr 20, 2018 at 07:58:00AM +0900, Tatsuo Ishii wrote: >> Yeah. Since SCRAM auth is implemented, some connection poolers >> including Pgpool-II are struggling to adopt it. > > Er, well. pgpool is also taking advantage of MD5 weaknesses... While > SCRAM fixes this class of problems, and channel binding actually makes > this harder for poolers to deal with. One of the Pgpool-II developers, Usama, is working hard to re-implement SCRAM auth for the upcoming Pgpool-II 4.0: i.e. storing passwords (of course in some encrypted form) in Pgpool-II. Best regards, -- Tatsuo Ishii SRA OSS, Inc. Japan English: http://www.sraoss.co.jp/index_en.php Japanese:http://www.sraoss.co.jp
Christopher>One of the things that they find likable is that by having the connection pool live in the framework alongside the application is that this makes it easy to attach hooks so that the pool can do intelligent things based on application-aware logic.
I'm afraid I do not follow you. Can you please provide an example?
TL;DR:
1) I think in-application pooling would be required for performance reasons in any case.
2) Out-of-application pooling (in-backend or in-the-middle) is likely needed as well
JDBC clients use client-side connection pooling for performance reasons:
1) Connection setup does have overhead:
1.1) TCP connection takes time to init/close
1.2) Startup queries involve a couple of roundtrips: "startup packet", then "SET extra_float_digits = 3", then "SET application_name = '...' "
2) Binary formats on the wire are tied to oids. Clients have to cache the oids somehow, and "cache per connection" is the current approach.
3) Application threads tend to augment "application_name", "search_path", etc. for their own purposes, and it would slow the application down significantly if the JDBC driver reverted application_name/search_path/etc for each and every "connection borrow".
4) I believe there's non-zero overhead for backend process startup
As Konstantin lists in the initial email, the problem is that the backend itself does not scale well to lots of backend processes.
In other words: it is fine if PostgreSQL is accessed by a single Java application since the number of connections would be reasonable (limited by the Java connection pool).
That, however, is not the case when the DB is accessed by lots of applications (== lots of idle connections) and/or when the application uses short-lived connections (== the in-app pool is missing, which forces backend processes to come and go).
Vladimir
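For a rough sense of the setup cost in point 1, here is a minimal libpq sketch (C rather than JDBC, but the roundtrips are the same) that times one full connection establishment; the connection string is an assumption to adjust for your environment:

/*
 * Minimal sketch: time a single PQconnectdb() to see the per-connection
 * setup cost (TCP + startup packet + auth roundtrips) that client-side
 * pools exist to amortize.
 */
#include <stdio.h>
#include <time.h>
#include <libpq-fe.h>

int
main(void)
{
    struct timespec t0, t1;

    clock_gettime(CLOCK_MONOTONIC, &t0);
    PGconn *conn = PQconnectdb("dbname=postgres");  /* assumed connstring */
    clock_gettime(CLOCK_MONOTONIC, &t1);

    if (PQstatus(conn) != CONNECTION_OK)
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
    else
        printf("connect took %.3f ms\n",
               (t1.tv_sec - t0.tv_sec) * 1e3 +
               (t1.tv_nsec - t0.tv_nsec) / 1e6);
    PQfinish(conn);
    return 0;
}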
> Development of built-in connection pooling will be continued in https://github.com/postgrespro/postgresql.builtin_pool.git
The branch (as of 0020c44195992c6dce26baec354a5e54ff30b33f) passes pgjdbc tests: https://travis-ci.org/vlsi/pgjdbc/builds/368997672
Current tests are mostly single-threaded, so the tests are unlikely to trigger lots of "concurrent connection" uses.
The next step might be to create multiple schemas, and execute multiple tests in parallel.
Vladimir
On 19.04.2018 17:27, Dave Cramer wrote:
On Thu, Apr 19, 2018, 9:24 AM Konstantin Knizhnik, <k.knizhnik@postgrespro.ru> wrote:
On 19.04.2018 07:46, Tsunakawa, Takayuki wrote:
> From: Konstantin Knizhnik [mailto:k.knizhnik@postgrespro.ru]
>> In Oracle, for example, you can create dedicated and non-dedicated backends.
>> I wonder why we do not want to have something similar in Postgres.
> Yes, I want it, too. In addition to dedicated and shared server processes, Oracle provides Database Resident Connection Pooling (DRCP). I guessed you were inspired by this.
>
> https://docs.oracle.com/cd/B28359_01/server.111/b28310/manproc002.htm#ADMIN12348
It seems that my connection pooling is closer to DRCP than to shared
servers.
It is not clear from this article what these 35KB per client connection
are used for... It seems to be something similar to the session context
used to suspend/resume a session.
In my prototype I also maintain some per-session context to keep the
values of session-specific GUCs, the temporary namespace, ...
Certainly the pooled session memory footprint depends on the size of
the catalog, prepared statements, updated GUCs, ... but 10-100kb seems
to be a reasonable estimate.
>
> BTW, you are doing various great work -- autoprepare, multithreaded Postgres, built-in connection pooling, etc. etc., aren't you? Are you doing all of these alone?
Yes, but there is a huge distance from a prototype to a production-ready
solution. And I definitely need some help here. This is why I have to
suspend further development of the multithreaded version of Postgres
(it looks like it is not considered a realistic project by the
community).
But with built-in connection pooling the situation is better, and I am
going to test it with some of our clients which are interested in this
feature.
Konstantin

It would be useful to test with the JDBC driver. We run into issues with many pool implementations due to our opinionated nature.
I have tested the built-in connection pool with the YCSB benchmark, which is implemented in Java and so works through the JDBC driver.
Results were published in the following mail in this thread:
https://www.postgresql.org/message-id/7bbbb359-c582-7a08-5772-cb882988c0ae%40postgrespro.ru
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 20.04.2018 01:58, Tatsuo Ishii wrote:
>> I think there's plenty things that don't really make sense solving
>> outside of postgres:
>> - additional added hop / context switches due to external pooler
> This only applies to external process type poolers (like Pgpool-II).
>
>> - temporary tables
>> - prepared statements
>> - GUCs and other session state
> These only apply to "non session based" poolers, which share a
> database connection among multiple client connections. A "session
> based" connection pooler like Pgpool-II does not have these
> shortcomings.

But they do not solve the main problem: restricting the number of launched backends. Pgbouncer can also be used in session pooling mode, but that makes sense only if there is a limited number of clients which permanently connect/disconnect to the database, and I do not think that is a popular use case. Usually there is a very large number of connected clients which rarely drop their connections, but only a few of them are active at each moment in time.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 20.04.2018 03:14, Tatsuo Ishii wrote:
> One of the Pgpool-II developers, Usama, is working hard to
> re-implement SCRAM auth for the upcoming Pgpool-II 4.0: i.e. storing
> passwords (of course in some encrypted form) in Pgpool-II.

I just want to note that authentication is an area where I have no experience at all, so any suggestions or help in developing the right authentication mechanism for built-in connection pooling are welcome.

Right now, authentication of a pooled session by a shared backend is performed in the same way as by a normal (dedicated) Postgres backend: the postmaster just transfers the accepted socket to one of the workers (backends), which performs authentication in the normal way. This actually means that all sessions scheduled to the same worker should access the same database under the same user. Accepting connections to different databases/users is currently supported by making it possible to create several session pools and binding each session pool to its own port, at which the postmaster will accept connections for that session pool.

As an alternative approach I considered spawning a separate "authentication" process (or doing it in the postmaster) which would process the startup packet and only then schedule the session to one of the workers. But such a policy is much more difficult to implement, and it is unclear how to map database/user pairs to worker backends.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> On 20.04.2018 01:58, Tatsuo Ishii wrote:
>> These only apply to "non session based" poolers, which share a
>> database connection among multiple client connections. A "session
>> based" connection pooler like Pgpool-II does not have these
>> shortcomings.
> But they do not solve the main problem: restricting the number of
> launched backends.

Pgpool-II already does this. If the number of concurrent clients exceeds max_connections, the max_connections+1'th client has to wait until another client disconnects its session. So "restricting the number of launched backends" is a function independent of whether a "session based" connection pooler is used or not.

> Pgbouncer can also be used in session pooling mode, but that makes
> sense only if there is a limited number of clients which permanently
> connect/disconnect to the database, and I do not think that is a
> popular use case. Usually there is a very large number of connected
> clients which rarely drop their connections, but only a few of them
> are active at each moment in time.

Not necessarily. Session based poolers allow the use of temporary tables and prepared statements and keep GUCs and other session state, while non session based poolers do not.

So choosing a "session based pooler" or a "non session based pooler" is a trade-off: let the user choose one of them.

If you are willing to merge your connection pooler into core, I would suggest implementing both pool modes.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
On 20.04.2018 11:16, Tatsuo Ishii wrote:
>> But they do not solve the main problem: restricting the number of
>> launched backends.
> Pgpool-II already does this. If the number of concurrent clients
> exceeds max_connections, the max_connections+1'th client has to wait
> until another client disconnects its session. So "restricting the
> number of launched backends" is a function independent of whether a
> "session based" connection pooler is used or not.

Sorry, but delaying a new client connection until some other client disconnects is not an acceptable solution in most cases. Most customers want to provide connections to the database server for an unlimited (or at least > 100) number of clients. And these clients tend to keep their connections alive and do not disconnect after execution of each statement/transaction. In this case the approach with session pooling doesn't work.

> Not necessarily. Session based poolers allow the use of temporary
> tables and prepared statements and keep GUCs and other session state,
> while non session based poolers do not.
>
> So choosing a "session based pooler" or a "non session based pooler"
> is a trade-off: let the user choose one of them.
>
> If you are willing to merge your connection pooler into core, I would
> suggest implementing both pool modes.

Sorry, maybe we do not understand each other. There are the following facts:
1. There are some entities in Postgres which are local to a backend: temporary tables, GUCs, prepared statements, relation and catalog caches, ...
2. Postgres doesn't "like" a large number of backends, even if only a few of them are actually active. A large number of backends means a large procarray, large snapshots, ... Please refer to my measurements at the beginning of this thread, which illustrate how the performance of Postgres degrades with an increasing number of backends.
3. Session semantics (prepared statements, GUCs, temporary tables) can be supported only in session level pooling mode.
4. This mode is not acceptable in most cases because it is not possible to limit the number of clients which want to establish a connection with the database server, or keep it small. This is why most pgbouncer users are using statement pooling mode.
5. It doesn't matter how you manage to implement pooling outside Postgres: if you want to preserve session semantics, then you need to spawn as many backends as sessions, and the number of clients is limited by the number of backends/sessions.

The primary idea and main benefit of a built-in connection pooler is to support session semantics with a limited number of backends.
--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
>>> But they do not solve the main problem: restricting the number of
>>> launched backends.
>> Pgpool-II already does this. If the number of concurrent clients
>> exceeds max_connections, the max_connections+1'th client has to wait
>> until another client disconnects its session.
> Sorry, but delaying a new client connection until some other client
> disconnects is not an acceptable solution in most cases.

I just wanted to point out the counter-fact against this.

> Most customers want to provide connections to the database server for
> an unlimited (or at least > 100) number of clients. And these clients
> tend to keep their connections alive and do not disconnect after
> execution of each statement/transaction. In this case the approach
> with session pooling doesn't work.

I understand your customers like to have an unlimited number of connections. But my customers do not. (BTW, even with normal PostgreSQL, some of my customers are happily using over 1k, even 5k max_connections.)

> Sorry, maybe we do not understand each other. There are the following
> facts:
> 1. There are some entities in Postgres which are local to a backend:
> temporary tables, GUCs, prepared statements, relation and catalog
> caches, ...
> 2. Postgres doesn't "like" a large number of backends, even if only a
> few of them are actually active. A large number of backends means a
> large procarray, large snapshots, ...
> 3. Session semantics (prepared statements, GUCs, temporary tables)
> can be supported only in session level pooling mode.

I agree with 1 - 3.

> 4. This mode is not acceptable in most cases because it is not
> possible to limit the number of clients which want to establish a
> connection with the database server, or keep it small. This is why
> most pgbouncer users are using statement pooling mode.

Not sure about 4. I rarely see such users around me.
> 5. It doesn't matter how you manage to implement pooling outside
> Postgres: if you want to preserve session semantics, then you need to
> spawn as many backends as sessions, and the number of clients is
> limited by the number of backends/sessions.

Right. I am happy with the limitation for now.

> The primary idea and main benefit of a built-in connection pooler is
> to support session semantics with a limited number of backends.

I am confused. If so, why do you want to push a statement based or transaction based built-in connection pooler?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
On 20.04.2018 12:02, Tatsuo Ishii wrote:
> I understand your customers like to have an unlimited number of
> connections. But my customers do not. (BTW, even with normal
> PostgreSQL, some of my customers are happily using over 1k, even 5k
> max_connections.)

If you have a limited number of clients, then you do not need pooling at all. The only exception is if clients for some reason do not want to keep connections to the database server and prefer to establish a connection on demand and disconnect as soon as possible. But IMHO in most cases that means bad design of the client application, because establishing a connection (even with a connection pooler) is quite an expensive operation.

>> The primary idea and main benefit of a built-in connection pooler is
>> to support session semantics with a limited number of backends.
> I am confused. If so, why do you want to push a statement based or
> transaction based built-in connection pooler?

I want to provide session semantics but not start a dedicated backend for each session. Transaction level rescheduling (rather than statement level rescheduling) is used to avoid the complexity of storing/restoring transaction context and maintaining locks.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri., 20 Apr. 2018, 06:59 Andres Freund, <andres@anarazel.de> wrote:
On 2018-04-19 15:01:24 -0400, Tom Lane wrote:
> Only after you can say "there's nothing wrong with this that isn't
> directly connected to its not being in-core" does it make sense to try
> to push the logic into core.
I think there's plenty things that don't really make sense solving
outside of postgres:
- additional added hop / context switches due to external pooler
- temporary tables
- prepared statements
- GUCs and other session state
Totally agreed. Poolers can make some limited efforts there, but that's all.
Poolers also have a hard time determining whether a query is read-only or read/write, whereas Pg itself has a better chance, and we could help it along with function READONLY attributes if we wanted. This matters for master/standby query routing. Standbys being able to proxy for the master would be fantastic, but that isn't practical without some kind of pooler.
I think there's at least one thing that we should attempt to make
easier for external pooler:
- proxy authorization
Yes, very yes. I've raised this before in a limited form: SET SESSION AUTHORIZATION that cannot be reset without a cookie value. But true proxy auth would be better.
>> I understand your customers like to have an unlimited number of
>> connections. But my customers do not. (BTW, even with normal
>> PostgreSQL, some of my customers are happily using over 1k, even 5k
>> max_connections.)
>
> If you have a limited number of clients, then you do not need pooling
> at all.

A pooler is still needed even if the number of connections is low, because connecting to PostgreSQL is a very expensive operation, as everybody knows.

BTW, the main reason why Pgpool-II is used is not that it is a pooler, but query routing: write queries go to the primary server and read queries to the standbys. This is not possible with a built-in pooler.

>> I am confused. If so, why do you want to push a statement based or
>> transaction based built-in connection pooler?
>
> I want to provide session semantics but not start a dedicated backend
> for each session.
> Transaction level rescheduling (rather than statement level
> rescheduling) is used to avoid the complexity of storing/restoring
> transaction context and maintaining locks.

Not sure if that's acceptable for the community. Probably many developers want a built-in pooler to keep exactly the same semantics as normal connections.

Tom Lane wrote:
> FWIW, I concur with Heikki's position that we're going to have very high
> standards for the transparency of any in-core pooler. Before trying to
> propose a patch, it'd be a good idea to try to fix the perceived
> shortcomings of some existing external pooler. Only after you can say
> "there's nothing wrong with this that isn't directly connected to its
> not being in-core" does it make sense to try to push the logic into core.

So I would suggest you start with a session level in-core pooler, which would be much easier than a transaction level pooler to make transparent.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
19 апр. 2018 г., в 23:59, Andres Freund <andres@anarazel.de> написал(а):
I think there's plenty things that don't really make sense solving
outside of postgres:
- additional added hop / context switches due to external pooler
- temporary tables
- prepared statements
- GUCs and other session state
+1
I think there's at least one thing that we should attempt to make
easier for external pooler:
- proxy authorization
I suggested it here [1], but a fair number of people argued against it in that thread.
On Fri, Jan 19, 2018 at 11:59 AM, Tomas Vondra <tomas.vondra@2ndquadrant.com> wrote:
> Hmmm, that's unfortunate. I guess you'll have to process the startup
> packet in the main process, before it gets forked. At least partially.

I'm not keen on a design that would involve doing more stuff in the postmaster, because that would increase the chances of the postmaster accidentally dying, which is really bad. I've been thinking about the idea of having a separate "listener" process that receives connections, and that the postmaster can restart if it fails. Or there could even be multiple listeners if needed. When the listener gets a connection, it hands it off to another process that then "owns" that connection.

One problem with this is that the process that's going to take over the connection needs to get started by the postmaster, not the listener. The listener could signal the postmaster to start it, just like we do for background workers, but that might add a bit of latency. So what I'm thinking is that the postmaster could maintain a small (and configurably-sized) pool of preforked workers. That might be worth doing independently, as a way to reduce connection startup latency, although somebody would have to test it to see whether it really works... a lot of the startup work can't be done until we know which database the user wants.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
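The listener-to-worker hand-off described here rests on the standard Unix file-descriptor-passing primitive. A minimal, self-contained sketch of that mechanism (plain POSIX sendmsg()/recvmsg() with SCM_RIGHTS; not code from any patch in this thread):

/*
 * Pass an accepted client socket from a listener process to a worker
 * over a Unix-domain socket pair, using SCM_RIGHTS ancillary data.
 */
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

/* Listener side: send client_fd over channel_fd. */
static int
send_fd(int channel_fd, int client_fd)
{
    struct msghdr msg = {0};
    struct iovec iov;
    char dummy = 'F';               /* at least one data byte required */
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
    struct cmsghdr *cmsg;

    iov.iov_base = &dummy;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &client_fd, sizeof(int));

    return sendmsg(channel_fd, &msg, 0) < 0 ? -1 : 0;
}

/* Worker side: receive the fd; the kernel installs a duplicate. */
static int
recv_fd(int channel_fd)
{
    struct msghdr msg = {0};
    struct iovec iov;
    char dummy;
    union { char buf[CMSG_SPACE(sizeof(int))]; struct cmsghdr align; } u;
    struct cmsghdr *cmsg;
    int fd = -1;

    iov.iov_base = &dummy;
    iov.iov_len = 1;
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(channel_fd, &msg, 0) <= 0)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg && cmsg->cmsg_level == SOL_SOCKET &&
        cmsg->cmsg_type == SCM_RIGHTS)
        memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}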
On Wed, Apr 18, 2018 at 9:41 AM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> Well, maybe I missed something, but I do not know how to efficiently
>> support
>> 1. Temporary tables
>> 2. Prepared statements
>> 3. Session GUCs
>> with any external connection pooler (with pooling level other than
>> session).
>
> Me neither. What makes it easier to do these things in an internal
> connection pooler? What could the backend do differently, to make these
> easier to implement in an external pooler?

I think you and Konstantin are possibly failing to see the big picture here. Temporary tables, prepared statements, and GUC settings are examples of session state that users expect will be preserved for the lifetime of a connection and not beyond; all session state, of whatever kind, has the same set of problems. A transparent connection pooling experience means guaranteeing that no such state vanishes before the user ends the current session, and also that no such state established by some other session becomes visible in the current session. And we really need to account for *all* such state, not just really big things like temporary tables and prepared statements and GUCs but also much subtler things such as the state of the PRNG established by srandom().

This is really very similar to the problem that parallel query has when spinning up new worker backends. As far as possible, we want the worker backends to have the same state as the original backend. However, there's no systematic way of being sure that every relevant backend-private global, including perhaps globals added by loadable modules, is in exactly the same state. For parallel query, we solved that problem by copying a bunch of things that we knew were commonly used (cf. parallel.c) and by requiring functions to be labeled as parallel-restricted if they rely on any other state.

The problem for connection pooling is much harder. If you only ever ran parallel-safe functions throughout the lifetime of a session, then you would know that the session has no "hidden state" other than what parallel.c already knows about (except for any functions that are mislabeled, but we can say that's the user's fault for mislabeling them). But as soon as you run even one parallel-restricted or parallel-unsafe function, there might be a global variable someplace that holds arbitrary state which the core system won't know anything about. If you want to have some other process take over that session, you need to copy that state to the new process; if you want to reuse the current process for a new session, you need to clear that state. Since you don't know it exists or where to find it, and since the code to copy and/or clear it might not even exist, you can't.

In other words, transparent connection pooling is going to require some new mechanism, which third-party code will have to know about, for tracking every last bit of session state that might need to be preserved or cleared. That's going to be a big project. Maybe some of that can piggyback on existing infrastructure like InvalidateSystemCaches(), but there's probably still a ton of ad-hoc state to deal with. And no out-of-core pooler has a chance of handling all that stuff correctly; an in-core pooler will be able to do so only with a lot of work.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
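To make the shape of that "new mechanism" concrete: one could imagine a registration API, similar in spirit to on_proc_exit() callbacks, through which core code and extensions declare how to save, restore, and reset their private session state. Nothing like this exists in PostgreSQL today; every name below is hypothetical.

/*
 * Hypothetical sketch only: a way for any module owning backend-private
 * session state to make that state known to a pooler.  Modules would
 * register at load time; the pooler would walk all registrations on
 * every session switch, saving the outgoing session's state and
 * restoring (or resetting) for the incoming one.  Any module that fails
 * to register is exactly the "hidden state" problem described above.
 */
#include <stddef.h>

typedef struct SessionStateOps
{
    const char *name;                    /* e.g. "prng", "temp_namespace" */
    size_t (*estimate_size) (void);      /* bytes needed to serialize */
    void   (*save) (void *buf);          /* write state into buf */
    void   (*restore) (const void *buf); /* reinstall state from buf */
    void   (*reset) (void);              /* clear state for a new session */
} SessionStateOps;

/* Called once per module, typically from its init function. */
extern void RegisterSessionState(const SessionStateOps *ops);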
On Fri, Apr 20, 2018 at 11:40:59AM +0300, Konstantin Knizhnik wrote:
> There are the following facts:
> 1. There are some entities in Postgres which are local to a backend:
> temporary tables, GUCs, prepared statements, relation and catalog
> caches, ...
> 2. Postgres doesn't "like" a large number of backends, even if only a
> few of them are actually active. A large number of backends means a
> large procarray, large snapshots, ...

So, instead of trying to multiplex multiple sessions in a single operating system process, why don't we try to reduce the overhead of idle sessions that each have an operating system process? We already use procArray to reduce the number of _assigned_ PGPROC entries we have to scan. Why can't we create another array that only contains _active_ sessions, i.e. those in a transaction? In what places can procArray scans be changed to use this new array?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
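A sketch of what Bruce is proposing (hypothetical code, not from procarray.c): a densely packed side array holding only the procs currently in a transaction, maintained with O(1) swap-with-last removal. In real code both arrays would live in shared memory under ProcArrayLock, and that per-transaction churn is the maintenance cost the reply below warns about.

/*
 * Hypothetical "active sessions" array alongside the regular procarray.
 * Entries are proc numbers; removal swaps the last element into the
 * vacated slot so both add and remove are O(1).
 */
#include <stdlib.h>

static int *activeProcs;     /* densely packed proc numbers */
static int *activeProcSlot;  /* per-proc position in activeProcs, or -1 */
static int  numActiveProcs = 0;

static void
ActiveArrayInit(int max_backends)
{
    activeProcs = malloc(max_backends * sizeof(int));
    activeProcSlot = malloc(max_backends * sizeof(int));
    for (int i = 0; i < max_backends; i++)
        activeProcSlot[i] = -1;
}

/* Called at transaction start. */
static void
ActiveArrayAdd(int procno)
{
    activeProcSlot[procno] = numActiveProcs;
    activeProcs[numActiveProcs++] = procno;
}

/* Called at transaction end: swap-with-last removal. */
static void
ActiveArrayRemove(int procno)
{
    int slot = activeProcSlot[procno];
    int last = activeProcs[--numActiveProcs];

    activeProcs[slot] = last;   /* move last entry into the hole */
    activeProcSlot[last] = slot;
    activeProcSlot[procno] = -1;
}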
On Mon, Apr 23, 2018 at 7:59 PM, Bruce Momjian <bruce@momjian.us> wrote:
> So, instead of trying to multiplex multiple sessions in a single
> operating system process, why don't we try to reduce the overhead of
> idle sessions that each have an operating system process? We already
> use procArray to reduce the number of _assigned_ PGPROC entries we
> have to scan. Why can't we create another array that only contains
> _active_ sessions? In what places can procArray scans be changed to
> use this new array?

There are lots of places where scans would benefit, but the cost of maintaining the new array would be very high in some workloads, so I don't think you'd come out ahead overall. Feel free to code it up and test it, though.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Apr 23, 2018 at 09:47:07PM -0400, Robert Haas wrote:
> There are lots of places where scans would benefit, but the cost of
> maintaining the new array would be very high in some workloads, so I
> don't think you'd come out ahead overall. Feel free to code it up and
> test it, though.

Well, it would be nice if we knew exactly which scans are slow for a large number of idle sessions; then we could determine what criteria for that array would be beneficial --- that seems like the easiest place to start.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
On Mon, Apr 23, 2018 at 09:53:37PM -0400, Bruce Momjian wrote:
> Well, it would be nice if we knew exactly which scans are slow for a
> large number of idle sessions; then we could determine what criteria
> for that array would be beneficial --- that seems like the easiest
> place to start.

I guess my point is that if we are looking at trying to store all the session state in shared memory, so any process can resume it, we might as well see if we can find a way to more cheaply store the state in an idle process.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
On 23.04.2018 21:56, Robert Haas wrote:
> I've been thinking about the idea of having a separate "listener"
> process that receives connections, and that the postmaster can restart
> if it fails. Or there could even be multiple listeners if needed. When
> the listener gets a connection, it hands it off to another process
> that then "owns" that connection.

I agree that starting separate "listener" process(es) is the most flexible and scalable solution. I have not implemented this approach due to the problems with forking a new backend which you have mentioned. But certainly they can be addressed.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Mon, Apr 23, 2018 at 3:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> In other words, transparent connection pooling is going to require
> some new mechanism, which third-party code will have to know about,
> for tracking every last bit of session state that might need to be
> preserved or cleared. That's going to be a big project. ... And no
> out-of-core pooler has a chance of handling all that stuff correctly;
> an in-core pooler will be able to do so only with a lot of work.

Why does it have to be completely transparent? As long as the feature is optional (say, a .conf setting) the tradeoffs can be managed. It's reasonable to expect to exchange some functionality for pooling; pgbouncer provides a 'release' query (say, DISCARD ALL) to be called upon release back to the pool. Having session state objects (not all of which we are talking about; advisory locks and notifications deserve consideration) 'just work' would be wonderful, but that ought not to hold up other usages of the feature.

merlin
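For reference, the non-transparent 'release' approach Merlin mentions is simple to sketch. A minimal libpq-based pool hook (pool_release() and the free-list handling are hypothetical; DISCARD ALL is real SQL):

/*
 * Sketch of the explicit-reset model: when the application returns a
 * connection to the pool, wipe session state before reuse.  DISCARD ALL
 * drops prepared statements, temp tables, session GUCs, advisory locks,
 * and LISTEN registrations.  pool_release() is a hypothetical hook.
 */
#include <libpq-fe.h>

static void
pool_release(PGconn *conn)
{
    PGresult *res = PQexec(conn, "DISCARD ALL");

    PQclear(res);
    /* ...then put conn back on the pool's free list... */
}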
On Tue, Apr 24, 2018 at 9:52 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Why does it have to be completely transparent? As long as the feature
> is optional (say, a .conf setting) the tradeoffs can be managed.

Just my $0.02: I wouldn't take advantage of this feature as a user without it being transparent. I use too many of the features which would be affected by not maintaining the state. That's one of the reasons I only use an external JDBC pooler for my primary application, and plain ole connections for all of my secondary services which need to just work with temp tables, session variables, etc.

I'd love it if I could use one of those poolers (or a built-in one) which just magically increased performance for starting up connections, lowered the overhead of idle sessions, and didn't mess with session state. Short of that, I'll take the hit in performance and use more memory than I should with direct connections for now. Not sure how other users feel, but that's where I'm sitting for my use case.
On 23.04.2018 23:14, Robert Haas wrote:
> I think you and Konstantin are possibly failing to see the big picture
> here. Temporary tables, prepared statements, and GUC settings are
> examples of session state that users expect will be preserved for the
> lifetime of a connection and not beyond; all session state, of
> whatever kind, has the same set of problems. And we really need to
> account for *all* such state, not just really big things like
> temporary tables and prepared statements and GUCs but also much
> subtler things such as the state of the PRNG established by srandom().

It is not quite true that I have not realized these issues. In addition to connection pooling, I have also implemented a pthread version of Postgres in which static variables are replaced with thread-local variables, letting each thread use its own set of variables. Unfortunately, in connection pooling this approach can not be used.

But I think that performing scheduling at the transaction level will eliminate the problem with static variables in most cases. My expectation is that there are very few of them with session-level lifetime. Unfortunately it is not so easy to locate all such places. Once such variables are located, they can be saved in the session context and restored on reschedule.

More challenging are system static variables which can not easily be saved/restored. Your example with srandom() is exactly such a case: right now I do not know any efficient way to suspend/resume a pseudo-random sequence. But frankly speaking, I do not think that such behaviour of random() is unacceptable or makes the built-in session pool unusable.

> In other words, transparent connection pooling is going to require
> some new mechanism, which third-party code will have to know about,
> for tracking every last bit of session state that might need to be
> preserved or cleared.

I think that the situation with parallel executors is slightly different: in that case several backends perform execution of the same query, so they really do need to somehow share/synchronize the state of static variables. But in the case of connection pooling, only one transaction is executed by a backend at each moment of time, and there should be no problems with static variables unless they cross transaction boundaries. And I do not think that there are many such variables.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> On Apr 24, 2018, at 06:52, Merlin Moncure <mmoncure@gmail.com> wrote:
> Why does it have to be completely transparent?

Well, we have non-transparent connection pooling now, in the form of pgbouncer and the huge fleet of existing application-stack poolers. The main reason to move it into core is to avoid the limitations that a non-core pooler has.

--
-- Christophe Pettus
   xof@thebuild.com
On 25.04.2018 08:34, Christophe Pettus wrote:
>> On Apr 24, 2018, at 06:52, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Why does it have to be completely transparent?
> Well, we have non-transparent connection pooling now, in the form of
> pgbouncer and the huge fleet of existing application-stack poolers.
> The main reason to move it into core is to avoid the limitations that
> a non-core pooler has.

What do we mean by "completely transparent"? If complete transparency means that pooled sessions behave exactly the same as normal sessions in dedicated backends, then it will be really difficult to achieve, taking into account all the error handling nuances, the issue with srandom, and maybe some other contexts with session lifetime...

But I started development of the built-in connection pooler because of our customers' requests. For example, 1C clients never drop connections, and the 1C application widely uses temporary tables. So they can not use pgbouncer, and the number of clients can be very large (thousands). Built-in connection pooling will satisfy their needs, and the fact that random() in a pooled connection will return different values is absolutely unimportant to them.

So my point of view is the following:
1. Support of temporary tables in pooled sessions is important, because they are widely used in many applications.
2. Support of prepared statements in pooled sessions is also useful, because it allows performance to be increased up to two times.
3. Support of GUCs is also required, because there are many things (locale, date format, timezone) which are set by client applications using GUCs.
Other things seem to be less important. If there are some static variables (not associated with GUCs) with session (backend) lifetime, then they can be moved to the session context. I just do not know of such variables yet.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Wed, Apr 25, 2018 at 12:34 AM, Christophe Pettus <xof@thebuild.com> wrote:
> The main reason to move it into core is to avoid the limitations that
> a non-core pooler has.

The limitation headaches that I suffer with the pgbouncer project (which I love and use often) are mainly administrative and performance related, not a lack of session based server features. Applications that operate over a very large number of virtual connections or engage in a very high level of small-transaction traffic are going to avoid session based features for a lot of other reasons anyways, at least in my experience. Probably the most useful feature I miss is async notifications, so much so that at one point we hacked pgbouncer to support them. Point being, full transparency is nice, but there are workarounds for most of the major issues, and there are a lot of side channel benefits to making your applications 'stateless' (defined as state in the application or database but not in between).

Absent any other consideration, OP has proven to me that there are massive potential performance gains possible from moving the pooling mechanism into the database core process, and I'm already very excited about not having an extra server process to monitor and worry about. Tracking session state out of process seems pretty complicated and would probably add complexity or overhead to multiple internal systems. If we get that for free I'd be all for it, but reading Robert's email I'm skeptical there are easy wins here. So +1 for further R&D and -1 for holding things up based on full transparency... no harm in shooting for that, but let's look at things from a cost/benefit perspective (IMO).

merlin
On 25.04.2018 17:00, Merlin Moncure wrote:
> Absent any other consideration, OP has proven to me that there are
> massive potential performance gains possible from moving the pooling
> mechanism into the database core process, and I'm already very excited
> about not having an extra server process to monitor and worry about.

I did more research and found several other things which will not work with the current built-in connection pooling implementation. One you have mentioned: the notification mechanism. Another is advisory locks. Right now I have no idea how to support them for pooled sessions, but I will think about it. IMHO, though, neither notifications nor advisory locks are as widely used as temporary tables and prepared statements...

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
> On Apr 25, 2018, at 07:00, Merlin Moncure <mmoncure@gmail.com> wrote:
> The limitation headaches that I suffer with the pgbouncer project
> (which I love and use often) are mainly administrative and performance
> related, not a lack of session based server features.

For me, the most common issue I run into with pgbouncer (after the general administrative overhead of having another moving part) is that it works at cross purposes with database-based sharding, as well as with a useful role and permissions scheme. Since each server connection is specific to a database/role pair, you are left with some unappealing options to handle that in a pooling environment.

The next most common problem is prepared statements breaking, which certainly qualifies as a session-level feature.

--
-- Christophe Pettus
   xof@thebuild.com
On Wed, Apr 25, 2018 at 9:43 AM, Christophe Pettus <xof@thebuild.com> wrote:
> For me, the most common issue I run into with pgbouncer (after the
> general administrative overhead of having another moving part) is that
> it works at cross purposes with database-based sharding, as well as
> with a useful role and permissions scheme. Since each server
> connection is specific to a database/role pair, you are left with some
> unappealing options to handle that in a pooling environment.

Would integrated pooling help the sharding case (genuinely curious)? I don't quite have my head around the issue. I've always wanted pgbouncer to be able to do things like round-robin queries to a non-sharded replica for simple load balancing, but it doesn't (yet) have that capability. That type of functionality would not fit into an in-core pooler AIUI. Totally agree that the administrative benefit (user/role/.conf/etc.) is a huge win.

> The next most common problem is prepared statements breaking, which
> certainly qualifies as a session-level feature.

Yep. The main workaround today is to disable them. Having said that, it's not that difficult to imagine hooking prepared statement creation to a backend starting up (feature: run X,Y,Z SQL before running user queries). This might be less effort than, uh, moving backend session state to a shareable object. I'll go further: managing cache memory consumption (say, for pl/pgsql cached plans) is a big deal for certain workloads. The only really effective way to deal with that is to manage the server connection count and/or recycle server connections on intervals. Using pgbouncer to control the backend count is a very effective way to deal with this problem, and allowing virtualized connections to each manage their own independent cache would be a step in the opposite direction. I very much like having control so that I have exactly 8 backends for my 8 core server with 8 copies of cache.

Advisory locks are a completely separate problem. I suspect they might be used more than you realize, and they operate against a very fundamental subsystem of the database: the locking engine. I'm struggling as to why we would take another approach than 'don't use the non-xact variants of them in a pooling environment'.

merlin
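The 'non-xact variants' point can be made concrete: transaction-scoped advisory locks release automatically at commit or rollback, so they survive transaction-level pooling where session-scoped ones do not. A minimal libpq sketch (lock key 42 is arbitrary; error handling elided):

/*
 * Transaction-scoped advisory locks are pooling-friendly: the lock is
 * released automatically at COMMIT/ROLLBACK, before the pooler can hand
 * the backend to another client.  Session-scoped pg_advisory_lock()
 * would leak across pooled clients instead.
 */
#include <libpq-fe.h>

static void
do_exclusive_work(PGconn *conn)
{
    PQclear(PQexec(conn, "BEGIN"));
    /* Held only until the end of this transaction. */
    PQclear(PQexec(conn, "SELECT pg_advisory_xact_lock(42)"));

    /* ...the protected work goes here... */

    PQclear(PQexec(conn, "COMMIT"));    /* lock released here */
}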
On Tue, Apr 24, 2018 at 1:00 PM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> My expectation is that there are very few of them with session-level
> lifetime.
> Unfortunately it is not so easy to locate all such places. Once such
> variables are located, they can be saved in the session context and
> restored on reschedule.

The difficulty of finding them all is really the problem. If we had a reliable way to list everything that needs to be moved into session state, then we could try to come up with a design to do that. Otherwise, we're just swatting issues one by one and I bet we're missing quite a few.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Apr 25, 2018 at 10:00 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> So +1 for further R&D and -1 for holding things up based on full
> transparency... no harm in shooting for that, but let's look at things
> from a cost/benefit perspective (IMO).

If we could look at a patch and say "here are the cases that this patch doesn't handle", then we could perhaps decide "we're OK with that, let's ship the feature and document the limitations". But right now it seems to me that we're looking at a feature where no really systematic effort has been made to list all of the potential failure modes, and I'm definitely not on board with the idea of shipping something with a list of cases that are known to work and an unknown list of failure modes. Konstantin has fixed things here and there, but we don't know how much more there is, and we don't have a well-designed plan to find all such things.

Also, I think it's worth considering that the kinds of failures users will get out of anything that's not handled are really the worst kind. If you have an application that relies on session state other than what this patch knows how to preserve, your application will appear to work in light testing because your connection won't actually be swapped out underneath you -- and then fail unpredictably in production when such swapping occurs. There will be no clear way to tell which error messages or behavior differences are due to limitations of the proposed feature, which ones are due to defects in the application, and which ones might be due to PostgreSQL bugs. They'll all look the same, and even experienced PG hackers won't easily be able to tell whether a message saying "cursor XYZ doesn't exist" (or whatever the case is specifically) is because the application didn't create that cursor and nevertheless tried to use it, or whether it's because the connection pooling facility silently threw it out. All of that sounds to me like it's well below the standard I'd expect for a core feature.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Apr 25, 2018 at 2:58 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> Also, I think it's worth considering that the kinds of failures users
> will get out of anything that's not handled are really the worst kind.
> If you have an application that relies on session state other than
> what this patch knows how to preserve, your application will appear to
> work in light testing because your connection won't actually be
> swapped out underneath you -- and then fail unpredictably in
> production when such swapping occurs.

Connection pooling is not a new phenomenon, and many stacks (in particular Java) tend to pool connections by default. All of the problems we discuss here for the most part affect competing solutions, and I humbly submit that the tradeoffs are _very_ widely understood. FWICT we get occasional reports that are simply and clearly answered. I guess there are some people dumb enough to flip GUC settings involving seemingly important things in production without testing or reading any documentation or the innumerable articles and blogs that will pop up... hopefully they are self-selecting out of the industry :-).

Looking at pgbouncer, they produce a chart that says 'these features don't work, and please consider that before activating this feature' (https://wiki.postgresql.org/wiki/PgBouncer#Feature_matrix_for_pooling_modes), and that ought to be entirely sufficient to avoid that class of problems. This is very clear and simple. The main gripes with pgbouncer FWICT were related to the Postgres JDBC driver's unavoidable tendency (later fixed) to prepare 'BEGIN', causing various problems; that was really a bug in the JDBC driver, which did in fact spill into this list.

For this feature to be really attractive we'd want to simultaneously allow pooled and non-pooled connections on different ports, or even multiple pools (say, for different applications). Looking at things from your perspective, we might want to consider blocking (with an error) features that are not 'pooling compatible' if they arrive through a pooled connection.

merlin
On Wed, Apr 25, 2018 at 03:42:31PM -0400, Robert Haas wrote:
> The difficulty of finding them all is really the problem. If we had a
> reliable way to list everything that needs to be moved into session
> state, then we could try to come up with a design to do that.
> Otherwise, we're just swatting issues one by one and I bet we're
> missing quite a few.

Hm? We already know about the reset value of a parameter in pg_settings, which points to the value that would be used if the parameter were reset in a session, even after being reloaded. If you compare it with the actual setting value, wouldn't that be enough to know which parameters have been changed at session level by an application once connected? So you can pull out a list using such comparisons. The context a parameter is associated with can also help.

--
Michael
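A minimal libpq sketch of the comparison Michael suggests (pg_settings and its setting/reset_val columns are real; the function name is just for illustration):

/*
 * List parameters whose current value differs from their reset value,
 * i.e. those changed at session level.
 */
#include <stdio.h>
#include <libpq-fe.h>

static void
list_session_gucs(PGconn *conn)
{
    PGresult *res = PQexec(conn,
        "SELECT name, setting, reset_val FROM pg_settings "
        "WHERE setting IS DISTINCT FROM reset_val");

    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        for (int i = 0; i < PQntuples(res); i++)
            printf("%s = %s (reset value: %s)\n",
                   PQgetvalue(res, i, 0),
                   PQgetvalue(res, i, 1),
                   PQgetvalue(res, i, 2));
    PQclear(res);
}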
On 25.04.2018 20:02, Merlin Moncure wrote:
> Would integrated pooling help the sharding case (genuinely curious)?
> I don't quite have my head around the issue. I've always wanted
> pgbouncer to be able to do things like round-robin queries to a
> non-sharded replica for simple load balancing, but it doesn't (yet)
> have that capability.

Yes, pgbouncer is not intended to balance workload; you should use ha-proxy or pg-pool for that. libpq now allows specifying multiple URLs, but unfortunately libpq is currently not able to perform load balancing. And I do not understand how this is related to integrating connection pooling: such a pooler definitely should be external if you want to scatter queries between different nodes.

>> The next most common problem is prepared statements breaking, which
>> certainly qualifies as a session-level feature.
> Yep. The main workaround today is to disable them. Having said that,
> it's not that difficult to imagine hooking prepared statement creation
> to a backend starting up (feature: run X,Y,Z SQL before running user
> queries).

Sorry, I do not completely understand your idea. Yes, it is somehow possible to simulate session semantics by prepending all session specific commands (mostly setting GUCs) to each SQL statement. But that doesn't work for prepared statements: the whole idea of prepared statements is that compilation of a statement should be done only once.

> This might be less effort than, uh, moving backend session state to a
> shareable object. I'll go further: managing cache memory consumption
> (say, for pl/pgsql cached plans) is a big deal for certain workloads.

Database performance is mostly limited by disk, so the optimal number of backends may be different from the number of cores. But certainly the possibility to launch an "optimal" number of backends is one of the advantages of built-in session pooling.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On 26.04.2018 05:09, Michael Paquier wrote:
> On Wed, Apr 25, 2018 at 03:42:31PM -0400, Robert Haas wrote:
>> The difficulty of finding them all is really the problem. If we had a
>> reliable way to list everything that needs to be moved into session
>> state, then we could try to come up with a design to do that.
>> Otherwise, we're just swatting issues one by one and I bet we're
>> missing quite a few.
> Hm? We already know about the reset value of a parameter in
> pg_settings, which points to the value which would be used if reset
> in a session, even after being reloaded. If you compare it with the
> actual setting value, wouldn't that be enough to know which parameters
> have been changed at session level by an application once connecting?
> So you can pull out a list using such comparisons. The context a
> parameter is associated with can also help.
> --
> Michael

Sorry, maybe I do not understand you correctly, but GUCs are already handled by the built-in connection pooler. It is done at the guc.c level, so it does not matter how a GUC variable is changed: all modified GUCs are saved into the session context and restored on reschedule.

But there are some other static variables which are not related to GUCs. Most of them are really associated with the backend, not with the session, so they should not be touched on reschedule. There may be some variables, however, which are intended to be session specific, and locating these variables is a really non-trivial task.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
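(A toy, self-contained illustration of this save/restore-on-reschedule scheme; all names below are invented for the sketch and the real code in the patch is certainly different:)

    #include <stdio.h>
    #include <string.h>

    /* Toy stand-in for the server-wide GUC table (the real logic lives in
     * guc.c). */
    static struct { const char *name; char value[64]; } gucs[] = {
        { "search_path", "public" },
        { "work_mem",    "4MB"    },
    };
    #define NGUCS (sizeof(gucs) / sizeof(gucs[0]))

    static const char *
    guc_get(const char *name)
    {
        for (size_t i = 0; i < NGUCS; i++)
            if (strcmp(gucs[i].name, name) == 0)
                return gucs[i].value;
        return "";
    }

    static void
    guc_set(const char *name, const char *value)
    {
        for (size_t i = 0; i < NGUCS; i++)
            if (strcmp(gucs[i].name, name) == 0)
                snprintf(gucs[i].value, sizeof(gucs[i].value), "%s", value);
    }

    /* A session context remembers only the parameters this session modified. */
    typedef struct
    {
        const char *names[8];
        char        values[8][64];
        int         count;
    } SessionContext;

    /* On reschedule: save the outgoing session's values, then install the
     * incoming session's ones. */
    static void
    switch_session(SessionContext *out, SessionContext *in)
    {
        for (int i = 0; i < out->count; i++)
            snprintf(out->values[i], 64, "%s", guc_get(out->names[i]));
        for (int i = 0; i < in->count; i++)
            guc_set(in->names[i], in->values[i]);
    }

    int main(void)
    {
        SessionContext a = { { "search_path" }, { "app_a" }, 1 };
        SessionContext b = { { "search_path" }, { "app_b" }, 1 };

        guc_set("search_path", "app_a");          /* session a did SET search_path */
        switch_session(&a, &b);
        printf("%s\n", guc_get("search_path"));   /* prints: app_b */
        switch_session(&b, &a);
        printf("%s\n", guc_get("search_path"));   /* prints: app_a */
        return 0;
    }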
On Thu, Apr 26, 2018 at 6:04 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> On 25.04.2018 20:02, Merlin Moncure wrote:
>> Yep. The main workaround today is to disable them. Having said that,
>> it's not that difficult to imagine hooking prepared statement creation
>> to a backend starting up (feature: run X,Y,Z SQL before running user
>> queries).
>
> Sorry, I do not completely understand your idea.
> Yes, it is possible to approximate session semantics by prepending all
> session-specific commands (mostly SET statements for GUCs) to each SQL
> statement. But that does not work for prepared statements: the whole
> point of prepared statements is that a statement is compiled only once.

The idea is that you have arbitrary SQL that runs after the backend (postgres binary) is forked from postmaster. This would be an ideal place to introduce prepared statements in a way that is pooling compatible; you still couldn't PREPARE from the application, but you'd be free to call already prepared statements (via SQL-level EXECUTE or libpq PQexecPrepared()). Of course, if somebody throws a DEALLOCATE or DISCARD ALL, or issues a problematic DROP x CASCADE, you'd be in trouble, but that's not a big deal IMO because you can control for those things in the application.

> Database performance is often limited by disk, so the optimal number of
> backends may differ from the number of cores.
> But certainly the possibility to launch an "optimal" number of backends
> is one of the advantages of built-in session pooling.

Sure, but some workloads are cpu limited (all- or mostly- read with data < memory, or very complex queries on smaller datasets). So we would measure and configure based on expectations exactly as is done today with pgbouncer. This is a major feature of pgbouncer: being able to _reduce_ the number of session states relative to the number of connections is an important feature; it isolates your database from various unpleasant failure modes such as runaway memory consumption.

Anyways, I'm looking at your patch. I see you've separated the client connection count ('sessions') from the server backend instances ('backends') in the GUC. Questions:

*) Should non pooled connections be supported simultaneously with pooled connections?
*) Should there be multiple pools with independent configurations (yes, please)?
*) How are you pinning client connections to an application managed transaction? (IMNSHO, this feature is useless without being able to do that)

FYI, it's pretty clear you've got a long road building consensus and hammering out a reasonable patch through the community here. Don't get discouraged -- there is value here, but it's going to take some work.

merlin
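(A hypothetical libpq client illustrating Merlin's idea: it never calls PQprepare() itself and only executes a statement assumed to have been prepared by such a backend-startup hook. The statement name "get_user" and the hook are assumptions, not an existing feature:)

    #include <stdio.h>
    #include <libpq-fe.h>

    int main(void)
    {
        PGconn *conn = PQconnectdb("");
        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        /* No PQprepare() here: we assume the server already ran something
         * like PREPARE get_user(int) AS SELECT ... when the backend started. */
        const char *params[1] = { "42" };
        PGresult *res = PQexecPrepared(conn, "get_user", 1, params,
                                       NULL, NULL, 0);

        if (PQresultStatus(res) == PGRES_TUPLES_OK && PQntuples(res) > 0)
            printf("%s\n", PQgetvalue(res, 0, 0));
        else
            fprintf(stderr, "execute failed: %s", PQerrorMessage(conn));

        PQclear(res);
        PQfinish(conn);
        return 0;
    }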
On Wed, Apr 25, 2018 at 10:09 PM, Michael Paquier <michael@paquier.xyz> wrote:
> On Wed, Apr 25, 2018 at 03:42:31PM -0400, Robert Haas wrote:
>> The difficulty of finding them all is really the problem. If we had a
>> reliable way to list everything that needs to be moved into session
>> state, then we could try to come up with a design to do that.
>> Otherwise, we're just swatting issues one by one and I bet we're
>> missing quite a few.
>
> Hm? We already know about the reset value of a parameter in
> pg_settings, which points to the value which would be used if reset
> in a session, even after being reloaded. If you compare it with the
> actual setting value, wouldn't that be enough to know which parameters
> have been changed at session level by an application once connecting?
> So you can pull out a list using such comparisons. The context a
> parameter is associated with can also help.

Uh, there's a lot of session backend state other than GUCs. If the only thing that we needed to worry about were GUCs, this problem would have been solved years ago.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 27.04.2018 16:49, Merlin Moncure wrote:
> On Thu, Apr 26, 2018 at 6:04 AM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> On 25.04.2018 20:02, Merlin Moncure wrote:
>>> Yep. The main workaround today is to disable them. Having said that,
>>> it's not that difficult to imagine hooking prepared statement creation
>>> to a backend starting up (feature: run X,Y,Z SQL before running user
>>> queries).
>> Sorry, I do not completely understand your idea.
>> Yes, it is possible to approximate session semantics by prepending all
>> session-specific commands (mostly SET statements for GUCs) to each SQL
>> statement. But that does not work for prepared statements: the whole
>> point of prepared statements is that a statement is compiled only once.
> The idea is that you have arbitrary SQL that runs after the
> backend (postgres binary) is forked from postmaster. This would be an
> ideal place to introduce prepared statements in a way that is pooling
> compatible; you still couldn't PREPARE from the application but you'd
> be free to call already prepared statements (via SQL-level EXECUTE or
> libpq PQexecPrepared()). Of course, if somebody throws a DEALLOCATE
> or DISCARD ALL, or issues a problematic DROP x CASCADE, you'd be in
> trouble, but that's not a big deal IMO because you can control for
> those things in the application.

As far as I know, prepared statements can already be handled this way by pgbouncer in transaction/statement pooling mode. But from my point of view, in most cases this approach is practically unusable: it is very hard to predict from the very beginning all the statements applications will want to execute, and to prepare them all at backend start.

>> Database performance is often limited by disk, so the optimal number of
>> backends may differ from the number of cores.
>> But certainly the possibility to launch an "optimal" number of backends
>> is one of the advantages of built-in session pooling.
> Sure, but some workloads are cpu limited (all- or mostly- read with
> data < memory, or very complex queries on smaller datasets). So we
> would measure and configure based on expectations exactly as is done
> today with pgbouncer. This is a major feature of pgbouncer: being
> able to _reduce_ the number of session states relative to the number
> of connections is an important feature; it isolates your database from
> various unpleasant failure modes such as runaway memory consumption.
>
> Anyways, I'm looking at your patch. I see you've separated the client
> connection count ('sessions') from the server backend instances
> ('backends') in the GUC. Questions:
> *) Should non pooled connections be supported simultaneously with
> pooled connections?
> *) Should there be multiple pools with independent configurations (yes, please)?

Right now my prototype supports two modes:
1. All connections are pooled.
2. There are several session pools, each bound to its own port. Connections to the main Postgres port are normal (dedicated) ones; connections to one of the session pool ports are redirected to one of the workers of that pool.

Please notice that the latest version of the connection pooler lives in the https://github.com/postgrespro/postgresql.builtin_pool.git repository.

> *) How are you pinning client connections to an application managed
> transaction? (IMNSHO, this feature is useless without being able to do
> that)

Sorry, I do not completely understand the question.
Rescheduling is now done at transaction level - it means that a backend can not be switched to another session until the current transaction completes. The main argument for transaction-level pooling is that it allows us not to worry about heavy-weight locks, which are associated with procarray entries.

> FYI, it's pretty clear you've got a long road building consensus and
> hammering out a reasonable patch through the community here. Don't
> get discouraged -- there is value here, but it's going to take some
> work.

Thank you. I am absolutely sure that a lot of additional work has to be done before this prototype may become usable.

> merlin

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
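(A compilable, heavily simplified sketch of what this transaction-level scheduling could look like inside the backend's main loop; the names and structure are invented for illustration and do not come from the patch:)

    #include <poll.h>
    #include <stdbool.h>
    #include <stddef.h>

    #define MAX_SESSIONS 64

    typedef struct Session
    {
        int  client_sock;     /* client socket handed over by the postmaster */
        bool in_transaction;  /* true while a transaction is open */
    } Session;

    /* Wait on all attached client sockets and return a session that has
     * input ready.  The backend calls this only between transactions, so a
     * session can never lose its backend mid-transaction.
     * Assumes nsessions <= MAX_SESSIONS. */
    static Session *
    pick_next_session(Session *sessions, int nsessions)
    {
        struct pollfd fds[MAX_SESSIONS];

        for (int i = 0; i < nsessions; i++)
        {
            fds[i].fd = sessions[i].client_sock;
            fds[i].events = POLLIN;
        }

        if (poll(fds, nsessions, -1) <= 0)
            return NULL;

        for (int i = 0; i < nsessions; i++)
            if (fds[i].revents & POLLIN)
                return &sessions[i];

        return NULL;
    }

    /* The PostgresMain-style caller would then do, roughly:
     *
     *     for (;;)
     *     {
     *         if (!current->in_transaction)       // switch only at
     *             current = pick_next_session(    // transaction border
     *                 sessions, nsessions);
     *         process_one_message(current);       // read query, execute, reply
     *     }
     */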
On Fri, Apr 27, 2018 at 10:05 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
> On 27.04.2018 16:49, Merlin Moncure wrote:
>> *) How are you pinning client connections to an application managed
>> transaction? (IMNSHO, this feature is useless without being able to do
>> that)
>
> Sorry, I do not completely understand the question.
> Rescheduling is now done at transaction level - it means that a backend
> can not be switched to another session until the current transaction
> completes. The main argument for transaction-level pooling is that it
> allows us not to worry about heavy-weight locks, which are associated
> with procarray entries.

I'm confused here... could be language issues or terminology (I'll look at your latest code). Here is how I understand things:

Backend = instance of postgres binary
Session = application state within postgres binary (temp tables, prepared statements, etc.)
Connection = client side connection

AIUI (I could certainly be wrong), within connection pooling, the ratio of backend to session is still 1:1. The idea is that client connections, when they issue SQL to the server, reserve a Backend/Session, use it for the duration of a transaction, and release it when the transaction resolves. So many client connections share backends. As with pgbouncer, the concept of session in a traditional sense is not really defined; session state management would be handled within the application itself, or within data within tables, but not within backend private memory. Does that align with your thinking?

merlin
On 27.04.2018 18:33, Merlin Moncure wrote:
> On Fri, Apr 27, 2018 at 10:05 AM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>> On 27.04.2018 16:49, Merlin Moncure wrote:
>>> *) How are you pinning client connections to an application managed
>>> transaction? (IMNSHO, this feature is useless without being able to do
>>> that)
>> Sorry, I do not completely understand the question.
>> Rescheduling is now done at transaction level - it means that a backend
>> can not be switched to another session until the current transaction
>> completes. The main argument for transaction-level pooling is that it
>> allows us not to worry about heavy-weight locks, which are associated
>> with procarray entries.
> I'm confused here... could be language issues or terminology (I'll look
> at your latest code). Here is how I understand things:
> Backend = instance of postgres binary
> Session = application state within postgres binary (temp tables,
> prepared statements, etc.)
> Connection = client side connection

A backend is a process, forked by the postmaster.

> AIUI (I could certainly be wrong), within connection pooling, the ratio
> of backend to session is still 1:1. The idea is that client connections,
> when they issue SQL to the server, reserve a Backend/Session, use it
> for the duration of a transaction, and release it when the transaction
> resolves. So many client connections share backends. As with
> pgbouncer, the concept of session in a traditional sense is not really
> defined; session state management would be handled within the
> application itself, or within data within tables, but not within
> backend private memory. Does that align with your thinking?

No. The number of sessions is equal to the number of client connections, so a client is not reserving a "Backend/Session" as happens in pgbouncer. One backend keeps multiple sessions, and for each session it maintains a session context which includes the client's connection. It is the backend's decision which client's transaction it is going to execute next. This is why the built-in pooler is able to provide session semantics without the backend:session = 1:1 requirement.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Apr 27, 2018 at 11:44 AM, Konstantin Knizhnik <k.knizhnik@postgrespro.ru> wrote:
>
> On 27.04.2018 18:33, Merlin Moncure wrote:
>> On Fri, Apr 27, 2018 at 10:05 AM, Konstantin Knizhnik
>> <k.knizhnik@postgrespro.ru> wrote:
>>> On 27.04.2018 16:49, Merlin Moncure wrote:
>> I'm confused here... could be language issues or terminology (I'll look
>> at your latest code). Here is how I understand things:
>> Backend = instance of postgres binary
>> Session = application state within postgres binary (temp tables,
>> prepared statements, etc.)
>> Connection = client side connection
>
> A backend is a process, forked by the postmaster.

Right, we are saying the same thing here.

>> AIUI (I could certainly be wrong), within connection pooling, the ratio
>> of backend to session is still 1:1. The idea is that client connections,
>> when they issue SQL to the server, reserve a Backend/Session, use it
>> for the duration of a transaction, and release it when the transaction
>> resolves. So many client connections share backends. As with
>> pgbouncer, the concept of session in a traditional sense is not really
>> defined; session state management would be handled within the
>> application itself, or within data within tables, but not within
>> backend private memory. Does that align with your thinking?
>
> No. The number of sessions is equal to the number of client connections,
> so a client is not reserving a "Backend/Session" as happens in pgbouncer.
> One backend keeps multiple sessions, and for each session it maintains
> a session context which includes the client's connection.
> It is the backend's decision which client's transaction it is going to
> execute next. This is why the built-in pooler is able to provide session
> semantics without the backend:session = 1:1 requirement.

I see. I'm not so sure that is a good idea in the general sense :(. Connections sharing sessions is a normal and well understood pattern, and we have tooling to manage it already (DISCARD). Having the session state abstracted out and pinned to the client connection seems complex and wasteful, at least sometimes. What _I_ (maybe not others) want is a faster pgbouncer that is integrated into the database; IMO it does everything exactly right.

merlin
On 27.04.2018 23:43, Merlin Moncure wrote:
> On Fri, Apr 27, 2018 at 11:44 AM, Konstantin Knizhnik
> <k.knizhnik@postgrespro.ru> wrote:
>>
>> On 27.04.2018 18:33, Merlin Moncure wrote:
>>> On Fri, Apr 27, 2018 at 10:05 AM, Konstantin Knizhnik
>>> <k.knizhnik@postgrespro.ru> wrote:
>>>> On 27.04.2018 16:49, Merlin Moncure wrote:
>>> I'm confused here... could be language issues or terminology (I'll look
>>> at your latest code). Here is how I understand things:
>>> Backend = instance of postgres binary
>>> Session = application state within postgres binary (temp tables,
>>> prepared statements, etc.)
>>> Connection = client side connection
>> A backend is a process, forked by the postmaster.
> Right, we are saying the same thing here.
>
>>> AIUI (I could certainly be wrong), within connection pooling, the ratio
>>> of backend to session is still 1:1. The idea is that client connections,
>>> when they issue SQL to the server, reserve a Backend/Session, use it
>>> for the duration of a transaction, and release it when the transaction
>>> resolves. So many client connections share backends. As with
>>> pgbouncer, the concept of session in a traditional sense is not really
>>> defined; session state management would be handled within the
>>> application itself, or within data within tables, but not within
>>> backend private memory. Does that align with your thinking?
>> No. The number of sessions is equal to the number of client connections,
>> so a client is not reserving a "Backend/Session" as happens in pgbouncer.
>> One backend keeps multiple sessions, and for each session it maintains
>> a session context which includes the client's connection.
>> It is the backend's decision which client's transaction it is going to
>> execute next. This is why the built-in pooler is able to provide session
>> semantics without the backend:session = 1:1 requirement.
> I see. I'm not so sure that is a good idea in the general sense :(.
> Connections sharing sessions is a normal and well understood pattern,
> and we have tooling to manage it already (DISCARD). Having the session
> state abstracted out and pinned to the client connection seems complex
> and wasteful, at least sometimes. What _I_ (maybe not others) want is a
> faster pgbouncer that is integrated into the database; IMO it does
> everything exactly right.

Yandex's Odyssey is a faster version of pgbouncer (supporting multithreading and many other things). Why do you need to integrate it into Postgres if you do not want to preserve session semantics? Just to minimize the effort needed to maintain extra components? In principle, a pooler can be distributed as a Postgres extension and started as a background worker. Would that help to eliminate the administration overhead of a separate pooler?

In any case, my built-in pooler is oriented toward applications which need session semantics (temporary tables, GUCs, prepared statements, ...). As I have mentioned many times, this can be provided only inside the database, not in some external pooler, no matter which architecture that pooler has.

> merlin

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, Apr 27, 2018 at 4:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> What _I_ (maybe not others) want is a
> faster pgbouncer that is integrated into the database; IMO it does
> everything exactly right.

I have to admit that I find that an amazing statement. Not that pgbouncer is bad technology, but saying that it does everything exactly right seems like a vast overstatement. That's like saying that you don't want running water in your house, just a faster motor for the bucket you use to draw water from the well.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 03.05.2018 20:01, Robert Haas wrote:
> On Fri, Apr 27, 2018 at 4:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> What _I_ (maybe not others) want is a
>> faster pgbouncer that is integrated into the database; IMO it does
>> everything exactly right.
> I have to admit that I find that an amazing statement. Not that
> pgbouncer is bad technology, but saying that it does everything
> exactly right seems like a vast overstatement. That's like saying
> that you don't want running water in your house, just a faster motor
> for the bucket you use to draw water from the well.

Maybe if you are engaged in agriculture at your country house, then a well with a good motor pump is better for watering plants than the water faucet in your kitchen. But most homeowners prefer to open a tap to wash their hands rather than perform complex manipulations with a motor pump.

I am absolutely sure that external connection poolers will always have their niche: they can be used as a natural proxy between multiple clients and the DBMS, and usually HA/load balancing can also be done at this level. But there are many cases when users just do not want to worry about connection pooling: they have some number of clients (which can be several times larger than the optimal number of Postgres backends) and they want those clients to access the database without introducing intermediate layers. In this case a built-in connection pooler is the ideal solution. This is from the user's point of view.

From the Postgres developer's point of view, a built-in pooler has some technical advantages over an external one. Some of these advantages could be matched by a significant redesign of the Postgres architecture, for example by introducing a shared cache of prepared statements... But in any case, the notion of a session context and the possibility to maintain a larger number of open sessions will always be relevant.

Some update on the status of the built-in connection pooler prototype: I managed to run the regression and isolation tests for pooled connections. Right now 6 of 185 regression tests and 2 of 67 isolation tests fail. For the regression tests the result may vary depending on the parallel schedule, because of manipulations with roles/permissions which are not currently supported. The best result is for the sequential schedule: 5 failed tests, all caused by differences in pg_prepared_statements output due to the "mangled" prepared statement names. The isolation test failures are caused by unsupported advisory locks.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Thu, May 3, 2018 at 12:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, Apr 27, 2018 at 4:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> What _I_ (maybe not others) want is a
>> faster pgbouncer that is integrated into the database; IMO it does
>> everything exactly right.
>
> I have to admit that I find that an amazing statement. Not that
> pgbouncer is bad technology, but saying that it does everything
> exactly right seems like a vast overstatement. That's like saying
> that you don't want running water in your house, just a faster motor
> for the bucket you use to draw water from the well.

Well you certainly have a point there; I do have a strong tendency for overstatement :-).

Let's put it like this: being able to have connections funnel down to a smaller number of sessions is a nice feature. Applications that are large, complex, or super high volume have a tendency towards stateless (with respect to the database session) architecture anyways, so I tend not to mind the lack of session features when pooling (prepared statements perhaps being the big outlier here). It really opens up a lot of scaling avenues. So a better phrased statement might be, "I like the way pgbouncer works, in particular transaction mode pooling, from the perspective of the applications using it". The current main pain points are the previously mentioned administrative headaches, and better performance from a different architecture (pthreads vs libev) would be nice.

I'm a little skeptical that we're on the right path if we are pushing a lot of memory consumption into the session level where a session is pinned all the way back to a client connection. plpgsql function plan caches can be particularly hungry on memory, and since sessions have their own GUCs, ISTM each session has to have its own set of plans, because plans depend on the session-specific search_path GUC. Previous discussions on managing cache memory consumption centrally (I do dimly recall you making a proposal on that very thing) haven't gone past the planning stages AFAIK.

If we are breaking the 1:1 backend:session relationship, what controls would we have to manage resource consumption?

merlin
On Fri, May 4, 2018 at 11:22 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> If we are breaking the 1:1 backend:session relationship, what controls
> would we have to manage resource consumption?

I mean, if you have a large number of sessions open, it's going to take more memory in any design. If there are multiple sessions per backend, there may be some possibility to save memory by allocating it per-backend rather than per-session; it shouldn't be any worse than if you didn't have pooling in the first place.

However, I think that's probably worrying about the wrong end of the problem first. IMHO, what we ought to start by doing is considering what a good architecture for this would be, and how to solve the general problem of per-backend session state. If we figure that out, then we could worry about optimizing whatever needs optimizing, e.g. memory usage.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Fri, May 4, 2018 at 2:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Fri, May 4, 2018 at 11:22 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> If we are breaking the 1:1 backend:session relationship, what controls
>> would we have to manage resource consumption?
>
> I mean, if you have a large number of sessions open, it's going to
> take more memory in any design. If there are multiple sessions per
> backend, there may be some possibility to save memory by allocating it
> per-backend rather than per-session; it shouldn't be any worse than if
> you didn't have pooling in the first place.

It is absolutely worse, or at least can be. plpgsql plan caches can be GUC dependent due to search_path; you might get a different plan depending on which tables resolve inside the function. You might rightfully regard this as an edge case, but there are other 'leakages'; for example, sessions with different planner settings obviously ought not to share backend plans. Point being, there are many interdependent things in the session that will make it difficult to share some portions but not others.

> However, I think that's probably worrying about the wrong end of the
> problem first. IMHO, what we ought to start by doing is considering
> what a good architecture for this would be, and how to solve the
> general problem of per-backend session state. If we figure that out,
> then we could worry about optimizing whatever needs optimizing, e.g.
> memory usage.

Exactly -- being able to manage down resource consumption by controlling the session count is a major feature that ought not to be overlooked. So I'm signalling that if given a choice between that (funneling a large pool of connections down to a smaller number of backends) and externalized shared sessions, I'd rather have the funnel; it solves a number of very important problems with respect to server robustness. So I'm challenging (in a friendly, curious way) whether breaking session:backend 1:1 is really a good idea. Maybe a connection pooler implementation can do both of those things, or maybe it's unfair to expect an implementation to do both.

merlin
On 04.05.2018 18:22, Merlin Moncure wrote:
> On Thu, May 3, 2018 at 12:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, Apr 27, 2018 at 4:43 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>> What _I_ (maybe not others) want is a
>>> faster pgbouncer that is integrated into the database; IMO it does
>>> everything exactly right.
>> I have to admit that I find that an amazing statement. Not that
>> pgbouncer is bad technology, but saying that it does everything
>> exactly right seems like a vast overstatement. That's like saying
>> that you don't want running water in your house, just a faster motor
>> for the bucket you use to draw water from the well.
> Well you certainly have a point there; I do have a strong tendency for
> overstatement :-).
>
> Let's put it like this: being able to have connections funnel down to
> a smaller number of sessions is a nice feature. Applications that are
> large, complex, or super high volume have a tendency towards stateless
> (with respect to the database session) architecture anyways, so I tend
> not to mind the lack of session features when pooling (prepared
> statements perhaps being the big outlier here). It really opens up a
> lot of scaling avenues. So a better phrased statement might be, "I
> like the way pgbouncer works, in particular transaction mode pooling,
> from the perspective of the applications using it". The current main
> pain points are the previously mentioned administrative headaches, and
> better performance from a different architecture (pthreads vs libev)
> would be nice.
>
> I'm a little skeptical that we're on the right path if we are pushing
> a lot of memory consumption into the session level where a session is
> pinned all the way back to a client connection. plpgsql function plan
> caches can be particularly hungry on memory, and since sessions have
> their own GUCs, ISTM each session has to have its own set of plans,
> because plans depend on the session-specific search_path GUC. Previous
> discussions on managing cache memory consumption centrally (I do dimly
> recall you making a proposal on that very thing) haven't gone past the
> planning stages AFAIK.
>
> If we are breaking the 1:1 backend:session relationship, what controls
> would we have to manage resource consumption?

Most of the resource consumption is related to backends, not to sessions: first of all the catalog and relation caches. If there are thousands of tables in a database, these caches (whose size is not limited now) can grow to several megabytes each. Taking into account that on modern SMP systems with hundreds of CPU cores it may be reasonable to spawn hundreds of backends, the total memory footprint of these caches can be very significant. This is why I think we should move towards shared caches... but that road is not expected to be easy.

Right now the connection pooler allows handling many more user sessions than there are active backends, so it partly addresses this resource consumption problem. The session context itself is not expected to be very large: changed GUCs plus prepared statements.

I accept your argument about stateless application architecture. Moreover, this is more or less the current state of things: most customers have to use pgbouncer and so have to prohibit all session-specific stuff in their applications. What are they losing in this case? Prepared statements? But there are alternative solutions: autoprepare, a shared plan cache, ... which allow using prepared statements without a session context.
Temporary tables, advisory locks, ... ? Temporary tables are actually a very "ugly" thing, causing a lot of problems:
- they can not be created on a hot standby,
- they cause catalog bloat,
- dropping a large number of temporary tables may acquire too many locks,
...
Maybe they should somehow be redesigned? For example, have a shared catalog entry for a temporary table, but backend-private content... Or make it possible to change the lifetime of temporary tables from session to transaction...

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
On Fri, May 4, 2018 at 5:54 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> I mean, if you have a large number of sessions open, it's going to
>> take more memory in any design. If there are multiple sessions per
>> backend, there may be some possibility to save memory by allocating it
>> per-backend rather than per-session; it shouldn't be any worse than if
>> you didn't have pooling in the first place.
>
> It is absolutely worse, or at least can be. plpgsql plan caches can
> be GUC dependent due to search_path; you might get a different plan
> depending on which tables resolve inside the function. You might
> rightfully regard this as an edge case, but there are other 'leakages';
> for example, sessions with different planner settings obviously ought
> not to share backend plans. Point being, there are many
> interdependent things in the session that will make it difficult to
> share some portions but not others.

I think you may be misunderstanding my remarks. Suppose I've got 10 real connections multiplexed across 1000 sessions. Barring a crazy-stupid implementation, that should never use more memory than 1000 completely separate connections. (How could it?) It will of course use a lot more memory than 10 real connections handling 10 sessions, but that's to be expected.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 05.05.2018 00:54, Merlin Moncure wrote:
> On Fri, May 4, 2018 at 2:25 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>> On Fri, May 4, 2018 at 11:22 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
>>> If we are breaking the 1:1 backend:session relationship, what controls
>>> would we have to manage resource consumption?
>> I mean, if you have a large number of sessions open, it's going to
>> take more memory in any design. If there are multiple sessions per
>> backend, there may be some possibility to save memory by allocating it
>> per-backend rather than per-session; it shouldn't be any worse than if
>> you didn't have pooling in the first place.
> It is absolutely worse, or at least can be. plpgsql plan caches can
> be GUC dependent due to search_path; you might get a different plan
> depending on which tables resolve inside the function. You might
> rightfully regard this as an edge case, but there are other 'leakages';
> for example, sessions with different planner settings obviously ought
> not to share backend plans.

Right now, in my built-in connection pool implementation, the prepared statement cache in a backend is shared by all of its sessions, but effectively each session has its own set of prepared statements: I append the session identifier to the prepared statement name to make it unique. So there is no problem with different execution plans for different clients caused by specific GUC settings (like enable_seqscan or max_parallel_workers_per_gather). But the primary reason for this behavior is to avoid prepared statement name conflicts between different clients.

From my point of view, there are very few cases where using client-specific plans makes sense. In most cases the requirement is quite the opposite: I want to be able to prepare an execution plan (using GUCs, adjusted statistics, hints - which Postgres lacks, ...) which will be used by all clients. The most natural and convenient way to achieve that is a shared plan cache. But a shared plan cache is a different story, not directly related to connection pooling.

> Point being, there are many interdependent things in the session that
> will make it difficult to share some portions but not others.

I do not see so many such things... Yes, GUCs can affect behavior within a session. But GUCs are now supported: each session can have its own set of GUCs. Prepared plans may depend on GUCs, but they are also private to each session now. What else?

And in any case, with an external connection pooler you are not able to use session semantics at all: GUCs, prepared statements, temporary tables, advisory locks, ... With the built-in connection pooler you can use sessions, albeit with some restrictions (lack of advisory locks, for example). That is better than nothing, isn't it?

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
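(The name-mangling scheme is simple enough to show in a few lines; this is a compilable sketch with invented names, not the actual patch code:)

    #include <stdio.h>

    /* Build the backend-internal name of a session's prepared statement by
     * appending the session identifier, so that "stmt1" prepared by session 7
     * can never collide with "stmt1" prepared by session 12. */
    static void
    mangle_statement_name(char *dst, size_t dstlen,
                          const char *name, int session_id)
    {
        snprintf(dst, dstlen, "%s_%d", name, session_id);
    }

    int main(void)
    {
        char buf[128];
        mangle_statement_name(buf, sizeof(buf), "stmt1", 7);
        printf("%s\n", buf);   /* prints: stmt1_7 */
        return 0;
    }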
On Fri, May 4, 2018 at 03:25:15PM -0400, Robert Haas wrote:
> On Fri, May 4, 2018 at 11:22 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> > If we are breaking the 1:1 backend:session relationship, what controls
> > would we have to manage resource consumption?
>
> I mean, if you have a large number of sessions open, it's going to
> take more memory in any design. If there are multiple sessions per
> backend, there may be some possibility to save memory by allocating it
> per-backend rather than per-session; it shouldn't be any worse than if
> you didn't have pooling in the first place.
>
> However, I think that's probably worrying about the wrong end of the
> problem first. IMHO, what we ought to start by doing is considering
> what a good architecture for this would be, and how to solve the
> general problem of per-backend session state. If we figure that out,
> then we could worry about optimizing whatever needs optimizing, e.g.
> memory usage.

Yes, I think this matches my previous question --- if we are going to swap out session state to allow multiple sessions to multiplex in the same OS process, and that swapping has similar overhead to how the OS swaps processes, why not just let the OS continue doing the process swapping?

I think we need to first find out what it is that makes high session counts slow. For example, if we swap out session state, will we still check the visibility rules for the swapped-out session? If not, and that is what makes swapping session state make Postgres faster, let's just find a way to skip checking visibility rules for inactive sessions and get the same benefit more simply.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +
On Thu, May 17, 2018 at 9:09 PM, Bruce Momjian <bruce@momjian.us> wrote:
>> However, I think that's probably worrying about the wrong end of the
>> problem first. IMHO, what we ought to start by doing is considering
>> what a good architecture for this would be, and how to solve the
>> general problem of per-backend session state. If we figure that out,
>> then we could worry about optimizing whatever needs optimizing, e.g.
>> memory usage.
>
> Yes, I think this matches my previous question --- if we are going to
> swap out session state to allow multiple sessions to multiplex in the
> same OS process, and that swapping has similar overhead to how the OS
> swaps processes, why not just let the OS continue doing the process
> swapping?
>
> I think we need to first find out what it is that makes high session
> counts slow. For example, if we swap out session state, will we still
> check the visibility rules for the swapped-out session? If not, and
> that is what makes swapping session state make Postgres faster, let's
> just find a way to skip checking visibility rules for inactive sessions
> and get the same benefit more simply.

I don't think we're really in agreement. I am pretty convinced that we can save a lot of memory and other CPU resources if we don't need a separate process for each session. I don't have any doubt that the benefit is there. My point is rather that we need an organized way to attack the problem of saving and restoring session state, not an ad-hoc approach for each particular kind of session state.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
A new version of the built-in connection pooler is attached to this mail. Now the client's startup packet is received by one of the listener workers, so the postmaster knows the database/user name of the received connection and is able to dispatch it to the proper connection pool. Right now SSL is not supported.

I have also provided a general mechanism for moving static variables to the session context. The file include/storage/sessionvars.h contains the list of such variables, which are saved to the session context on reschedule.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
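(One plausible shape for such a list is the classic X-macro pattern; the following is a self-contained sketch with invented variable names, and the real sessionvars.h in the patch may be organized differently:)

    #include <stdbool.h>

    /* sessionvars.h (hypothetical): one entry per static variable that must
     * follow the session rather than the backend.  Adding a variable to the
     * session state then means adding exactly one line here. */
    #define SESSION_VARS                        \
        SESSION_VAR(int,  session_counter)      \
        SESSION_VAR(bool, session_flag)

    /* The variables themselves, as today: plain statics in the backend. */
    #define SESSION_VAR(type, name) static type name;
    SESSION_VARS
    #undef SESSION_VAR

    /* A saved copy of every listed variable, one per session. */
    typedef struct SessionVars
    {
    #define SESSION_VAR(type, name) type name;
        SESSION_VARS
    #undef SESSION_VAR
    } SessionVars;

    /* Called for the outgoing session on reschedule. */
    static void
    save_session_vars(SessionVars *s)
    {
    #define SESSION_VAR(type, name) s->name = name;
        SESSION_VARS
    #undef SESSION_VAR
    }

    /* Called for the incoming session on reschedule. */
    static void
    restore_session_vars(const SessionVars *s)
    {
    #define SESSION_VAR(type, name) name = s->name;
        SESSION_VARS
    #undef SESSION_VAR
    }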
I continue working on the built-in connection pooler. I have implemented three strategies for distributing sessions between session pool workers (a sketch follows below):
- random
- round-robin
- load balancing (choose the worker with the minimal wait queue size)

This still does not fix the main drawback of the current implementation of the built-in pooler: a long transaction or query can block all other sessions scheduled to that backend. To prevent such situations we would have to migrate a session to some other (idle) backend. Unfortunately a session has to take a lot of "luggage" with it: serialized GUCs, prepared statements and, worst of all, temporary tables. While the first two can in principle be handled, what to do with temporary tables is unclear.

Frankly speaking, I think that the implementation of temporary tables in Postgres has to be rewritten in any case. They cause catalog bloat, can not be used in parallel queries, ... Maybe such a rewrite could also marry temporary tables better with the built-in connection pooler. But right now sessions can not be migrated.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
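(A toy sketch of the three dispatch strategies; the code is invented and simplified, not taken from the patch:)

    #include <stdlib.h>

    typedef struct Worker
    {
        int queue_size;   /* number of sessions waiting on this worker */
    } Worker;

    typedef enum
    {
        DISPATCH_RANDOM,
        DISPATCH_ROUND_ROBIN,
        DISPATCH_LOAD_BALANCE
    } DispatchPolicy;

    /* Pick the worker that should receive a newly accepted session. */
    static int
    choose_worker(const Worker *workers, int nworkers, DispatchPolicy policy)
    {
        static int rr = 0;                /* round-robin cursor */

        switch (policy)
        {
            case DISPATCH_RANDOM:
                return rand() % nworkers;

            case DISPATCH_ROUND_ROBIN:
                rr = (rr + 1) % nworkers;
                return rr;

            case DISPATCH_LOAD_BALANCE:
            {
                int best = 0;
                for (int i = 1; i < nworkers; i++)
                    if (workers[i].queue_size < workers[best].queue_size)
                        best = i;
                return best;
            }
        }
        return 0;
    }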
On 25.10.2018 11:53, Konstantin Knizhnik wrote:
> I continue working on the built-in connection pooler.
> I have implemented three strategies for distributing sessions between
> session pool workers:
> - random
> - round-robin
> - load balancing (choose the worker with the minimal wait queue size)
>
> This still does not fix the main drawback of the current implementation
> of the built-in pooler: a long transaction or query can block all other
> sessions scheduled to that backend. To prevent such situations we would
> have to migrate a session to some other (idle) backend.
> Unfortunately a session has to take a lot of "luggage" with it:
> serialized GUCs, prepared statements and, worst of all, temporary
> tables. While the first two can in principle be handled, what to do
> with temporary tables is unclear.
>
> Frankly speaking, I think that the implementation of temporary tables
> in Postgres has to be rewritten in any case. They cause catalog bloat,
> can not be used in parallel queries, ... Maybe such a rewrite could
> also marry temporary tables better with the built-in connection pooler.
> But right now sessions can not be migrated.

An updated version of the built-in connection pooler is attached, fixing an issue with exhaustion of the open file descriptor limit.

--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company