Re: Built-in connection pooling - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Built-in connection pooling
Date
Msg-id b6763aa0-765a-8b08-e78d-01e401a7bd98@2ndquadrant.com
Whole thread Raw
In response to Re: Built-in connection pooling  (Konstantin Knizhnik <k.knizhnik@postgrespro.ru>)
Responses Re: Built-in connection pooling
List pgsql-hackers

On 01/22/2018 05:05 PM, Konstantin Knizhnik wrote:
> 
> 
> On 19.01.2018 20:28, Tomas Vondra wrote:
>>>
>>> With pgbouncer you will never be able to use prepared statements which
>>> slows down simple queries almost twice (unless my patch with
>>> autoprepared statements is committed).
>>>
>> I don't see why that wouldn't be possible? Perhaps not for prepared
>> statements with simple protocol, but I'm pretty sure it's doable for
>> extended protocol (which seems like a reasonable limitation).
>>
>> That being said, I think it's a mistake to turn this thread into a
>> pgbouncer vs. the world battle. I could name things that are possible
>> only with standalone connection pool - e.g. pausing connections and
>> restarting the database without interrupting the clients.
>>
>> But that does not mean built-in connection pool is not useful.
>>
>>
>> regards
>>
> 
> Sorry, I do not understand how extended protocol can help to handle 
> prepared statements without shared prepared statement cache or
> built-in connection pooling.
>

The extended protocol makes it easy for pgbouncer (or any other proxy)
to identify prepared statements, so that it can track (a) which prepared
statements a client defined, and (b) what prepared statements are
defined on a connection. And then do something when a client gets
assigned a connection missing some of those.

I do not claim doing this would be trivial, but I don't see why would
that be impossible.

Of course, the built-in pool can handle this in different ways, as it
has access to the internal caches.

> The problems is that now in Postgres most of caches including catalog
> cache, relation cache, prepared statements cache are private to a backend.

True. I wouldn't say it's a "problem" but it's certainly a challenge for
certain features.

> There is certainly one big advantage of such approach: no need to
> synchronize access to the cache. But it seems to be the only advantage.
> And there are a lot of drawbacks:
> inefficient use of memory, complex invalidation mechanism, not
> compatible with connection pooling...
> 

Perhaps. I personally see the minimal synchronization as a quite
valuable feature.

> So there are three possible ways (may be more, but I know only three):
> 1. Implement built-in connection pooling which will be aware of proper
> use of local caches. This is what I have implemented with the proposed
> approach.
> 2. Implicit autoprepare. Clients will not be able to use standard
> Postgres prepare mechanism, but executor will try to generate generic
> plan for ordinary queries. My implementation of this approach is at
> commit fest.
> 3. Global caches. It seems to be the best solution but the most
> difficult to implement.
> 

Perhaps.

> Actually I think that the discussion about the value of built-in
> connection pooling is very important.

I agree, and I wasn't speaking against built-in connection pooling.

> Yes, external connection pooling is more flexible. It allows to 
> perform pooling either at client side either at server side (or even 
> combine two approaches).>
> Also external connection pooling for PostgreSQL is not limited by 
> pgbouncer/pgpool.>
> There are many frameworks maintaining their own connection pool, for 
> example J2EE, jboss, hibernate,...>
> I have a filling than about 70% of enterprise systems working with 
> databases are written in Java and doing connection pooling in their 
> own way.>

True, but that does not really mean we don't need "our" connection
pooling (built-in or not). The connection pools are usually built into
the application servers, so each application server has their own
independent pool. With larger deployments (a couple of application
servers) that quickly causes problems with max_connections.

> So may be embedded connection pooling is not needed for such
> applications...
>
> But what I have heard from main people is that Postgres' poor
> connection pooling is one of the main drawbacks of Postgres
> complicating it's usage in enterprise environments.
> 

Maybe. I'm sure there's room for improvement.

That being said, when enterprise developers tell me PostgreSQL is
missing some feature, 99% of the time it turns out they're doing
something quite stupid.

> In any case please find updated patch with some code cleanup and
> more comments added.
> 

OK, will look.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Peter Eisentraut
Date:
Subject: Re: Remove PARTIAL_LINKING?
Next
From: Peter Eisentraut
Date:
Subject: Re: Add %r substitution for psql prompts to show recovery status