Thread: Held idle connections vs use of a Pooler

From
mark
Date:
Hello,

I am relatively new to postgres (just a few months), so apologies in
advance, and thanks for bearing with me.

I am trying to get a rough idea of the bang for the buck I might see
if I put a connection pooling service into the environment, versus our
current methodology of using persistent open connections.

We have a number of in-house applications that connect to a central
Postgres instance (8.3.7). The box is admittedly underpowered, with
only 8 cores, 8 GB of RAM, and not-great disk I/O out of an MSA-70.
The database is about 35 GB on disk and serves mainly (~95%) OLTP-type
queries. I am currently begging for more RAM.

Most of the connections from the various apps hold idle connections
until they need to execute a query; once done, they go back to holding
an open idle connection. (There are ~600 open connections at any given
time, and most of the time most are idle.)

This is typically fine while the number of active queries is low, but
another application (one that doesn't use connection pooling or hold
open connections when not in use) hits the db from time to time with
50-100 small queries (2 ms queries, from my testing) nearly all at
once. When this happens, overall response time goes out the window.


I think, from reading this list for a few weeks, the answer is to move
to a connection pooling package to better manage incoming connections,
presenting a lower number to the db.

I am told this will require some reworking of app code, as I
understand pg-pool was tried a while back in our QA environment and
several parts of various in-house apps/scripts/etc. started to
experience show-stopping problems.

To help make my case to the devs and various managers, I was wondering
if someone could expand on what extra work the server has to do while
queries run with a high number (500-600) of open yet idle connections
to the db. Lots of the queries executed use sub-transactions, if that
makes a difference.


Basically: what am I paying extra for with that many persistent
connections that I might save if I go to the effort of getting the
in-house stuff to use a connection pooler?


thank you for your time.

..: mark

Re: Held idle connections vs use of a Pooler

From
Merlin Moncure
Date:
On Tue, Sep 14, 2010 at 12:10 PM, mark <dvlhntr@gmail.com> wrote:
> Hello,
>
> I am relatively new to postgres (just a few months), so apologies in
> advance, and thanks for bearing with me.
>
> I am trying to get a rough idea of the bang for the buck I might see
> if I put a connection pooling service into the environment, versus our
> current methodology of using persistent open connections.
>
> We have a number of in-house applications that connect to a central
> Postgres instance (8.3.7). The box is admittedly underpowered, with
> only 8 cores, 8 GB of RAM, and not-great disk I/O out of an MSA-70.
> The database is about 35 GB on disk and serves mainly (~95%) OLTP-type
> queries. I am currently begging for more RAM.
>
> Most of the connections from the various apps hold idle connections
> until they need to execute a query; once done, they go back to holding
> an open idle connection. (There are ~600 open connections at any given
> time, and most of the time most are idle.)

This is IMO a strong justification for a connection pooler.  Certain
classes of problems will go away and you will have a more responsive
server under high load conditions.

> This is typically fine while the number of active queries is low, but
> another application (one that doesn't use connection pooling or hold
> open connections when not in use) hits the db from time to time with
> 50-100 small queries (2 ms queries, from my testing) nearly all at
> once. When this happens, overall response time goes out the window.
>
>
> I think, from reading this list for a few weeks, the answer is to move
> to a connection pooling package to better manage incoming connections,
> presenting a lower number to the db.
>
> I am told this will require some reworking of app code, as I
> understand pg-pool was tried a while back in our QA environment and
> several parts of various in-house apps/scripts/etc. started to
> experience show-stopping problems.

What types of problems did you have?  Performance-related, or bugs
stemming from changes in the way your pooler runs the queries?  What
kind of session-level objects (like prepared statements) do you rely
on?  The answers will affect the feasibility of using a pooler, and
which one you use.  pgbouncer in transaction mode is a great choice if
you can live with its restrictions -- it's almost completely
transparent.  I'm not nearly as familiar with pgpool.

> To help make my case to the devs and various managers, I was wondering
> if someone could expand on what extra work the server has to do while
> queries run with a high number (500-600) of open yet idle connections
> to the db. Lots of the queries executed use sub-transactions, if that
> makes a difference.

General note: queries with subtransactions (savepoints or pl/pgsql
exception handlers) are much more expensive than those without.  I
would try to batch the work in your load spike somehow, or rework
things so that retries are done in the app rather than in the database.
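The "retries in the app vs the database" suggestion can be sketched as follows. This is a minimal illustration, not code from the thread: instead of wrapping each statement in a SAVEPOINT (a subtransaction) so it can be rolled back and retried inside one big transaction, the application retries the whole unit of work in a plain, flat transaction. The stub `flaky_unit` stands in for real query code (which would run via a driver such as psycopg2).

```python
# Sketch: application-side retry instead of per-statement SAVEPOINTs.
# In real use, `work` would execute queries on a DB connection inside
# one plain transaction; a stub stands in here so the sketch is
# self-contained.

class TransientError(Exception):
    """Stand-in for a deadlock/serialization failure."""

def retry_transaction(work, attempts=3):
    """Run `work`; on transient failure, retry the whole unit.

    The database sees one cheap, flat transaction per attempt, and the
    retry bookkeeping lives in the application -- avoiding the per-
    subtransaction overhead Merlin describes.
    """
    for attempt in range(1, attempts + 1):
        try:
            return work()
        except TransientError:
            if attempt == attempts:
                raise
            # A real caller would conn.rollback() here before retrying.

# Demo: a unit of work that fails twice, then succeeds.
calls = {"n": 0}

def flaky_unit():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TransientError("simulated transient failure")
    return "committed"

result = retry_transaction(flaky_unit)  # -> "committed" on attempt 3
```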

merlin

Re: Held idle connections vs use of a Pooler

From
"Joshua D. Drake"
Date:
On Tue, 2010-09-14 at 10:10 -0600, mark wrote:
> Hello,
>
> I am relatively new to postgres (just a few months), so apologies in
> advance, and thanks for bearing with me.
>
> I am trying to get a rough idea of the bang for the buck I might see
> if I put a connection pooling service into the environment, versus our
> current methodology of using persistent open connections.

Well, what a pooler does is provide persistent open connections that
can be reused. What tech are you using for these persistent open
connections?


> Most of the connections from the various apps hold idle connections
> until they need to execute a query; once done, they go back to holding
> an open idle connection. (There are ~600 open connections at any given
> time, and most of the time most are idle.)

Sounds like each app is holding its own pool?


> I think, from reading this list for a few weeks, the answer is to move
> to a connection pooling package to better manage incoming connections,
> presenting a lower number to the db.

Correct, because each connection carries overhead. If you have 600
connections, of which only 20 are actually executing, that is highly
inefficient.

A pooler would have, say, 40 connections open, with 20 currently
executing and a max pool of 600.
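The shape Joshua describes (many clients, few server connections) can be sketched with a toy pool. This is an illustrative stand-in, not a real pooler: 100 "clients" share 4 pooled "connections", with a blocking queue doing what pgbouncer's pool does for real server connections.

```python
import queue
import threading

class TinyPool:
    """Toy fixed-size pool: callers block until a connection is free."""

    def __init__(self, size):
        self._pool = queue.Queue()
        for i in range(size):
            self._pool.put(f"conn-{i}")  # stand-in for a real connection

    def acquire(self):
        return self._pool.get()   # blocks while all connections are busy

    def release(self, conn):
        self._pool.put(conn)

pool = TinyPool(size=4)           # the "40 connections open" idea, scaled down
served = []
lock = threading.Lock()

def client(n):
    conn = pool.acquire()
    try:
        with lock:
            served.append(n)      # pretend to run a 2 ms query here
    finally:
        pool.release(conn)        # always hand the connection back

# 100 concurrent clients (the "max pool of 600" idea, scaled down)
threads = [threading.Thread(target=client, args=(i,)) for i in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All 100 clients get served even though only 4 connections ever exist, which is exactly the trade a pooler makes: queueing briefly at the pool instead of holding 600 mostly-idle server backends.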

>
> I am told this will require some re-working of some app code as I
> understand pg-pool was tried a while back in our QA environment and
> server parts of various in-house apps/scripts/..etc started to
> experience show stopping problems.

Use pgbouncer. It is what Skype uses.

Sincerely,

Joshua D. Drake


--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt

Re: Held idle connections vs use of a Pooler

From
Craig James
Date:
On 9/14/10 9:10 AM, mark wrote:
> Hello,
>
> I am relatively new to postgres (just a few months), so apologies in
> advance, and thanks for bearing with me.
>
> I am trying to get a rough idea of the bang for the buck I might see
> if I put a connection pooling service into the environment, versus our
> current methodology of using persistent open connections.
>
> We have a number of in-house applications that connect to a central
> Postgres instance (8.3.7). The box is admittedly underpowered, with
> only 8 cores, 8 GB of RAM, and not-great disk I/O out of an MSA-70.
> The database is about 35 GB on disk and serves mainly (~95%) OLTP-type
> queries. I am currently begging for more RAM.
>
> Most of the connections from the various apps hold idle connections
> until they need to execute a query; once done, they go back to holding
> an open idle connection. (There are ~600 open connections at any given
> time, and most of the time most are idle.)
>
> This is typically fine while the number of active queries is low, but
> another application (one that doesn't use connection pooling or hold
> open connections when not in use) hits the db from time to time with
> 50-100 small queries (2 ms queries, from my testing) nearly all at
> once. When this happens, overall response time goes out the window.

While connection pooling may be a good answer for you, there also appears to be a problem/bug in 8.3.x that may be
biting you.  My installation is very similar to yours (hundreds of idle "lightweight" connections, occasional heavy use
by certain apps).  Look at this thread:

   http://archives.postgresql.org/pgsql-performance/2010-04/msg00071.php

On the server that's been upgraded to 8.4.4, we're not seeing this problem.  But it's not in full production yet, so I
can't say for sure that the CPU spikes are gone.

(Unfortunately, the archives.postgresql.org HTML formatting is horrible -- why on Earth can't it wrap lines?)

Craig

>
>
> I think, from reading this list for a few weeks, the answer is to move
> to a connection pooling package to better manage incoming connections,
> presenting a lower number to the db.
>
> I am told this will require some reworking of app code, as I
> understand pg-pool was tried a while back in our QA environment and
> several parts of various in-house apps/scripts/etc. started to
> experience show-stopping problems.
>
> To help make my case to the devs and various managers, I was wondering
> if someone could expand on what extra work the server has to do while
> queries run with a high number (500-600) of open yet idle connections
> to the db. Lots of the queries executed use sub-transactions, if that
> makes a difference.
>
>
> Basically: what am I paying extra for with that many persistent
> connections that I might save if I go to the effort of getting the
> in-house stuff to use a connection pooler?
>
>
> thank you for your time.
>
> ..: mark
>

