Thread: max_connections proposal

max_connections proposal

From: Craig Ringer
There might be a very cheap and simple way to help reduce the number of
people running into problems because they set massive max_connections
values that their server cannot cope with instead of using pooling.

In the default postgresql.conf, change:

max_connections = 100                   # (change requires restart)
# Note:  Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).

to:

max_connections = 100                   # (change requires restart)
# WARNING: If you're about to increase max_connections above 100, you
# should probably be using a connection pool instead. See:
#     http://wiki.postgresql.org/max_connections
#
# Note:  Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).
#


... where wiki.postgresql.org/max_connections (which doesn't yet exist)
explains the throughput costs of too many backends and the advantages of
configuring a connection pool instead.

Sure, this somewhat contravenes the "users don't read - ever" principle,
but we can hope that _some_ people will read a comment immediately
beside the directive they're modifying.
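For readers who do go the pooling route, it may help to see what "configuring a connection pool" actually looks like. Here is a minimal PgBouncer sketch that multiplexes many client connections onto a few backends; the database name, paths, and sizes are illustrative assumptions, not recommendations:

```ini
; pgbouncer.ini -- illustrative sketch only; every value here is an
; assumption to be tuned for your own workload.
[databases]
; clients connecting to "mydb" on the pooler are routed to the real server
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432                ; applications connect here, not to 5432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction           ; backend released after each transaction
max_client_conn = 1000            ; many cheap client connections...
default_pool_size = 20            ; ...served by a handful of real backends
```

With something like this in place, the server's own max_connections can stay modest, since only about default_pool_size backends per database/user pair are ever opened.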

--
Craig Ringer

Re: max_connections proposal

From: Merlin Moncure
On Wed, May 25, 2011 at 10:58 PM, Craig Ringer
<craig@postnewspapers.com.au> wrote:
> There might be a very cheap and simple way to help reduce the number of
> people running into problems because they set massive max_connections values
> that their server cannot cope with instead of using pooling.
>
> In the default postgresql.conf, change:
>
> max_connections = 100                   # (change requires restart)
> # Note:  Increasing max_connections costs ~400 bytes of shared memory
> # per connection slot, plus lock space (see max_locks_per_transaction).
>
> to:
>
> max_connections = 100                   # (change requires restart)
> # WARNING: If you're about to increase max_connections above 100, you
> # should probably be using a connection pool instead. See:
> #     http://wiki.postgresql.org/max_connections
> #
> # Note:  Increasing max_connections costs ~400 bytes of shared memory
> # per connection slot, plus lock space (see max_locks_per_transaction).
> #
>
>
> ... where wiki.postgresql.org/max_connections (which doesn't yet exist)
> explains the throughput costs of too many backends and the advantages of
> configuring a connection pool instead.
>
> Sure, this somewhat contravenes the "users don't read - ever" principle, but
> we can hope that _some_ people will read a comment immediately beside the
> directive they're modifying.

+1 on this idea, although I'm not so sure it's a good idea to point to
the wiki.  Also, all other .conf explanation is in the standard docs,
so maybe this should be too.

merlin

Re: max_connections proposal

From: Tom Lane
Craig Ringer <craig@postnewspapers.com.au> writes:
> max_connections = 100                   # (change requires restart)
> # WARNING: If you're about to increase max_connections above 100, you
> # should probably be using a connection pool instead. See:
> #     http://wiki.postgresql.org/max_connections

This gives the impression that performance is great at 100 and falls off
a cliff at 101, which is both incorrect and likely to lower people's
opinion of the software.  I'd suggest wording more like "if you're
considering raising max_connections into the thousands, you should
probably use a connection pool instead".  And I agree with Merlin that a
wiki pointer is inappropriate.

            regards, tom lane

Re: max_connections proposal

From: Craig Ringer
On 05/26/2011 09:48 PM, Tom Lane wrote:
> Craig Ringer<craig@postnewspapers.com.au>  writes:
>> max_connections = 100                   # (change requires restart)
>> # WARNING: If you're about to increase max_connections above 100, you
>> # should probably be using a connection pool instead. See:
>> #     http://wiki.postgresql.org/max_connections
>
> This gives the impression that performance is great at 100 and falls off
> a cliff at 101, which is both incorrect and likely to lower people's
> opinion of the software.

Fair call; the use of a specific value is misleading.

> I'd suggest wording more like "if you're
> considering raising max_connections into the thousands, you should
> probably use a connection pool instead".

Best performance is often obtained with the number of _active_
connections in the 10s to 30s on commonplace hardware. I'd want to use
"hundreds" - because mailing list posts etc suggest that people start
running into problems under load at the 400-500 mark, and more
importantly because it's well worth moving to pooling _way_ before that
point.

> And I agree with Merlin that a
> wiki pointer is inappropriate.

That does make sense.

--
Craig Ringer

Re: max_connections proposal

From: Tom Lane
Craig Ringer <craig@postnewspapers.com.au> writes:
> On 05/26/2011 09:48 PM, Tom Lane wrote:
>> Craig Ringer<craig@postnewspapers.com.au>  writes:
>>> max_connections = 100                   # (change requires restart)
>>> # WARNING: If you're about to increase max_connections above 100, you
>>> # should probably be using a connection pool instead. See:
>>> #     http://wiki.postgresql.org/max_connections

>> This gives the impression that performance is great at 100 and falls off
>> a cliff at 101, which is both incorrect and likely to lower people's
>> opinion of the software.

> Fair call; the use of a specific value is misleading.

>> I'd suggest wording more like "if you're
>> considering raising max_connections into the thousands, you should
>> probably use a connection pool instead".

> Best performance is often obtained with the number of _active_
> connections in the 10s to 30s on commonplace hardware. I'd want to use
> "hundreds" - because mailing list posts etc suggest that people start
> running into problems under load at the 400-500 mark, and more
> importantly because it's well worth moving to pooling _way_ before that
> point.

OK, maybe word it as "If you're considering raising max_connections much
above 100, ..." ?

            regards, tom lane

Re: max_connections proposal

From: Cédric Villemain
2011/5/27 Tom Lane <tgl@sss.pgh.pa.us>:
> Craig Ringer <craig@postnewspapers.com.au> writes:
>> On 05/26/2011 09:48 PM, Tom Lane wrote:
>>> Craig Ringer<craig@postnewspapers.com.au>  writes:
>>>> max_connections = 100                   # (change requires restart)
>>>> # WARNING: If you're about to increase max_connections above 100, you
>>>> # should probably be using a connection pool instead. See:
>>>> #     http://wiki.postgresql.org/max_connections
>
>>> This gives the impression that performance is great at 100 and falls off
>>> a cliff at 101, which is both incorrect and likely to lower people's
>>> opinion of the software.
>
>> Fair call; the use of a specific value is misleading.
>
>>> I'd suggest wording more like "if you're
>>> considering raising max_connections into the thousands, you should
>>> probably use a connection pool instead".
>
>> Best performance is often obtained with the number of _active_
>> connections in the 10s to 30s on commonplace hardware. I'd want to use
>> "hundreds" - because mailing list posts etc suggest that people start
>> running into problems under load at the 400-500 mark, and more
>> importantly because it's well worth moving to pooling _way_ before that
>> point.
>
> OK, maybe word it as "If you're considering raising max_connections much
> above 100, ..." ?

"Be aware that a too large value can be counter-productive and a
connection pooler can be more appropriate."

No scale... I am really happy to face more and more servers where
'top'  truncate the list of processors... We will have to scale and
not make that limitation a feature, imho.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: max_connections proposal

From: Stuart Bishop
On Fri, May 27, 2011 at 6:22 AM, Craig Ringer
<craig@postnewspapers.com.au> wrote:

> Best performance is often obtained with the number of _active_ connections
> in the 10s to 30s on commonplace hardware. I'd want to use "hundreds" -
> because mailing list posts etc suggest that people start running into
> problems under load at the 400-500 mark, and more importantly because it's
> well worth moving to pooling _way_ before that point.

If you can. I'd love a connection pool that knows when I have a
resource that persists across transactions, like a cursor or temporary
table, so the backend connection must be maintained between
transactions; and when there are no such resources, so the backend
connection can be released to the pool between transactions. I suspect
this sort of pool would need to be built into the core. At the moment I
only see a benefit from a pool for connections from my webapp, which I
know can safely go through pgbouncer in transaction pooling mode.

Or would there be some way of detecting whether the current session has
access to resources that persist across transactions, so that this
feature could be added to the existing connection pools?
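Existing poolers don't inspect backend state this way, but PostgreSQL's catalog views do expose most of the session-lifetime resources in question, so a pooler (or the application itself) could in principle probe for them before releasing a backend. A speculative sketch:

```sql
-- Speculative probe: does the current backend hold state that outlives
-- a transaction?  (WITH HOLD cursors, prepared statements, temp tables)
SELECT
  (SELECT count(*) FROM pg_cursors WHERE is_holdable)  AS held_cursors,
  (SELECT count(*) FROM pg_prepared_statements)        AS prepared_statements,
  (SELECT count(*) FROM pg_class
     WHERE relnamespace = pg_my_temp_schema()
       AND relkind = 'r')                              AS temp_tables;
-- Only if all three counts are zero would it be safe to hand the
-- backend to another client between transactions.
```

Non-holdable cursors vanish at commit anyway, so they aren't a problem; session-level advisory locks (visible in pg_locks with locktype = 'advisory') would also need checking.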


--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/

Re: max_connections proposal

From: Greg Smith
On 05/25/2011 10:58 PM, Craig Ringer wrote:
> max_connections = 100                   # (change requires restart)
> # WARNING: If you're about to increase max_connections above 100, you
> # should probably be using a connection pool instead. See:
> #     http://wiki.postgresql.org/max_connections
> #
> # Note:  Increasing max_connections costs ~400 bytes of shared memory
> # per connection slot, plus lock space (see max_locks_per_transaction).

While it's tempting to do something this simple, the most useful path
is probably to address this head-on and comprehensively in the docs
instead.  Discussion on the hackers list seems to have concluded that
building connection pooling into the database isn't a high priority
relative to other development, because it's addressed so well by
external projects.  Pointing people toward them seems quite appropriate
given that position.  Really addressing this well would take the
following steps:

-Add a section to the external projects part of the documentation
(http://www.postgresql.org/docs/current/interactive/external-projects.html)
introducing connection pooling as a useful type of additional software
to add.  Shouldn't be controversial to suggest pgbouncer and pgpool-II
as examples there.

-Expand the documentation on max_connections to warn about how snapshot
visibility overhead makes extremely large numbers of connections
impractical.

-Also expand the documentation to suggest that CPU switching
inefficiency may make a much smaller number of connections than expected
optimal, and point toward the external project section for more
information about pooling.

-Add a warning to the postgresql.conf suggesting people read the
documentation for max_connections before increasing this value.

This area comes up often enough that I've made a discussion of it part
of even my shortest talk about PostgreSQL performance issues to be wary
of.  There's a good documentation patch project for somebody here; I
just haven't had time to get to it yet.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: max_connections proposal

From: Edison So
Can anyone tell me: if max_connections is above 100, will the server use pooling instead?

For all participants in this particular discussion, what is a reasonable value for max_connections without causing any harm to the Postgres 9.0 server?

I am a novice Postgres user, so any advice is always welcome.
 
Thanks,

On Wed, May 25, 2011 at 10:58 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
There might be a very cheap and simple way to help reduce the number of people running into problems because they set massive max_connections values that their server cannot cope with instead of using pooling.

In the default postgresql.conf, change:

max_connections = 100                   # (change requires restart)
# Note:  Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).

to:

max_connections = 100                   # (change requires restart)
# WARNING: If you're about to increase max_connections above 100, you
# should probably be using a connection pool instead. See:
#     http://wiki.postgresql.org/max_connections
#
# Note:  Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).
#


... where wiki.postgresql.org/max_connections (which doesn't yet exist) explains the throughput costs of too many backends and the advantages of configuring a connection pool instead.

Sure, this somewhat contravenes the "users don't read - ever" principle, but we can hope that _some_ people will read a comment immediately beside the directive they're modifying.

--
Craig Ringer




--
Edison

Re: max_connections proposal

From: Craig Ringer
On 29/05/2011 10:44 AM, Edison So wrote:
> Can anyone tell me that if the max_connections is above 100, the server
> will use pooling instead?

No. PostgreSQL does not have any built-in connection pooling, that was
the point of the suggestion, to advise people that they might want to
consider it.

You should _consider_ using connection pooling instead of high numbers
of connections if your application is suitable. You will usually get
better throughput and often get better overall query latency if you
configure lower max_connections and then use a connection pool like
pgbouncer or PgPool-II.

Many people using high max_connections are using PHP and pg_pconnect.
Those people should particularly consider using a connection pool
instead of increasing max_connections . Most people who have performance
issues due to overload seem to have this setup.

A few features aren't suitable for pooling, including LISTEN/NOTIFY,
advisory locking, and named server-side prepared statements (explicit
SQL "PREPARE").

> For all participants in this particular discussion, what is the reasonable
> value for max_connections without causing any harm to the Postgres 9.0
> server.

It's dependent on your workload, the capacity of your server, whether
requests come in batches or continuously, and all sorts of other things.
That's why Tom (wisely) pointed out that naming a number was a really
bad idea, even if it was intended only as a vague hint.

Some people on this list clearly run production servers with
max_connections in the several-hundreds without any problems. Others
have posted asking for help with server load, stalls and memory
exhaustion when using only 250 connections.

There's a big difference between an Amazon EC2 node and a real server
with a local, big, fast RAID10 array. The former might practically melt
down with a configuration that would not be enough to push the latter
even close to its limits.

I'm beginning to suspect that the comment I suggested is a bad idea as
currently constructed. Maybe the problem cannot even be hinted at in a
single short paragraph without creating more confusion than it solves.
Something is needed, but perhaps it should just be a pointer to the
documentation:

max_connections = 50
# Thinking of increasing this? Read http://some-documentation-url first!


--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: max_connections proposal

From: Craig Ringer
On 29/05/2011 4:39 PM, Craig Ringer wrote:
> On 29/05/2011 10:44 AM, Edison So wrote:
>> Can anyone tell me that if the max_connections is above 100, the server
>> will use pooling instead?
>
> No. PostgreSQL does not have any built-in connection pooling, that was
> the point of the suggestion, to advise people that they might want to
> consider it.

Whoops, bad wording.

"That was the point of my original suggestion: to advise people that
they might want to consider configuring a third-party connection pool
like PgPool-II or PgBouncer instead of greatly increasing max_connections ".

--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/

Re: max_connections proposal

From: Edison So
Thanks Craig for your comprehensive explanation, although I do not understand everything you said, such as pgbouncer and pg_pconnect. I have just started to use Postgres 9.0 with no prior training.

I live in Canada, and where I live there is no instructor-led training on Postgres 9.0 with replication. Can you tell me where I can get one? I just want a Postgres 9.x administration course which also covers built-in replication and Slony. I do not mind traveling to the USA for it.
 
Thanks,

On Sun, May 29, 2011 at 4:39 AM, Craig Ringer <craig@postnewspapers.com.au> wrote:
On 29/05/2011 10:44 AM, Edison So wrote:
Can anyone tell me that if the max_connections is above 100, the server
will use pooling instead?

No. PostgreSQL does not have any built-in connection pooling, that was the point of the suggestion, to advise people that they might want to consider it.

You should _consider_ using connection pooling instead of high numbers of connections if your application is suitable. You will usually get better throughput and often get better overall query latency if you configure lower max_connections and then use a connection pool like pgbouncer or PgPool-II.

Many people using high max_connections are using PHP and pg_pconnect. Those people should particularly consider using a connection pool instead of increasing max_connections . Most people who have performance issues due to overload seem to have this setup.

A few features aren't suitable for pooling, including LISTEN/NOTIFY, advisory locking, and named server-side prepared statements (explicit SQL "PREPARE").


For all participants in this particular discussion, what is the reasonable
value for max_connections without causing any harm to the Postgres 9.0
server.

It's dependent on your workload, the capacity of your server, whether requests come in batches or continuously, and all sorts of other things. That's why Tom (wisely) pointed out that naming a number was a really bad idea, even if it was intended only as a vague hint.

Some people on this list clearly run production servers with max_connections in the several-hundreds without any problems. Others have posted asking for help with server load, stalls and memory exhaustion when using only 250 connections.

There's a big difference between an Amazon EC2 node and a real server with a local, big, fast RAID10 array. The former might practically melt down with a configuration that would not be enough to push the latter even close to its limits.

I'm beginning to suspect that the comment I suggested is a bad idea as currently constructed. Maybe the problem cannot even be hinted at in a single short paragraph without creating more confusion than it solves. Something is needed, but perhaps it should just be a pointer to the documentation:

max_connections = 50
# Thinking of increasing this? Read http://some-documentation-url first!


--
Craig Ringer

Tech-related writing at http://soapyfrogs.blogspot.com/



--
Edison

Re: max_connections proposal

From: Craig Ringer
On 05/30/2011 03:26 AM, Edison So wrote:
> Thanks Craig for your comprehensive explanation, although I do not
> understand everything you said, such as pgbouncer and pg_pconnect. I
> have just started to use Postgres 9.0 with no prior training.

Google is great :-)

http://www.postgresql.org/docs/current/static/

http://pgfoundry.org/projects/pgbouncer/
http://wiki.postgresql.org/wiki/PgBouncer

http://pgpool.projects.postgresql.org/

http://www.postgresql.org/docs/9.0/static/runtime-config-connection.html#GUC-MAX-CONNECTIONS

> I live in Canada and where I live has no instructor-led training on
> Postgres 9.0 with replication. Can you tell where I can get one.

I don't have any involvement with formal training or know anything about
it. Others may.

I think most people here just use the (IMO excellent) documentation, the
wiki, experimentation, and the mailing lists.

You might be able to find offerings from some of the professional
support people on this list:

   http://www.postgresql.org/support/professional_support

or via Google.

--
Craig Ringer

Re: max_connections proposal

From: Edison So
Thanks Craig for the links. You have been very helpful.

When I get time, I will definitely read over the materials to get familiar with Postgres.

Have a wonderful night.
 
Edison

On Sun, May 29, 2011 at 7:27 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
On 05/30/2011 03:26 AM, Edison So wrote:
Thanks Craig for your comprehensive explanation, although I do not
understand everything you said, such as pgbouncer and pg_pconnect. I
have just started to use Postgres 9.0 with no prior training.

Google is great :-)

http://www.postgresql.org/docs/current/static/

http://pgfoundry.org/projects/pgbouncer/
http://wiki.postgresql.org/wiki/PgBouncer

http://pgpool.projects.postgresql.org/

http://www.postgresql.org/docs/9.0/static/runtime-config-connection.html#GUC-MAX-CONNECTIONS


I live in Canada and where I live has no instructor-led training on
Postgres 9.0 with replication. Can you tell where I can get one.

I don't have any involvement with formal training or know anything about it. Others may.

I think most people here just use the (IMO excellent) documentation, the wiki, experimentation, and the mailing lists.

You might be able to find offerings from some of the professional support people on this list:

 http://www.postgresql.org/support/professional_support

or via Google.

--
Craig Ringer



--
Edison

Re: max_connections proposal

From: Scott Marlowe
On Thu, May 26, 2011 at 5:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> OK, maybe word it as "If you're considering raising max_connections much
> above 100, ..." ?

I think it can be even shorter and to the point:

If you're considering raising max_connections, consider pooling instead.