Thread: max_connections proposal
There might be a very cheap and simple way to help reduce the number of people running into problems because they set massive max_connections values that their server cannot cope with instead of using pooling.

In the default postgresql.conf, change:

max_connections = 100			# (change requires restart)
# Note: Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).

to:

max_connections = 100			# (change requires restart)
# WARNING: If you're about to increase max_connections above 100, you
# should probably be using a connection pool instead. See:
#   http://wiki.postgresql.org/max_connections
#
# Note: Increasing max_connections costs ~400 bytes of shared memory
# per connection slot, plus lock space (see max_locks_per_transaction).

... where wiki.postgresql.org/max_connections (which doesn't yet exist) explains the throughput costs of too many backends and the advantages of configuring a connection pool instead.

Sure, this somewhat contravenes the "users don't read - ever" principle, but we can hope that _some_ people will read a comment immediately beside the directive they're modifying.

--
Craig Ringer
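For scale, the "~400 bytes of shared memory per connection slot" figure from the stock comment can be turned into a quick back-of-the-envelope calculation (a sketch only; the per-slot figure is approximate, the helper name is mine, and this deliberately ignores lock space and per-backend work_mem, which dominate the real cost):

```python
# Rough sketch: shared memory consumed by connection slots alone, using
# the ~400 bytes/slot figure from the stock postgresql.conf comment.
# The point of the thread is that slot memory is tiny -- the real cost
# of many connections is throughput, not these bytes.

BYTES_PER_SLOT = 400  # approximate, per the postgresql.conf comment

def slot_memory_kb(max_connections: int) -> float:
    """Shared memory used by connection slots, in kilobytes."""
    return max_connections * BYTES_PER_SLOT / 1024

for n in (100, 500, 5000):
    print(f"max_connections = {n:5d}: ~{slot_memory_kb(n):.1f} kB of slot memory")
```

Even 5000 slots is under 2 MB, which is why the warning comment focuses on throughput rather than memory.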
On Wed, May 25, 2011 at 10:58 PM, Craig Ringer <craig@postnewspapers.com.au> wrote:
> There might be a very cheap and simple way to help reduce the number of
> people running into problems because they set massive max_connections values
> that their server cannot cope with instead of using pooling.
>
> In the default postgresql.conf, change:
>
> max_connections = 100			# (change requires restart)
> # Note: Increasing max_connections costs ~400 bytes of shared memory
> # per connection slot, plus lock space (see max_locks_per_transaction).
>
> to:
>
> max_connections = 100			# (change requires restart)
> # WARNING: If you're about to increase max_connections above 100, you
> # should probably be using a connection pool instead. See:
> #   http://wiki.postgresql.org/max_connections
> #
> # Note: Increasing max_connections costs ~400 bytes of shared memory
> # per connection slot, plus lock space (see max_locks_per_transaction).
>
> ... where wiki.postgresql.org/max_connections (which doesn't yet exist)
> explains the throughput costs of too many backends and the advantages of
> configuring a connection pool instead.
>
> Sure, this somewhat contravenes the "users don't read - ever" principle, but
> we can hope that _some_ people will read a comment immediately beside the
> directive they're modifying.

+1 on this idea, although I'm not so sure it's a good idea to point to the wiki. Also, all other .conf explanation is in the standard docs, so maybe this should be too.

merlin
Craig Ringer <craig@postnewspapers.com.au> writes:
> max_connections = 100			# (change requires restart)
> # WARNING: If you're about to increase max_connections above 100, you
> # should probably be using a connection pool instead. See:
> #   http://wiki.postgresql.org/max_connections

This gives the impression that performance is great at 100 and falls off a cliff at 101, which is both incorrect and likely to lower peoples' opinion of the software. I'd suggest wording more like "if you're considering raising max_connections into the thousands, you should probably use a connection pool instead".

And I agree with Merlin that a wiki pointer is inappropriate.

			regards, tom lane
On 05/26/2011 09:48 PM, Tom Lane wrote:
> Craig Ringer <craig@postnewspapers.com.au> writes:
>> max_connections = 100			# (change requires restart)
>> # WARNING: If you're about to increase max_connections above 100, you
>> # should probably be using a connection pool instead. See:
>> #   http://wiki.postgresql.org/max_connections
>
> This gives the impression that performance is great at 100 and falls off
> a cliff at 101, which is both incorrect and likely to lower peoples'
> opinion of the software.

Fair call; the use of a specific value is misleading.

> I'd suggest wording more like "if you're
> considering raising max_connections into the thousands, you should
> probably use a connection pool instead".

Best performance is often obtained with the number of _active_ connections in the 10s to 30s on commonplace hardware. I'd want to use "hundreds" - because mailing list posts etc suggest that people start running into problems under load at the 400-500 mark, and more importantly because it's well worth moving to pooling _way_ before that point.

> And I agree with Merlin that a
> wiki pointer is inappropriate.

That does make sense.

--
Craig Ringer
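The "tens of active connections on commonplace hardware" figure lines up with a rule of thumb that circulates in PostgreSQL tuning material: pool size roughly equals (cores x 2) + effective spindle count. A hedged sketch (the formula is a community heuristic, not something established in this thread; treat the output as a benchmarking starting point, not an answer):

```python
# Heuristic pool sizing, NOT a hard rule: the widely circulated
# community formula is roughly (core_count * 2) + effective_spindle_count.

def suggested_pool_size(core_count: int, effective_spindles: int) -> int:
    """Starting-point pool size for benchmarking, per the heuristic."""
    return core_count * 2 + effective_spindles

# A hypothetical 8-core box with a 4-spindle RAID array:
print(suggested_pool_size(8, 4))  # 20 -- squarely in the "10s to 30s" range
```

Note that this sizes the *pool* (active backends), not max_connections; the pooler can accept far more client connections than that.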
Craig Ringer <craig@postnewspapers.com.au> writes:
> On 05/26/2011 09:48 PM, Tom Lane wrote:
>> Craig Ringer <craig@postnewspapers.com.au> writes:
>>> max_connections = 100			# (change requires restart)
>>> # WARNING: If you're about to increase max_connections above 100, you
>>> # should probably be using a connection pool instead. See:
>>> #   http://wiki.postgresql.org/max_connections

>> This gives the impression that performance is great at 100 and falls off
>> a cliff at 101, which is both incorrect and likely to lower peoples'
>> opinion of the software.

> Fair call; the use of a specific value is misleading.

>> I'd suggest wording more like "if you're
>> considering raising max_connections into the thousands, you should
>> probably use a connection pool instead".

> Best performance is often obtained with the number of _active_
> connections in the 10s to 30s on commonplace hardware. I'd want to use
> "hundreds" - because mailing list posts etc suggest that people start
> running into problems under load at the 400-500 mark, and more
> importantly because it's well worth moving to pooling _way_ before that
> point.

OK, maybe word it as "If you're considering raising max_connections much above 100, ..." ?

			regards, tom lane
2011/5/27 Tom Lane <tgl@sss.pgh.pa.us>:
> Craig Ringer <craig@postnewspapers.com.au> writes:
>> Best performance is often obtained with the number of _active_
>> connections in the 10s to 30s on commonplace hardware. I'd want to use
>> "hundreds" - because mailing list posts etc suggest that people start
>> running into problems under load at the 400-500 mark, and more
>> importantly because it's well worth moving to pooling _way_ before that
>> point.
>
> OK, maybe word it as "If you're considering raising max_connections much
> above 100, ..." ?

"Be aware that a value that is too large can be counter-productive, and a connection pooler can be more appropriate." No scale...

I am really happy to face more and more servers where 'top' truncates the list of processes... We will have to scale, and not make that limitation a feature, imho.

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support
On Fri, May 27, 2011 at 6:22 AM, Craig Ringer <craig@postnewspapers.com.au> wrote:
> Best performance is often obtained with the number of _active_ connections
> in the 10s to 30s on commonplace hardware. I'd want to use "hundreds" -
> because mailing list posts etc suggest that people start running into
> problems under load at the 400-500 mark, and more importantly because it's
> well worth moving to pooling _way_ before that point.

If you can. I'd love a connection pool that knows when I have a resource that persists across transactions, like a cursor or temporary table, so the backend connection needs to be maintained between transactions; or, if there are no such resources, the backend connection can be released to the pool between transactions. I suspect this sort of pool would need to be built into the core.

At the moment I only see a benefit with a pool for connections from my webapp, which I know can safely go through pgbouncer in transaction pooling mode.

Or would there be some way of detecting if the current session has access to stuff that persists across transactions, so this feature could be added to the existing connection pools?

--
Stuart Bishop <stuart@stuartbishop.net>
http://www.stuartbishop.net/
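For what it's worth, some of the session-persistent state in question is visible from SQL, so an external pooler could in principle check for it before releasing a backend. A sketch only: these catalog views exist in PostgreSQL 9.0, but this doesn't catch everything (e.g. session GUC changes), and polling per transaction has its own cost.

```sql
-- Sketch: does the *current session* hold state that must survive
-- the transaction boundary? Each query returns rows only if so.

-- Named server-side prepared statements (SQL-level PREPARE):
SELECT name FROM pg_prepared_statements;

-- Open holdable cursors (declared WITH HOLD) in this session:
SELECT name FROM pg_cursors WHERE is_holdable;

-- Temporary tables belonging to this session:
SELECT c.relname
FROM pg_class c
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE n.nspname LIKE 'pg_temp%'
  AND pg_table_is_visible(c.oid);
```

A pooler sitting outside the backend would have to issue these as extra round trips, which is part of why Stuart suspects this belongs in core.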
On 05/25/2011 10:58 PM, Craig Ringer wrote:
> max_connections = 100			# (change requires restart)
> # WARNING: If you're about to increase max_connections above 100, you
> # should probably be using a connection pool instead. See:
> #   http://wiki.postgresql.org/max_connections
> #
> # Note: Increasing max_connections costs ~400 bytes of shared memory
> # per connection slot, plus lock space (see max_locks_per_transaction).

While tempting to do something this simple, the most useful path to follow is probably to nail this head-on and comprehensively in the docs instead. Discussion of this topic on the hackers list seems to have concluded that connection pooling isn't as vital to do inside the database, as a high priority relative to other development, because it's addressed so well via external projects. Pointing people toward them seems quite appropriate given that position. Really addressing this well would take the following steps:

-Add a section to the external projects section of the documentation: http://www.postgresql.org/docs/current/interactive/external-projects.html introducing connection pooling as a useful type of additional software to add. Shouldn't be controversial to suggest pgbouncer and pgpool-II as examples there.

-Expand the documentation on max_connections to warn about how snapshot visibility overhead makes extremely large numbers of connections impractical.

-Also expand the documentation to suggest that CPU switching inefficiency may make a much smaller number of connections than expected optimal, and point toward the external projects section for more information about pooling.

-Add a warning to postgresql.conf suggesting people read the documentation for max_connections before increasing this value.

This area pops up enough that I've made a discussion of it part of even my shortest talk about PostgreSQL performance issues to be wary of. There's a good documentation patch project for somebody here, I just haven't had time to get to it yet.
--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
Can anyone tell me whether, if max_connections is above 100, the server will use pooling instead?
For all participants in this particular discussion: what is a reasonable value for max_connections without causing any harm to a Postgres 9.0 server?
I am a novice Postgres user, so any advice is always welcome.
Thanks,
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
Edison
On 29/05/2011 10:44 AM, Edison So wrote:
> Can anyone tell me that if the max_connections is above 100, the server
> will use pooling instead?

No. PostgreSQL does not have any built-in connection pooling; that was the point of the suggestion, to advise people that they might want to consider it.

You should _consider_ using connection pooling instead of high numbers of connections if your application is suitable. You will usually get better throughput, and often better overall query latency, if you configure lower max_connections and then use a connection pool like pgbouncer or PgPool-II.

Many people using high max_connections are using PHP and pg_pconnect. Those people should particularly consider using a connection pool instead of increasing max_connections. Most people who have performance issues due to overload seem to have this setup.

A few features aren't suitable for pooling, including LISTEN/NOTIFY, advisory locking, and named server-side prepared statements (explicit SQL "PREPARE").

> For all participants in this particular discussion, what is the reasonable
> value for max_connections without causing any harm to the Postgres 9.0
> server.

It's dependent on your workload, the capacity of your server, whether requests come in batches or continuously, and all sorts of other things. That's why Tom (wisely) pointed out that naming a number was a really bad idea, even if it was intended only as a vague hint.

Some people on this list clearly run production servers with max_connections in the several-hundreds without any problems. Others have posted asking for help with server load, stalls and memory exhaustion when using only 250 connections. There's a big difference between an Amazon EC2 node and a real server with a local, big, fast RAID10 array. The former might practically melt down with a configuration that would not be enough to push the latter even close to its limits.
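As an illustration of the kind of setup being recommended, a minimal PgBouncer configuration in transaction-pooling mode might look like the following (a sketch only: the database name, addresses and sizes are placeholders, and the full option list is in the PgBouncer documentation):

```ini
; pgbouncer.ini -- minimal sketch, transaction pooling.
; Clients connect to pgbouncer on port 6432; it multiplexes them onto a
; much smaller number of real server connections.

[databases]
; "mydb" is a placeholder database name.
mydb = host=127.0.0.1 port=5432 dbname=mydb

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = trust            ; fine for a local sketch; use md5 in production
pool_mode = transaction      ; release server connection at transaction end
default_pool_size = 20       ; real backends per database/user pair
max_client_conn = 500        ; client sockets pgbouncer will accept
```

With something like this, the server's max_connections can stay modest while hundreds of clients connect to the pooler, which is the tradeoff described above.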
I'm beginning to suspect that the comment I suggested is a bad idea as currently constructed. Maybe the problem cannot even be hinted at in a single short paragraph without creating more confusion than it solves. Something is needed, but perhaps it should just be a pointer to the documentation:

max_connections = 50	# Thinking of increasing this? Read http://some-documentation-url first!

--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
On 29/05/2011 4:39 PM, Craig Ringer wrote:
> On 29/05/2011 10:44 AM, Edison So wrote:
>> Can anyone tell me that if the max_connections is above 100, the server
>> will use pooling instead?
>
> No. PostgreSQL does not have any built-in connection pooling, that was
> the point of the suggestion, to advise people that they might want to
> consider it.

Whoops, bad wording. "That was the point of my original suggestion: to advise people that they might want to consider configuring a third-party connection pool like PgPool-II or PgBouncer instead of greatly increasing max_connections."

--
Craig Ringer
Tech-related writing at http://soapyfrogs.blogspot.com/
Thanks Craig for your comprehensive explanation, although I do not understand everything you said, such as pgbouncer and pg_pconnect. I have just started to use Postgres 9.0 with no prior training.
I live in Canada, and where I live there is no instructor-led training on Postgres 9.0 with replication. Can you tell me where I can get some? I just want a Postgres 9.x administration course that also covers built-in replication and Slony. I do not mind traveling to the USA for it.
Thanks,
--
Edison
On 05/30/2011 03:26 AM, Edison So wrote:
> Thanks Craig for your comprehensive explanation, although I do not
> understand everything you said, such as pgbouncer and pg_pconnect. I
> have just started to use Postgres 9.0 with no prior training.

Google is great :-)

http://www.postgresql.org/docs/current/static/
http://pgfoundry.org/projects/pgbouncer/
http://wiki.postgresql.org/wiki/PgBouncer
http://pgpool.projects.postgresql.org/
http://www.postgresql.org/docs/9.0/static/runtime-config-connection.html#GUC-MAX-CONNECTIONS

> I live in Canada and where I live has no instructor-led training on
> Postgres 9.0 with replication. Can you tell where I can get one.

I don't have any involvement with formal training or know anything about it. Others may. I think most people here just use the (IMO excellent) documentation, the wiki, experimentation, and the mailing lists.

You might be able to find offerings from some of the professional support people on this list:

http://www.postgresql.org/support/professional_support

or via Google.

--
Craig Ringer
Thanks Craig for the links. You have been very helpful.
When I get time, I will definitely read over the materials to get familiar with Postgres.
Have a wonderful night.
Edison
--
Edison
On Thu, May 26, 2011 at 5:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> OK, maybe word it as "If you're considering raising max_connections much
> above 100, ..." ?

I think it can be even shorter and to the point:

If you're considering raising max_connections, consider pooling instead.