Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17) - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
Date
Msg-id CA+Tgmob=dROutVwtsAJEz97ZhwgBiGOsjtFBbgkqsCWdBvMb-A@mail.gmail.com
In response to Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)  ("Kevin Grittner" <Kevin.Grittner@wicourts.gov>)
Responses Re: Wiki link for max_connections? (Fwd: Re: [ADMIN] PostgreSQL oom_adj postmaster process to -17)
List pgsql-hackers
On Thu, Aug 9, 2012 at 1:11 PM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> I didn't figure it was; my emphasis was because this has been raised
> before and nothing happened for want of a consensus on what
> particular wording should be used, so users were left with no
> guidance.  I don't want this to continue to be a victim of "the
> perfect is the enemy of the good" syndrome.

So, to get the ball rolling here, I spent some time on this today, and
added a paragraph to the Linux Memory Overcommit section of the
documentation.  I back-patched it to 9.0.  There were additional
merge conflicts in 8.4 which I did not bother to resolve.  There
may be room to further improve what I did here; suggestions are
welcome.  I think we probably still need to add something to the
max_connections documentation; I have not done that.
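
To make the knobs concrete (this is not from the docs patch, just an
illustrative sketch for a Linux box, and the data directory path is an
assumption), the two settings in play can be inspected like this:

    # Sketch: inspect the Linux overcommit setting and the postmaster's
    # OOM-killer adjustment.  Paths are standard procfs locations; the
    # postmaster.pid location assumes a typical packaged layout.
    def read(path):
        with open(path) as f:
            return f.read().strip()

    # 2 means "don't overcommit", which is what the docs recommend for a
    # dedicated database server.
    print("vm.overcommit_memory =", read("/proc/sys/vm/overcommit_memory"))

    # The postmaster can be shielded from the OOM killer by writing -17
    # here (oom_adj on older kernels; newer ones use oom_score_adj).
    pid = read("/var/lib/pgsql/data/postmaster.pid").splitlines()[0]
    print("postmaster oom_adj =", read("/proc/%s/oom_adj" % pid))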

>> Also, I am a bit doubtful about the advice on sizing the
>> connection pool as applied to small servers:
>> surely it's not sane to recommend that a single-processor system
>> with one disk should have max_connections = 3.  At least, *I*
>> don't think that's sane.
>
> I'm not sure it's wrong when combined with this: "Remember that this
> "sweet spot" is for the number of connections that are actively
> doing work.  ...  You should always make max_connections a bit
> bigger than the number of connections you enable in your connection
> pool. That way there are always a few slots available for direct
> connections for system maintenance and monitoring."  Where would you
> expect the knee to be for connections concurrently actively doing
> work on a single-core, single-drive system?

I don't know.  But my experience with our customers is that people are
often forced to set the size of the connection pool far larger than
what that formula would suggest.  Many people are doing
transaction-level pooling, and those people have got to look at how
many multi-statement transactions they've got and think about
what the peak value for that quantity is.  It's still worth using
pooling because it reduces the number of simultaneous connections, but
it's not going to reduce it to the kind of values you're talking
about.  Also, consider that transactions aren't all the same length.
Suppose 90% of your queries execute in 50ms, and 10% execute in 60s.
Even though it's less efficient, you've got to set the connection pool
large enough that at least some of the 50ms queries can continue to
get processed even if the maximum number of 60s queries that you ever
expect to see in parallel are already running.  This may seem like a
theoretical problem, but we have customers who use connection pools to
get the number of simultaneous connections down to, say, 800.  I
guarantee you that these people do not have 200 CPUs and 400 disks,
but they're smart people and they find that smaller pool sizes don't
work.
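
To put some made-up numbers on that (purely illustrative, not from any
particular customer): if you know roughly how many transactions per
second flow through the pool and how long the slow ones run, Little's
law already gives you a floor on the pool size that has nothing to do
with CPUs or disks:

    # Back-of-the-envelope pool sizing with a mix of fast and slow
    # transactions.  All numbers here are invented for illustration.
    tps = 100.0                        # transactions per second
    fast_frac, fast_s = 0.90, 0.050    # 90% finish in 50ms
    slow_frac, slow_s = 0.10, 60.0     # 10% run for 60s

    # Little's law: average number in progress = arrival rate * duration.
    busy_fast = tps * fast_frac * fast_s
    busy_slow = tps * slow_frac * slow_s

    print("fast transactions in flight: %.1f" % busy_fast)    # ~4.5
    print("slow transactions in flight: %.1f" % busy_slow)    # ~600
    print("pool floor: ~%d" % (busy_fast + busy_slow))        # ~604

Anything much smaller than that and the slow transactions eventually
occupy every slot, which is exactly the starvation problem above.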

Sure, we can say, well, the fine print tells you that 2*CPUs+disks is
not REALLY the formula you should use, but it's just so far off what I
see in the field that I have a hard time thinking it's really helping
people to give them that as a starting point.
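
Just to spell out how far apart those numbers are (again, invented
machine sizes, not real customer data):

    # The formula under discussion, applied to a few machine sizes.
    def sweet_spot(cpus, disks):
        return 2 * cpus + disks

    print(sweet_spot(1, 1))    # single core, single disk -> 3
    print(sweet_spot(8, 4))    # small server             -> 20
    print(sweet_spot(32, 16))  # larger server            -> 80
    # ...versus pools of several hundred connections (e.g. the 800 above)
    # once transaction-level pooling and long transactions are in play.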

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


