Thread: Advice/guideline on increasing shared_buffers and kernel parameters

Advice/guideline on increasing shared_buffers and kernel parameters

From: "Gnanakumar"

Hi,

Our Production server has 35 GB of physical RAM.  Since the server
has lots of RAM, we want to really make use of it.  We've already configured
"max_connections" to 1000 and "shared_buffers" to 1536 MB, but when we tried
to increase only "shared_buffers" to 3072 MB (keeping "max_connections" as it
is), PostgreSQL failed to start with the following error:

    FATAL:  could not create shared memory segment: Invalid argument
    DETAIL:  Failed system call was shmget(key=5432001,
             size=3307192320, 03600).

Keeping max_connections at 1000, how do I "best" tune/set up the
memory-related parameters (including the Linux kernel parameters SHMMAX
and SHMALL)?

Experts' insights/pointers on this are really appreciated.

The current settings on our server are given below:
-- SHMMAX & SHMALL --
# cat /proc/sys/kernel/shmall
2097152
# cat /proc/sys/kernel/shmmax
2147483648
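
As a quick check (numbers copied from the error message and from /proc
above), the segment PostgreSQL asked for is already larger than our current
shmmax, which I assume is why shmget is rejected:

    requested segment size (from the error)  = 3307192320 bytes (~3.1 GB)
    current kernel.shmmax                    = 2147483648 bytes (2.0 GB)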

-- OS & Kernel --
OS: CentOS release 5.2
Arch: 64-bit
Kernel: 2.6.18

-- PostgreSQL conf --
shared_buffers = 1536MB
max_connections = 1000

We're currently running PostgreSQL v8.2.22.

Regards,
Gnanam



Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: Jan Lentfer

Hi Gnanam,


On Tue, 8 May 2012 12:22:58 +0530, Gnanakumar wrote:
> Our Production server has 35 GB of physical RAM.  Since the server
> has lots of RAM, we want to really make use of it.  We've already
> configured "max_connections" to 1000 and "shared_buffers" to 1536 MB,
> but when we tried to increase only "shared_buffers" to 3072 MB
> (keeping "max_connections" as it is), PostgreSQL failed to start with
> the following error:
>
>     FATAL:  could not create shared memory segment: Invalid argument
>     DETAIL:  Failed system call was shmget(key=5432001,
>              size=3307192320, 03600).
>
> Keeping max_connections at 1000, how do I "best" tune/set up the
> memory-related parameters (including the Linux kernel parameters
> SHMMAX and SHMALL)?


Did you read
http://www.postgresql.org/docs/8.2/static/kernel-resources.html ?

If it is a dedicated DB server, the rule of thumb is usually to use 25% of
RAM for shared_buffers, but no more than 8 GB unless proper benchmarking
has shown a benefit above 8 GB. I am not sure 8 GB of shared_buffers is
suitable for 8.2 at all, though. 8.2 is EOL, btw.
Also set effective_cache_size to 50% of RAM.

Depending on your workload you might also tune work_mem,
maintenance_work_mem, etc.
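
As a very rough sketch only -- these are just the rules of thumb above
applied to a dedicated 35 GB server, not tested values, and the work_mem /
maintenance_work_mem numbers are placeholders to adjust for your own
workload (the 8.2 caveat above still applies):

    shared_buffers = 8GB            # ~25% of RAM, capped at 8GB
    effective_cache_size = 17GB     # ~50% of RAM; a planner hint, not an allocation
    work_mem = 8MB                  # per sort/hash node per connection; keep modest with 1000 connections
    maintenance_work_mem = 512MB    # for VACUUM, CREATE INDEX, etc.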

Also, do you really NEED 1000 concurrent sessions? If you really do, it
might be worth looking into a connection pooler.

Here is quite a nice guide:

http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server

hth

Jan


--
professional: http://www.oscar-consult.de
private: http://neslonek.homeunix.org/drupal/


Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Gnanakumar"

> Did you read
> http://www.postgresql.org/docs/8.2/static/kernel-resources.html ?

Yes, I read it.  But I'm not able to find the correct way to increase the
Linux kernel parameters.

> If it is a dedicated DB server, the rule of thumb is usually to use
> 25% of RAM for shared_buffers, but no more than 8 GB unless proper
> benchmarking has shown a benefit above 8 GB. I am not sure 8 GB of
> shared_buffers is suitable for 8.2 at all, though. 8.2 is EOL, btw.
> Also set effective_cache_size to 50% of RAM.

As you can see, in my case, I'm setting only 3 GB (3072 MB), which is
actually below 8 GB.  So, I need to increase kernel parameters in this
case. Any ideas/insights?

> Also, do you really NEED 1000 concurrent sessions? If you really do,
> it might be worth looking into a connection pooler.

Yes, our web-based application has crossed more than 500 concurrent users.
Hence we've already upgraded the RAM, and now we want to increase the
max_connections parameter too.  Yes, we're already using pgpool-II v3.1.1
for connection pooling.



Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: Jan Lentfer

On Tue, 8 May 2012 14:56:52 +0530, Gnanakumar wrote:

> As you can see, in my case, I'm setting only 3 GB (3072 MB), which is
> actually below 8 GB.  So, I need to increase kernel parameters in this
> case. Any ideas/insights?

I am not using CentOS (or Linux at all), but generally you would

edit /etc/sysctl.conf

and then reload it with something like "sysctl -p" (or your distribution's
sysctl init script).


Also, some quick googling brought this up:
http://grokbase.com/t/centos/centos/11a40897q1/centos-6-increase-shared-memory-limits-permanently

Judging from your error, you should set shmmax to something in the range
of 3670016000.
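
A minimal sketch of what that would usually look like in /etc/sysctl.conf
(illustrative numbers only, sized for the ~3.3 GB segment in your error and
assuming the common 4 kB page size -- check with "getconf PAGE_SIZE"):

    # /etc/sysctl.conf -- size these for your own server
    kernel.shmmax = 3670016000    # largest single shared memory segment, in bytes
    kernel.shmall = 896000        # total shared memory, in pages (3670016000 / 4096)

    # load the new values without rebooting:
    sysctl -p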


Jan

--
professional: http://www.oscar-consult.de
private: http://neslonek.homeunix.org/drupal/


Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Kevin Grittner"

"Gnanakumar" <gnanam@zoniac.com> wrote:

> our web-based application has crossed more than 500 concurrent
> users.  Hence we've already upgraded RAM and now we want to
> upgrade max connection parameter too.  Yes, we're already using
> pgpool-II v3.1.1 for connection pooling.

The main point of using a connection pooler is to funnel a large
number of client connections into a small number of database
connections.  We get very good performance dealing with
thousands of concurrent users with a pool of 35 connections to the
database.  We originally had a larger pool, but contention was
reducing performance, and we found that throughput and latency both
improved with a smaller pool of database connections.

If you want to handle more users than you can currently support, you
probably need to use fewer database connections.

-Kevin

Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Gnanakumar"

> We get very good performance dealing with
> thousands of concurrent users with a pool of 35 connections to the
> database.
>
> If you want to handle more users than you can currently support, you
> probably need to use fewer database connections.

First, please excuse me that I'm not able to understand this particular
point clearly.  How can reducing the connections in the connection pooler,
i.e. using fewer database connections, support a larger number of
concurrent incoming connection requests?  If this is so critical to
revisit, then I may also have to convince/justify it to my peers before
making this change on the Production server.  Can you throw some light on
this subject?

Thanks for bringing this idea to my notice.



Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Kevin Grittner"

"Gnanakumar" <gnanam@zoniac.com> wrote:
>> We get very good performance dealing with thousands of concurrent
>> users with a pool of 35 connections to the database.
>>
>> If you want to handle more users than you can currently support,
>> you probably need to use fewer database connections.
>
> First, please excuse me that I'm not able to understand this
> particular point clearly.  How can reducing the connections in the
> connection pooler, i.e. using fewer database connections, support a
> larger number of concurrent incoming connection requests?  If this is
> so critical to revisit, then I may also have to convince/justify it
> to my peers before making this change on the Production server.  Can
> you throw some light on this subject?
>
> Thanks for bringing this idea to my notice.

There have been numerous discussions of this on the lists, so you
can probably find a more in-depth discussion of the topic if you
search the archives, and this may motivate me to put together a Wiki
page on the topic, but here's the general concept.

A database server only has so many resources, and if you don't have
enough connections active to use all of them, your throughput will
generally improve by using more connections.  Once all of the
resources are in use, you won't push any more through by having more
connections competing for the resources.  In fact, throughput starts
to fall off due to the overhead from that contention.  If you look
at any graph of PostgreSQL performance with number of connections on
the x axis and tps on the y axis (with nothing else changing), you
will see performance climb as connections rise until you hit
saturation, and then you have a "knee" after which performance falls
off.  A lot of work has been done for version 9.3 to push that knee
to the right and make the fall-off more gradual, but the issue is
intrinsic -- without a built-in connection pool or at least an
admission control policy, the knee will always be there.

Now, this decision not to include a connection pooler inside the
PostgreSQL server itself is not capricious and arbitrary.  In many
cases you will get better performance if the connection pooler is
running on a separate machine.  In even more cases (at least in my
experience) you can get improved functionality by incorporating a
connection pool into client-side software.  Many frameworks,
including the ones we use at Wisconsin Courts, do the pooling in a
Java process running on the same server as the database server (to
minimize latency effects from the database protocol) and make
high-level requests to the Java process to run a certain function
with a given set of parameters as a single database transaction.
This ensures that network latency or connection failures can't cause
a transaction to hang while waiting for something from the network,
and provides a simple way to retry any database transaction which
rolls back with a serialization failure (SQLSTATE 40001 or 40P01).

Since a pooler built in to the database engine would be inferior
(for the above reasons), the community has decided not to go that
route.

I know I won't be able to remember *all* of the reasons that
performance *falls off* after you reach the "knee" rather than just
staying level, but I'll list the ones which come to mind at the
moment.  If anyone wants to add to the list, feel free to reply, or
look for a Wiki page to appear this week and add them there.

 - Context switches.  The processor is interrupted from working on
one query and has to switch to another, which involves saving state
and restoring state.  While the core is busy swapping states it is
not doing any useful work on any query.

 - Cache line contention.  One query is likely to be working on a
particular area of RAM, and the query taking its place is likely to
be working on a different area; causing data cached on the CPU chip
to be discarded, only to need to be reloaded to continue the other
query.  Besides that the various processes will be grabbing control
of cache lines from each other, causing stalls.  (Humorous note, in
one oprofile run of a heavily contended load, 10% of CPU time was
attributed to a 1-byte noop; analysis showed that it was because it
needed to wait on a cache line for the following machine code
operation.)

 - Lock contention.  This happens at various levels: spinlocks, LW
locks, and all the locks that show up in pg_locks.  As more
processes compete for the spinlocks (which protect LW locks
acquisition and release, which in turn protect the heavyweight and
predicate lock acquisition and release) they account for a high
percentage of CPU time used.

 - RAM usage.  The work_mem setting can have a big impact on
performance.  If it is too small, hash tables and sorts spill to
disk, bitmap heap scans become "lossy", requiring more work on each
page access, etc.  So you want it to be big.  But work_mem RAM can
be allocated for each node of a query on each connection, all at the
same time.  So a big work_mem with a large number of connections can
cause a lot of the OS cache to be periodically discarded, forcing
more accesses to disk; or it could even put the system into
swapping.  So the more connections you have, the more you need to
make a choice between slow plans and trashing cache/swapping.  (See
the rough arithmetic after this list.)

 - Disk access.  If you *do* need to go to disk for random access, a
large number of connections can tend to force more tables and
indexes to be accessed at the same time, causing heavier seeking all
over the disk.

 - General scaling.  Some internal structures allocated based on
max_connections scale at O(N^2) or O(N*log(N)).  Some types of
overhead which are negligible at a lower number of connections can
become significant with a large number of connections.
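
To put a rough (and entirely hypothetical) number on the RAM usage point
above: with work_mem = 64MB, 1000 connections, and an average of two
sort/hash nodes active per query, the theoretical worst case is

    64 MB x 1000 connections x 2 nodes = 128 GB of work memory

which is far more than the 35 GB of RAM on this server.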

A formula which has held up pretty well across a lot of benchmarks
for years is that for optimal throughput the number of active
connections should be somewhere near ((core_count * 2) +
effective_spindle_count).  Core count should not include HT threads,
even if hyperthreading is enabled.  Effective spindle count is zero
if the active data set is fully cached, and approaches the actual
number of spindles as the cache hit rate falls.  Benchmarks of WIP
for version 9.3 suggest that this formula will need adjustment on
that release.  I haven't looked at how well the formula works with
SSDs.  In any event, I would recommend using this as a starting
point for a connection pool size, and trying incremental adjustments
with your actual workload to find the actual "sweet spot" for your
hardware and workload.
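
For example, with made-up numbers: a dedicated server with 8 physical
cores whose active data set is fully cached would start at a pool of

    (8 cores * 2) + 0 spindles = 16 database connections

while the same box with a 10-spindle array and a poor cache hit rate would
start nearer (8 * 2) + 10 = 26, adjusting up or down from there by
benchmarking.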

-Kevin

Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Gnanakumar"

> There have been numerous discussions of this on the lists, so you
> can probably find a more in-depth discussion of the topic if you
> search the archives, and this may motivate me to put together a Wiki
> page on the topic, but here's the general concept.

I was really amazed to see such a great and detailed explanation of this
topic.  I *really appreciate* your effort and patience in writing it down
from your real-world experience.  I've already thought of converting this
into a document and keeping it handy so that I can refer back to it
whenever I need.

Thanks once again for that great explanation.



Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Gnanakumar"

> A formula which has held up pretty well across a lot of benchmarks
> for years is that for optimal throughput the number of active
> connections should be somewhere near
> ((core_count * 2) + effective_spindle_count).

Our entire Production application stack, including the database server, is
set up in the Amazon EC2 cloud environment.  So, in that case, how do I
find out the "effective_spindle_count"?  I know "core_count" can be
determined from the Amazon EC2 instance type.  Per Amazon EC2, EBS volumes
are reportedly a shared resource.



Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: "Kevin Grittner"

"Gnanakumar" <gnanam@zoniac.com> wrote:

> I've already thought of converting this into a document and keeping
> it handy so that I can refer back to it whenever I need.

I've put up a first cut at such a document as a Wiki page:

http://wiki.postgresql.org/wiki/Number_Of_Database_Connections

Everyone should feel free to improve upon it.  I'll probably add a
"thought experiment" I've used a few times which seems to help some
people understand the issue.

>> A formula which has held up pretty well across a lot of
>> benchmarks for years is that for optimal throughput the number of
>> active connections should be somewhere near
>> ((core_count * 2) + effective_spindle_count).
>
> Our entire Production application stack, including the database
> server, is set up in the Amazon EC2 cloud environment.  So, in that
> case, how do I find out the "effective_spindle_count"?  I know
> "core_count" can be determined from the Amazon EC2 instance type.
> Per Amazon EC2, EBS volumes are reportedly a shared resource.

I think you need to experiment with different pool sizes.  Please
post results and/or update the Wiki page.

-Kevin

Re: Advice/guideline on increasing shared_buffers and kernel parameters

From: Bruce Momjian

On Thu, May 10, 2012 at 12:41:17PM +0530, Gnanakumar wrote:
> > There have been numerous discussions of this on the lists, so you
> > can probably find a more in-depth discussion of the topic if you
> > search the archives, and this may motivate me to put together a Wiki
> > page on the topic, but here's the general concept.
>
> I was really amazed to see such a great and detailed explanation of this
> topic.  I *really appreciate* your effort and patience in writing it down
> from your real-world experience.  I've already thought of converting this
> into a document and keeping it handy so that I can refer back to it
> whenever I need.
>
> Thanks once again for that great explanation.

Agreed.  That was pretty amazing!

--
  Bruce Momjian  <bruce@momjian.us>        http://momjian.us
  EnterpriseDB                             http://enterprisedb.com

  + It's impossible for everything to be true. +