Thread: how big shmmax is good for Postgres...

how big shmmax is good for Postgres...

From
Jessica Richard
Date:
On a Linux system,  if the total memory is 4G and the shmmax is set to 4G, I know it is bad, but how bad can it be? Just trying to understand the impact the "shmmax" parameter can have  on Postgres  and the entire system after Postgres comes up on this number.

What is the reasonable setting for shmmax on a 4G total machine?

Thanks a lot,
Jessica

Re: how big shmmax is good for Postgres...

From
Bill Moran
Date:
In response to Jessica Richard <rjessil@yahoo.com>:

> On a Linux system,  if the total memory is 4G and the shmmax is set to 4G, I know it is bad, but how bad can it be?
> Just trying to understand the impact the "shmmax" parameter can have on Postgres and the entire system after Postgres
> comes up on this number.

It's not bad by definition.  shmmax is a cap on the max that can be used.
Just because you set it to 4G doesn't mean any application is going to
use all of that.  With PostgreSQL, the maximum amount of shared memory it
will allocate is governed by the shared_buffers setting in the
postgresql.conf.
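
To make the relationship between the two settings concrete, here is a minimal sketch that checks whether a given shared_buffers value would fit under the kernel's shmmax. The helper names (parse_size, fits_under_shmmax, read_shmmax) and the 10% overhead figure are assumptions made for illustration; PostgreSQL's shared memory segment holds more than just shared_buffers, so the real requirement is somewhat larger than the setting itself, and the parser only accepts values with an explicit kB/MB/GB unit.

```python
import re

# Memory units accepted by this sketch (a subset of what postgresql.conf allows).
_UNITS = {"kB": 1024, "MB": 1024**2, "GB": 1024**3}


def parse_size(text):
    """Convert a size such as '512MB' to bytes. A unit is required here."""
    m = re.fullmatch(r"\s*(\d+)\s*(kB|MB|GB)\s*", text)
    if not m:
        raise ValueError(f"unrecognised size: {text!r}")
    return int(m.group(1)) * _UNITS[m.group(2)]


def fits_under_shmmax(shared_buffers, shmmax_bytes, overhead=0.10):
    """True if shared_buffers plus an ASSUMED overhead stays under shmmax.

    The 10% overhead is a guess standing in for the other shared
    structures PostgreSQL allocates; it is not a documented figure.
    """
    needed = parse_size(shared_buffers) * (1 + overhead)
    return needed <= shmmax_bytes


def read_shmmax(path="/proc/sys/kernel/shmmax"):
    """Read the current shmmax limit in bytes on Linux."""
    with open(path) as f:
        return int(f.read().strip())
```

On a Linux box, fits_under_shmmax("512MB", read_shmmax()) would tell you whether the server is likely to start with that setting.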

It _is_ a good idea to set shmmax to a reasonable size to prevent
a misbehaving application from eating up all the memory on a system,
but I've yet to see PostgreSQL misbehave in this manner.  Perhaps I'm
too trusting.

> What is the reasonable setting for shmmax on a 4G total machine?

If you mean what's a reasonable setting for shared_buffers, conventional
wisdom says to start with 25% of the available RAM and increase it or
decrease it if you discover your workload benefits from more or less.
Here "available RAM" means the RAM left free once all other applications
are running, which will be 4G if this machine only runs PostgreSQL, but
could be less if it runs other things like a web server.
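
The 25% rule of thumb is simple arithmetic, sketched below. The function name and the other_apps parameter are hypothetical, and the result is only a starting point to tune from, not a recommendation.

```python
GIB = 1024**3


def starting_shared_buffers(total_ram, other_apps=0, fraction=0.25):
    """Return a starting shared_buffers size in bytes.

    total_ram  -- physical RAM in bytes
    other_apps -- RAM (bytes) assumed to be used by other applications
    fraction   -- share of the remaining RAM to start with (25% by convention)
    """
    available = total_ram - other_apps
    if available <= 0:
        raise ValueError("no RAM left for PostgreSQL")
    return int(available * fraction)
```

For the 4G machine in question, starting_shared_buffers(4 * GIB) gives 1 GiB; if a web server takes 1 GiB, the starting point drops to 768 MiB.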

--
Bill Moran
Collaborative Fusion Inc.
http://people.collaborativefusion.com/~wmoran/

wmoran@collaborativefusion.com
Phone: 412-422-3463x4023

Re: how big shmmax is good for Postgres...

From
"Scott Marlowe"
Date:
On Thu, Jul 10, 2008 at 4:53 AM, Jessica Richard <rjessil@yahoo.com> wrote:
> On a Linux system,  if the total memory is 4G and the shmmax is set to 4G, I
> know it is bad, but how bad can it be? Just trying to understand the impact
> the "shmmax" parameter can have  on Postgres  and the entire system after
> Postgres comes up on this number.

There are two settings, shmmax for the OS, and shared_buffers for
pgsql.  If the OS has a max setting of 4G but pgsql is set to use
512Meg then you'd be safe.

Now, assuming that both the OS and pgsql are set to 4G, and you're
approaching that level of usage, the OS will start swapping out to
make room for the shared memory.  The machine will likely be starved
of memory, and will go into what's often called a swap storm where it
spends all its time swapping stuff in and out and doing little else.

> What is the reasonable setting for shmmax on a 4G total machine?

Keep in mind that postgresql's shared buffers are just one level of
caching / buffering that's going on.  The OS caches file access, and
sometimes the RAID controller and / or hard drive.

Generally the nominal setting is 25% of memory, but that's variable.
If the working set of your data will fit in 10% then there's no need
for setting shared_buffers to 25%...

Re: how big shmmax is good for Postgres...

From
"Scott Marlowe"
Date:
I just wanted to add to my previous post that shared_buffers generally
has a performance envelope of quickly increasing performance as you
first increase shared_buffers, then a smaller performance step with each
further increase.  Once all of the working set of your data
fits, the return starts to fall off quickly.  Assuming you were on a
machine with infinite memory and there were no penalties for a large
shared_buffers, then you would go until you were comfortably in the
level area.  If your working set were 1G on a dedicated machine with
64G, you could assume that memory was functionally unlimited.
Somewhere around 1.1 gigs of shared_buffers you would stop seeing any
performance increase.
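
The plateau can be illustrated with a deliberately crude model. This sketch assumes accesses are spread uniformly over the working set, so it only reproduces the "level area" once the working set fits; it does not capture the steep early gains that real, skewed workloads show. The function name is hypothetical.

```python
def hit_fraction(buffers_mb, working_set_mb):
    """Toy model: fraction of page requests served from shared_buffers,
    ASSUMING uniform access over the working set."""
    if working_set_mb <= 0:
        raise ValueError("working set must be positive")
    return min(1.0, buffers_mb / working_set_mb)


# With a 1 GB working set, going past roughly 1 GB buys nothing more.
for size_mb in (256, 512, 1024, 1126, 4096):
    print(size_mb, hit_fraction(size_mb, 1024))
```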

In real life, maintaining a cache has costs too.  The kernels of most
OSes now cache data very well, and they do it very well for large
chunks of data.  So on a machine that big, the OS would be caching the
whole dataset as well.

What we're interested in is the working set size.  If you typically
call up a couple k of data, bang it around some and commit, then never
hit that section again for hours, and all your calls do that, and the
access area is evenly spread across the database, then your working
set is the whole thing.  This is a common access pattern when dealing
with transactional systems.

If you commonly have 100 transactions doing that at once, then you
multiply how much memory they use times 100 to get total buffers in use,
and the rest is likely NEVER going to get used.

In these systems, what seems like a bad idea, lowering
shared_buffers, might be the exact right call.

For session servers and large transactional systems, it's often best
to let the OS do most of the caching of the data, and have
enough shared buffers to handle 2-10 times the in-memory data set
size.  This will result in a buffer size of a few hundred megabytes.
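
The sizing arithmetic in the last few paragraphs (memory per transaction, times concurrent transactions, times a 2-10x margin) can be sketched as follows. The function name and the per-transaction figure in the usage note are hypothetical; measure your own workload before trusting any number.

```python
MB = 1024**2


def buffer_range(concurrent_txns, bytes_per_txn, low=2, high=10):
    """Return (min, max) shared buffer sizes in bytes.

    in_use is the buffer space touched at once: memory per transaction
    multiplied by the number of concurrent transactions. The range is
    2-10 times that figure, following the rule of thumb above.
    """
    in_use = concurrent_txns * bytes_per_txn
    return in_use * low, in_use * high
```

For example, 100 concurrent transactions each touching an assumed 512 kB gives buffer_range(100, 512 * 1024), i.e. 100 MB to 500 MB, which is in the "few hundred megabytes" range.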

The advantage here is that the OS doesn't have to spend a lot of time
maintaining a large buffer pool and checkpoints are cheaper.  With
spare CPU the background writer can use spare I/O cycles to write out
the smaller number of dirty pages in shared_buffers and the system runs
faster.

Same is true of session servers.  If a DB is just used for tracking
logged in users, it only needs a tiny amount of shared_buffers.

Conversely, you need a large shared_buffers setting when you
have something like a large social networking site.  A LOT of people
updating a large data set at the same time likely need way more
shared_buffers to run well.  A user might be inputting data for several
minutes or even hours.  The same pages are getting hit over and over
too.  For this kind of app, you need as much memory as you can afford
to throw at the problem, and a semi-fast large RAID array.  A large
cache means your RAID controller / array only has to write, on
average, as fast as the database commits it.

When you reach the point where shared_buffers is larger than 1/2
your memory you're now using postgresql for caching more so than the
kernel.  As your shared_buffers goes past 75% of available memory you
now have three times as much cache under postgresql's control as
under the OS's control.  This would mean you've already tested this
and found that postgresql's caching works better for you than the OS's
caching does.  Haven't seen that happen a lot.  Maybe some large
transactional systems would work well that way.
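
The "three times as much" figure follows from a simple ratio. This sketch assumes, as the paragraph above implicitly does, that all memory not given to shared_buffers is available to the OS cache; in practice the kernel and other processes take their share too. The function name is hypothetical.

```python
def cache_ratio(shared_buffers_fraction):
    """Ratio of cache under PostgreSQL's control to cache under the OS's
    control, ASSUMING everything outside shared_buffers is OS cache."""
    f = shared_buffers_fraction
    if not 0 < f < 1:
        raise ValueError("fraction must be strictly between 0 and 1")
    return f / (1 - f)
```

cache_ratio(0.5) is 1.0 (an even split) and cache_ratio(0.75) is 3.0, while the conventional 25% setting leaves the OS with three times as much cache as PostgreSQL.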

My point here, and I have one, is that larger shared_buffers only
makes sense if you can use them.  They can work against you in a few
different ways that aren't obvious up front: more expensive checkpoints,
a self-inflicted DoS due to a swap storm, using up memory that the kernel
might be better at using as cache, etc.

So the answer really is, do some realistic testing, with an eye
towards anomalous behavior.
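
One number worth watching during such testing is the buffer hit ratio, which can be derived from the blks_hit and blks_read columns of the pg_stat_database view. The sketch below only does the arithmetic; running the query is left to whatever client you use, and the constant and function names are hypothetical. Keep in mind that a block counted in blks_read missed shared_buffers but may still have been served from the OS cache, so a low ratio does not by itself mean disk I/O.

```python
# Query to run with any PostgreSQL client; both columns are cumulative counters.
HIT_RATIO_SQL = """
SELECT datname, blks_hit, blks_read
FROM pg_stat_database
WHERE datname = current_database();
"""


def hit_ratio(blks_hit, blks_read):
    """Fraction of block requests satisfied from shared_buffers.

    Returns None when no blocks have been requested yet.
    """
    total = blks_hit + blks_read
    return blks_hit / total if total else None
```

Comparing this ratio, along with throughput and checkpoint behavior, across a few shared_buffers settings gives a rough picture of where the returns level off.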

Re: how big shmmax is good for Postgres...

From
"Scott Marlowe"
Date:
Some corrections:

On Thu, Jul 10, 2008 at 6:11 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:

SNIP

> If you commonly have 100 transactions doing that at once, then you
> multiply how much memory they use times 100 to get total buffer >> SPACE << in use,
> and the rest is likely NEVER going to get used.
>
> In these systems, what seems like a bad idea, lowering
> shared_buffers, might be the exact right call.
>
> For session servers and large transactional systems, it's often best
> to let the OS do most of the caching of the data, and have
> enough shared buffers to handle 2-10 times the in-memory data set
> size.  This will result in a buffer size of a few hundred megabytes.
>
> The advantage here is that the (NOT OS) DATABASE doesn't have to spend a lot of time
> maintaining a large buffer pool and checkpoints are cheaper.
> The background writer can use spare >> CPU << and I/O cycles to write out
> the now smaller number of dirty pages in shared_buffers and the system runs
> faster.
>
> Conversely, you need a large shared_buffers setting when you
> have something like a large social networking site.  A LOT of people
> updating a large data set at the same time likely need way more
> shared_buffers to run well.  A user might be inputting data for several
> minutes or even hours.  The same pages are getting hit over and over
> too.  For this kind of app, you need as much memory as you can afford
> to throw at the problem, and a semi-fast large RAID array.  A large
>  >> RAID << cache means your RAID controller / array only has to write, on
> average, as fast as the database commits.

Just minor edits.  If there's anything obviously wrong someone please
let me know.