Thread: shared_buffers advice

shared_buffers advice

From
Paul McGarry
Date:
Hi there,

I'm after a little bit of advice on the shared_buffers setting (I have
read the various docs on/linked from the performance tuning wiki page,
some very helpful stuff there, so thanks to those people).

I am setting up a 64-bit Linux server running PostgreSQL 8.3; the
server has 64GB of memory and Postgres is the only major application
running on it. (This server is to go alongside some existing 8.3
servers; we will look at 8.4/9 migration later.)

I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
and the OS page cache interact. The general advice seems to be to
assign 1/4 of RAM to shared buffers.

I don't have a good knowledge of the internals, but I'm wondering if
this will effectively mean that roughly the same amount of RAM used
for the OS page cache will be redundantly caching something Postgres
is caching as well?

I.e., when Postgres reads something from disk it will go into both the
OS page cache and the PostgreSQL shared_buffers, and the OS page cache
copy is unlikely to be useful for anything.

If that is the case, what are the downsides to having less overlap
between the caches, i.e. heavily favouring one or the other, such as
allocating shared_buffers a much larger percentage (such as 90-95%
of expected 'free' memory)?

Paul

Re: shared_buffers advice

From
Dave Crooke
Date:
There seems to be a wide range of opinion on this .... I am new to PG and grew up on Oracle, where more SGA is always a good thing ... I know people who run Oracle on 2TB Superdomes with titanic SGA sizes to keep the whole DB in RAM. I'd be using a 40GB+ Oracle SGA on that box of yours.

A lot of the folks here say that there isn't much performance to be gained by giving PG's buffer cache the bulk of the RAM .... if you have the opportunity, it'd be interesting to test it both ways with a representative workload before going live. I for one would be very curious to see the results.
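
A rough sketch of what such an A/B test might look like, purely for
illustration; the database name, scale factor, client count and
transaction count below are assumptions, not recommendations:

    # build a throwaway test database at an assumed scale factor
    pgbench -i -s 100 pgbench_test

    # run the identical workload once per candidate shared_buffers
    # setting (server restarted between runs), then compare the TPS
    pgbench -c 16 -t 10000 pgbench_test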

One caveat: the PG child processes can collectively use a lot of transient RAM, e.g. for sorts and vacuuming, depending on config and workload. If this causes swapping, or even uses up enough memory to effectively eliminate the OS buffer cache, it's going to hurt performance.
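
For illustration only, a back-of-the-envelope sketch of that memory
budget in postgresql.conf terms; every value here is an assumed
example, not advice:

    # hypothetical split on a 64GB box -- all values illustrative
    shared_buffers = 16GB        # fixed, allocated up front
    work_mem = 32MB              # per sort/hash, per backend; dozens of
                                 # busy backends can transiently claim GBs
    maintenance_work_mem = 1GB   # per vacuum or index build
    # whatever is left after these transient peaks is what the OS
    # page cache actually gets to work with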

Cheers
Dave

On Wed, Mar 10, 2010 at 8:28 PM, Paul McGarry <paul@paulmcgarry.com> wrote:
Hi there,

I'm after a little bit of advice on the shared_buffers setting (I have
read the various docs on/linked from the performance tuning wiki page,
some very helpful stuff there, so thanks to those people).

I am setting up a 64-bit Linux server running PostgreSQL 8.3; the
server has 64GB of memory and Postgres is the only major application
running on it. (This server is to go alongside some existing 8.3
servers; we will look at 8.4/9 migration later.)

I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
and the OS page cache interact. The general advice seems to be to
assign 1/4 of RAM to shared buffers.

I don't have a good knowledge of the internals, but I'm wondering if
this will effectively mean that roughly the same amount of RAM used
for the OS page cache will be redundantly caching something Postgres
is caching as well?

I.e., when Postgres reads something from disk it will go into both the
OS page cache and the PostgreSQL shared_buffers, and the OS page cache
copy is unlikely to be useful for anything.

If that is the case, what are the downsides to having less overlap
between the caches, i.e. heavily favouring one or the other, such as
allocating shared_buffers a much larger percentage (such as 90-95%
of expected 'free' memory)?

Paul


Re: shared_buffers advice

From
Greg Smith
Date:
Dave Crooke wrote:
> There seems to be a wide range of opinion on this .... I am new to PG
> and grew up on Oracle, where more SGA is always a good thing ... I
> know people who run Oracle on 2TB Superdomes with titanic SGA sizes
> to keep the whole DB in RAM. I'd be using a 40GB+ Oracle SGA on that
> box of yours.

I wouldn't call it opinion so much as a series of anecdotes all
suggesting the same thing:  that you cannot translate SGA practice into
PostgreSQL and expect that to work the same way.  Some data points:

-An academic study at Duke suggested 40% of RAM was optimal for their
mixed workload, but that was a fairly small amount of RAM.
http://www.cs.duke.edu/~shivnath/papers/ituned.pdf

-Tests done by Jignesh Shah at Sun not too long ago put the point of
diminishing returns on a system with a bunch of RAM at 10GB, probably
due to buffer lock contention issues (details beyond that number aren't
in the slides; I'm recalling from memory of the talk itself):
http://blogs.sun.com/jkshah/entry/postgresql_east_2008_talk_best

-My warnings about downsides related to checkpoint issues with larger
buffer pools aren't opinion at all; they're facts based on limitations
in how Postgres does its checkpoints.  If we get something more like
Oracle's incremental checkpoint logic, this particular concern might go
away.

-Concerns about swapping, work_mem, etc. are all very real.  All of us
who have had the database server process killed by the Linux OOM killer
at least once know that's one OS where you absolutely cannot push this
too hard.  This is not unique to Postgres; the issue exists in
Oracle+SGA land as well:
http://lkml.indiana.edu/hypermail/linux/kernel/0103.3/0906.html

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: shared_buffers advice

From
"Pierre C"
Date:
> -My warnings about downsides related to checkpoint issues with larger
> buffer pools aren't opinion at all; they're facts based on limitations
> in how Postgres does its checkpoints.  If we get something more like
> Oracle's incremental checkpoint logic, this particular concern might go
> away.

Does PG issue checkpoint writes in "sorted" order?

I wonder about something, too: if your DB size is smaller than RAM, you
could in theory set shared_buffers to a size larger than your DB, provided
you still have enough free RAM left for work_mem and OS write management.
How does this interact with the logic that prevents seq scans from
hogging shared_buffers?

Re: shared_buffers advice

From
Nikolas Everett
Date:


On Tue, Mar 16, 2010 at 7:24 AM, Pierre C <lists@peufeu.com> wrote:
I wonder about something, too: if your DB size is smaller than RAM, you could in theory set shared_buffers to a size larger than your DB, provided you still have enough free RAM left for work_mem and OS write management. How does this interact with the logic that prevents seq scans from hogging shared_buffers?

I think the logic you are referring to is the clock sweep buffer accounting scheme.  That just makes sure that the most popular pages stay in the buffers.  If your entire DB fits in the buffer pool then it'll all get in there real fast.

Two things to consider though:
1.  The checkpoint issue still stands.
2.  You should really mess around with your cost estimates if this is
the case.  If you make random I/O cost the same as sequential I/O,
Postgres will prefer index scans over bitmap index scans and table
scans, which makes sense if everything is in memory (see the sketch
below).
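
A minimal sketch of that adjustment, assuming a database known to fit
entirely in RAM; the table and column names are made up for
illustration:

    -- tell the planner random and sequential page fetches cost the
    -- same, which is roughly true for a fully cached database
    SET seq_page_cost = 1.0;
    SET random_page_cost = 1.0;

    -- compare the chosen plan before and after the change
    EXPLAIN SELECT * FROM orders WHERE customer_id = 42;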

Re: shared_buffers advice

From
"Pierre C"
Date:
> I think the logic you are referring to is the clock sweep buffer
> accounting
> scheme.  That just makes sure that the most popular pages stay in the
> buffers.  If your entire db fits in the buffer pool then it'll all get in
> there real fast.


Actually, I meant that in the case of a seq scan, PG will try to use just
a few buffers (a ring) in shared_buffers instead of thrashing the whole
buffer cache. But if there was actually a lot of free space in
shared_buffers, do the pages stay, or do they not?

Re: shared_buffers advice

From
Tom Lane
Date:
"Pierre C" <lists@peufeu.com> writes:
> Does PG issue checkpoint writes in "sorted" order?

No.  IIRC, a patch for that was submitted, and rejected because no
significant performance improvement could be demonstrated.  We don't
have enough information about the actual on-disk layout to be very
intelligent about this, so it's better to just issue the writes and
let the OS sort them.

            regards, tom lane

Re: shared_buffers advice

From
Greg Stark
Date:
On Tue, Mar 16, 2010 at 1:48 PM, Pierre C <lists@peufeu.com> wrote:
> Actually, I meant that in the case of a seq scan, PG will try to use just a
> few buffers (a ring) in shared_buffers instead of thrashing the whole
> buffer cache. But if there was actually a lot of free space in
> shared_buffers, do the pages stay, or do they not?

They don't. The logic only kicks in if the table is expected to be >
1/4 of shared buffers though.

--
greg

Re: shared_buffers advice

From
Greg Stark
Date:
On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> "Pierre C" <lists@peufeu.com> writes:
>> Does PG issue checkpoint writes in "sorted" order?
>
> No.  IIRC, a patch for that was submitted, and rejected because no
> significant performance improvement could be demonstrated.  We don't
> have enough information about the actual on-disk layout to be very
> intelligent about this, so it's better to just issue the writes and
> let the OS sort them.

Keep in mind that postgres is issuing writes to the OS buffer cache.
It defers fsyncing the files as late as it can in the hopes that most
of those buffers will be written out by the OS before then. That gives
the OS a long time window in which to flush them out in whatever order
and whatever schedule is most convenient.

If the OS filesystem buffer cache is really small then that might not
work so well. It might be worth rerunning those benchmarks on a
machine with shared buffers taking up all of RAM.


--
greg

Re: shared_buffers advice

From
Greg Smith
Date:
Pierre C wrote:
> Actually, I meant that in the case of a seq scan, PG will try to use
> just a few buffers (a ring) in shared_buffers instead of thrashing the
> whole buffer cache. But if there was actually a lot of free space in
> shared_buffers, do the pages stay, or do they not?

Pages inserted into the ring buffer and later re-used for new data do
not stay behind even if there is room for them.  There's a potential
improvement in that code involving better management of the
situation where the buffer cache hasn't actually reached full capacity
yet, but as it's an unusual case it's hard to justify optimizing for.
Besides, the hope is that in this case the OS cache will end up caching
everything anyway until it has a reason to evict it.  So if you follow
the rest of the data suggesting you should not give all the memory to
PostgreSQL to manage, you end up with a reasonable solution to this
problem anyway.  Those pages will just live in the OS cache instead of
the database's, with only a few trickling in and staying behind each
time you do a sequential scan.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: shared_buffers advice

From
Greg Smith
Date:
Greg Stark wrote:
> On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
>> "Pierre C" <lists@peufeu.com> writes:
>>
>>> Does PG issue checkpoint writes in "sorted" order?
>>>
>> No.  IIRC, a patch for that was submitted, and rejected because no
>> significant performance improvement could be demonstrated.
> If the OS filesystem buffer cache is really small then that might not
> work so well. It might be worth rerunning those benchmarks on a
> machine with shared buffers taking up all of RAM.
>

Here's the original patch again:
http://archives.postgresql.org/message-id/20080415181742.6C97.52131E4D@oss.ntt.co.jp

I was the person who tried to reproduce the suggested 10% pgbench
speedup on a similar system and couldn't replicate any improvement.  I
never was sure what was going on to show such a difference on the
reference system used to develop the patch versus mine, since they were
pretty similar.  Possibly some positive interaction with LVM in the test
case I didn't have.  Maybe the actual reason sorting helped was a
limitation in the HP P400 controller used there that I wasn't running
into with the Areca card I used.  And the always popular "didn't account
fully for all pgbench run-to-run variation" possibility crossed my mind
too--that the original observed speedup wasn't caused by the patch but
by something else.

I did not go out of my way to find test conditions where the patch would
be more likely to help, like the situation you describe where
shared_buffers is really large relative to the OS cache.  Since the
patch complicates the checkpoint code and requires some working memory
to operate, it would have to be an unquestionable win using standard
practices before it was worth applying.  If it only helps in a situation
people are unlikely to use in the field, and is a net negative for
everyone else, it's still going to end up on the scrapheap of
interesting but rejected ideas at the end of the day.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: shared_buffers advice

From
Alvaro Herrera
Date:
Greg Stark escribió:
> On Tue, Mar 16, 2010 at 2:30 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > "Pierre C" <lists@peufeu.com> writes:
> >> Does PG issue checkpoint writes in "sorted" order?
> >
> > No.  IIRC, a patch for that was submitted, and rejected because no
> > significant performance improvement could be demonstrated.  We don't
> > have enough information about the actual on-disk layout to be very
> > intelligent about this, so it's better to just issue the writes and
> > let the OS sort them.
>
> Keep in mind that postgres is issuing writes to the OS buffer cache.
> It defers fsyncing the files as late as it can in the hopes that most
> of those buffers will be written out by the OS before then. That gives
> the OS a long time window in which to flush them out in whatever order
> and whatever schedule is most convenient.

Maybe it would make more sense to try to reorder the fsync calls
instead.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: shared_buffers advice

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Maybe it would make more sense to try to reorder the fsync calls
> instead.

Reorder to what, though?  You still have the problem that we don't know
much about the physical layout on-disk.

            regards, tom lane

Re: shared_buffers advice

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Maybe it would make more sense to try to reorder the fsync calls
> > instead.
>
> Reorder to what, though?  You still have the problem that we don't know
> much about the physical layout on-disk.

Well, to block numbers as a first step.

However, this reminds me that sometimes we take the block-at-a-time
extension policy too seriously.  We had a customer with a performance
problem because they were inserting lots of data into TOAST
tables, causing very frequent extensions.  I kept wondering whether an
allocation policy that allocated several new blocks at a time could be
useful (but I didn't try it).  This would also alleviate fragmentation,
thus helping the physical layout more closely match logical block
numbers.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: shared_buffers advice

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane escribió:
>> Reorder to what, though?  You still have the problem that we don't know
>> much about the physical layout on-disk.

> Well, to block numbers as a first step.

fsync is a file-based operation, and we know exactly zip about the
relative positions of different files on the disk.

> However, this reminds me that sometimes we take the block-at-a-time
> extension policy too seriously.

Yeah, that's a huge performance penalty in some circumstances.

> We had a customer with a performance
> problem because they were inserting lots of data into TOAST
> tables, causing very frequent extensions.  I kept wondering whether an
> allocation policy that allocated several new blocks at a time could be
> useful (but I didn't try it).  This would also alleviate fragmentation,
> thus helping the physical layout more closely match logical block
> numbers.

That's not going to do anything towards reducing the actual I/O volume.
Although I suppose it might be useful if it just cuts the number of
seeks.

            regards, tom lane

Re: shared_buffers advice

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane escribió:
> >> Reorder to what, though?  You still have the problem that we don't know
> >> much about the physical layout on-disk.
>
> > Well, to block numbers as a first step.
>
> fsync is a file-based operation, and we know exactly zip about the
> relative positions of different files on the disk.

Doh, right, I was thinking of the sync_file_range kind of API.


> > We had a customer with a performance
> > problem because they were inserting lots of data into TOAST
> > tables, causing very frequent extensions.  I kept wondering whether an
> > allocation policy that allocated several new blocks at a time could be
> > useful (but I didn't try it).  This would also alleviate fragmentation,
> > thus helping the physical layout more closely match logical block
> > numbers.
>
> That's not going to do anything towards reducing the actual I/O volume.
> Although I suppose it might be useful if it just cuts the number of
> seeks.

Oh, they had no problems with I/O volume.  It was relation extension
lock that was heavily contended for them.

--
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

Re: shared_buffers advice

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane escribió:
>> That's not going to do anything towards reducing the actual I/O volume.
>> Although I suppose it might be useful if it just cuts the number of
>> seeks.

> Oh, they had no problems with I/O volume.  It was relation extension
> lock that was heavily contended for them.

Really?  I guess that serialized all the I/O ... I'll bet if we got rid
of that locking somehow, they *would* have a problem with I/O volume.

            regards, tom lane

Re: shared_buffers advice

From
Alvaro Herrera
Date:
Tom Lane escribió:
> Alvaro Herrera <alvherre@commandprompt.com> writes:
> > Tom Lane escribió:
> >> That's not going to do anything towards reducing the actual I/O volume.
> >> Although I suppose it might be useful if it just cuts the number of
> >> seeks.
>
> > Oh, they had no problems with I/O volume.  It was relation extension
> > lock that was heavily contended for them.
>
> Really?  I guess that serialized all the I/O ... I'll bet if we got rid
> of that locking somehow, they *would* have a problem with I/O volume.

Well, that would solve the problem as far as I'm concerned and they'd
have to start talking to their storage provider ;-)

--
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

Re: shared_buffers advice

From
Greg Smith
Date:
Alvaro Herrera wrote:
> Maybe it would make more sense to try to reorder the fsync calls
> instead.
>

The pretty obvious left-behind idea from the 8.3 spread checkpoint
development was to similarly spread the fsync calls around.  Given that
we know, for example, Linux with ext3 is going to dump the whole
filesystem write cache out when the fsync call comes in, the way they're
currently scheduled has considerable potential for improvement.

Unfortunately, since the tuning on that is going to be very platform
dependent and require a lot of benchmarking work, I think we need a
performance farm up and running as a prerequisite to finishing that work
off.  The spread checkpoint stuff was a much more obvious improvement,
and that was hard enough to quantify usefully and test.

Returning to the idea of the sorted checkpoints patch as a simple
example, if it were possible to just push that patch to a test repo and
see how that changed typical throughput/latency against a
well-established history, it would be a lot easier to figure out whether
something like that is sensible to consider or not.  I'm not sure how to
make progress on similar ideas about tuning closer to the filesystem
level without having something automated that takes over the actual
benchmark running and data recording steps; it's just way too time
consuming to do those right now with every tool that's available for
PostgreSQL so far.  That's the problem I work on; there are easily a
half dozen good ideas for improvements floating around here where coding
time is dwarfed by required performance validation time.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Fwd: shared_buffers advice

From
VJK
Date:


See below:


On Wed, Mar 10, 2010 at 9:28 PM, Paul McGarry <paul@paulmcgarry.com> wrote:
Hi there,

I'm after a little bit of advice on the shared_buffers setting (I have
read the various docs on/linked from the performance tuning wiki page,
some very helpful stuff there, so thanks to those people).

I am setting up a 64-bit Linux server running PostgreSQL 8.3; the
server has 64GB of memory and Postgres is the only major application
running on it. (This server is to go alongside some existing 8.3
servers; we will look at 8.4/9 migration later.)

I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
and the OS page cache interact. The general advice seems to be to
assign 1/4 of RAM to shared buffers.

I don't have a good knowledge of the internals, but I'm wondering if
this will effectively mean that roughly the same amount of RAM used
for the OS page cache will be redundantly caching something Postgres
is caching as well?

I.e., when Postgres reads something from disk it will go into both the
OS page cache and the PostgreSQL shared_buffers, and the OS page cache
copy is unlikely to be useful for anything.

If that is the case, what are the downsides to having less overlap
between the caches, i.e. heavily favouring one or the other, such as
allocating shared_buffers a much larger percentage (such as 90-95%
of expected 'free' memory)?

PG apparently does not have an option to use direct I/O for reads, which some other databases do (the O_DIRECT mode).  Therefore, double-buffering on reads seems unavoidable.  Counterintuitively, it may be a good idea to just rely on OS buffering and keep shared_buffers rather small, say, 512MB.

VJ


 
Paul


Re: shared_buffers advice

From
Dimitri Fontaine
Date:
Greg Smith <greg@2ndquadrant.com> writes:
>  I'm not sure how to make progress on similar ideas about
> tuning closer to the filesystem level without having something automated
> that takes over the actual benchmark running and data recording steps; it's
> just way too time consuming to do those right now with every tool that's
> available for PostgreSQL so far.  That's the problem I work on; there are
> easily a half dozen good ideas for improvements floating around here where
> coding time is dwarfed by required performance validation time.

I still think the best tool around currently for this kind of testing is
tsung, but I've yet to have the time to put my money where my mouth is,
as they say. Still, I'd be happy to take some time and help you decide
whether it's the tool you want to base your performance testing suite on or not.

  http://tsung.erlang-projects.org/

Regards,
--
dim

Re: shared_buffers advice

From
Greg Smith
Date:
Dimitri Fontaine wrote:
> I still think the best tool around currently for this kind of testing is
> tsung

I am happy to say that for now, pgbench is the only actual testing tool
supported.  Done; now I don't need tsung.

However, that doesn't actually solve any of the problems I was talking
about, which is why I'm not even talking about that part.  We
need the glue to pull software releases, run whatever testing tool
is appropriate, and then save the run artifacts in some standardized
form so they can be referenced with associated build/configuration
information to track down a regression when it does show up.  Building
those boring bits is the real issue here; load testing tools are easy
to find because those are fun to work on.

And as a general commentary on the vision here, tsung will never fit
into this anyway because "something that can run on the buildfarm
machines with the software they already have installed" is the primary
target.  I don't see anything about tsung so interesting that it trumps
that priority, even though it is an interesting tool.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: shared_buffers advice

From
Dimitri Fontaine
Date:
Greg Smith <greg@2ndquadrant.com> writes:
> However, that doesn't actually solve any of the problems I was talking
> about, which is why I'm not even talking about that part.  We need the glue
> to pull software releases, run whatever testing tool is appropriate, and
> then save the run artifacts in some standardized form so they can be
> referenced with associated build/configuration information to track down a
> regression when it does show up.  Building those boring bits is the real
> issue here; load testing tools are easy to find because those are fun to
> work on.

Oh, OK. I misparsed the previous message. Tsung has a way to monitor
OS-level information, and I guess adding the build/configuration would
be... as easy as adding it to pgbench :)

> And as a general commentary on the vision here, tsung will never fit into
> this anyway because "something that can run on the buildfarm machines with
> the software they already have installed" is the primary target.  I don't
> see anything about tsung so interesting that it trumps that priority, even
> though it is an interesting tool.

I thought we might talk about a performance farm, which would be quite
different, if only because to sustain a high enough client load you
might need more than one injector machine targeting a given server at
once.

But if you target the buildfarm, introducing new dependencies does sound
like a problem (that I can't evaluate the importance of).

Regards,
--
dim

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/3/11 Paul McGarry <paul@paulmcgarry.com>:

> I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
> and the OS page cache interact. The general advice seems to be to
> assign 1/4 of RAM to shared buffers.
>
> I don't have a good knowledge of the internals, but I'm wondering if
> this will effectively mean that roughly the same amount of RAM used
> for the OS page cache will be redundantly caching something Postgres
> is caching as well?

I have a similar problem but I can't see an answer in this thread.

Our dedicated server has 16 GB RAM. Among other settings
shared_buffers is 2 GB, effective_cache_size is 12 GB.

Does shared_buffers duplicate the contents of the OS page cache? If so,
how do I know if 25% of RAM is the right value for me? Actually it would
not seem to be true - the less redundancy the better.

Another question - is there a tool or built-in statistic that tells
when/how often/how much a table is read from disk? I mean a physical
read, not a pull from the OS cache into shared_buffers.

--
Konrad Garus

Re: shared_buffers advice

From
Ben Chobot
Date:
On May 24, 2010, at 4:25 AM, Konrad Garus wrote:

> Does shared_buffers duplicate the contents of the OS page cache? If so,
> how do I know if 25% of RAM is the right value for me? Actually it would
> not seem to be true - the less redundancy the better.

You can look into the pg_buffercache contrib module.

> Another question - is there a tool or built-in statistic that tells
> when/how often/how much a table is read from disk? I mean a physical
> read, not a pull from the OS cache into shared_buffers.

Well, the pg_stat_* tables tell you how much logical IO is going on,
but Postgres has no way of knowing how effective the OS or disk
controller caches are.
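
As a starting point, one sketch of that kind of logical-I/O query;
keep in mind heap_blks_read counts blocks that missed shared_buffers
but may still have been served from the OS cache rather than the disk:

    -- tables generating the most reads that bypassed shared_buffers
    SELECT relname, heap_blks_read, heap_blks_hit
      FROM pg_statio_user_tables
     ORDER BY heap_blks_read DESC
     LIMIT 10;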

Re: shared_buffers advice

From
Merlin Moncure
Date:
On Wed, Mar 10, 2010 at 9:28 PM, Paul McGarry <paul@paulmcgarry.com> wrote:
> Hi there,
>
> I'm after a little bit of advice on the shared_buffers setting (I have
> read the various docs on/linked from the performance tuning wiki page,
> some very helpful stuff there, so thanks to those people).
>
> I am setting up a 64-bit Linux server running PostgreSQL 8.3; the
> server has 64GB of memory and Postgres is the only major application
> running on it. (This server is to go alongside some existing 8.3
> servers; we will look at 8.4/9 migration later.)
>
> I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
> and the OS page cache interact. The general advice seems to be to
> assign 1/4 of RAM to shared buffers.
>
> I don't have a good knowledge of the internals, but I'm wondering if
> this will effectively mean that roughly the same amount of RAM used
> for the OS page cache will be redundantly caching something Postgres
> is caching as well?
>
> I.e., when Postgres reads something from disk it will go into both the
> OS page cache and the PostgreSQL shared_buffers, and the OS page cache
> copy is unlikely to be useful for anything.
>
> If that is the case, what are the downsides to having less overlap
> between the caches, i.e. heavily favouring one or the other, such as
> allocating shared_buffers a much larger percentage (such as 90-95%
> of expected 'free' memory)?

I've personally heard tons of anecdotal evidence wrt the shared_buffers
setting.  There is a bit of benchmarking info suggesting you can eke out
marginal gains via the shared_buffers setting, but you have to take
(unfortunately) OS, hardware, filesystem and other factors all into
account.

Here is what I'm pretty confident about saying:
*) a page fault to disk is a much bigger deal than a fault from the PG
cache to the OS cache.

Many people assume that raising shared_buffers decreases the chance of
a disk fault.  It doesn't -- at least not in the simple way you would
think -- all modern OSes aggressively cache filesystem data already, so
we are simply layering over the OS cache.

If your database is really big, anything that reduces disk faults is a
win and anything that increases them is a loss.  TPS measurements from
pgbench are not as interesting to me as IOPS from the disk system.

*) shared_buffers effects are hard to detect in the single-user case.

A single 'non disk bound' large query will perform pretty much the same
regardless of how you set shared_buffers.  In other words, you will not
be able to easily measure the differences in the setting outside of a
real or simulated production workload.

*) shared_buffers is one of the _least_ important performance settings
in postgresql.conf

Many settings, like work_mem, planner tweaks, commit settings, and
autovacuum settings, can dramatically impact your workload performance
in spectacular ways, but tend to be 'case by case' specific.
shared_buffers affects _everything_, albeit in very subtle ways, so you
have to be careful.

*) I sometimes wonder if the OS should just manage everything.

We just said goodbye to the FSM (thank goodness for that!) -- what
about a fully OS-managed cache?  Goodbye SysV IPC?  Note my views
here are very simplistic -- I don't have anything close to a full
understanding of the cache machinery in the database.

merlin

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/5/24 Merlin Moncure <mmoncure@gmail.com>:

> *) a page fault to disk is a much bigger deal than a fault from the PG
> cache to the OS cache.

That was my impression. That's why I did not touch our 2/16 GB setting
right away. I guess that 2 more gigabytes in OS cache is better than 2
more (duplicated) gigabytes in PG shared_buffers. In our case 2 GB
shared_buffers appears to be enough to avoid thrashing between OS and
PG.

> *) shared_buffers is one of the _least_ important performance settings
> in postgresql.conf
>
> Many settings, like work_mem, planner tweaks, commit settings,
> autovacuum settings

Can you recommend any sources on these parameters, especially commit
settings and planner tweaks?


Thank you so much for the whole answer! Not only does it address the
immediate question, it also answers many of the unasked ones I had in
the back of my head. It's brief and gives a broad view over all the
performance concerns. It should be part of the documentation or the
first page of the performance wiki. Have you copied it from somewhere?

--
Konrad Garus

Re: shared_buffers advice

From
Merlin Moncure
Date:
On Tue, May 25, 2010 at 5:58 AM, Konrad Garus <konrad.garus@gmail.com> wrote:
> 2010/5/24 Merlin Moncure <mmoncure@gmail.com>:
>
>> *) a page fault to disk is a much bigger deal than a fault from the PG
>> cache to the OS cache.
>
> That was my impression. That's why I did not touch our 2/16 GB setting
> right away. I guess that 2 more gigabytes in OS cache is better than 2
> more (duplicated) gigabytes in PG shared_buffers. In our case 2 GB
> shared_buffers appears to be enough to avoid thrashing between OS and
> PG.
>
>> *) shared_buffers is one of the _least_ important performance settings
>> in postgresql.conf
>>
>> Many settings, like work_mem, planner tweaks, commit settings,
>> autovacuum settings
>
> Can you recommend any sources on these parameters, especially commit
> settings and planner tweaks?
>
>
> Thank you so much for the whole answer! Not only does it address the
> immediate question, it also answers many of the unasked ones I had in
> the back of my head. It's brief and gives a broad view over all the
> performance concerns. It should be part of the documentation or the
> first page of the performance wiki. Have you copied it from somewhere?

Thank you for your nice comments.  This was strictly a brain dump from
yours truly.  There is a fairly verbose guide on the wiki
(http://wiki.postgresql.org/wiki/Tuning_Your_PostgreSQL_Server).
There is a lot of good info there but it's missing a few things
(from_collapse_limit for example).

I would prefer to see the annotated performance-oriented .conf
settings written in terms of trade-offs (too low? X. too high? Y. what
to set in order to get Z?).  For example, did you know that if you crank
max_locks_per_transaction you also increase the duration of every
query that hits pg_locks -- well, now you do :-).
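
To sketch why (an illustration of the trade-off, not a benchmark): the
shared lock table that a monitoring query like the one below scans is
sized as max_locks_per_transaction * (max_connections +
max_prepared_transactions), so cranking the setting enlarges what
every such query has to walk:

    -- typical lock-monitoring query; its cost scales with the size of
    -- the shared lock table rather than the number of locks in use
    SELECT mode, count(*)
      FROM pg_locks
     GROUP BY mode;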

merlin

Re: shared_buffers advice

From
Cédric Villemain
Date:
2010/5/24 Konrad Garus <konrad.garus@gmail.com>:
> 2010/3/11 Paul McGarry <paul@paulmcgarry.com>:
>
>> I'm basically wondering how the PostgreSQL cache (i.e. shared_buffers)
>> and the OS page cache interact. The general advice seems to be to
>> assign 1/4 of RAM to shared buffers.
>>
>> I don't have a good knowledge of the internals, but I'm wondering if
>> this will effectively mean that roughly the same amount of RAM used
>> for the OS page cache will be redundantly caching something Postgres
>> is caching as well?
>
> I have a similar problem but I can't see an answer in this thread.
>
> Our dedicated server has 16 GB RAM. Among other settings
> shared_buffers is 2 GB, effective_cache_size is 12 GB.
>
> Does shared_buffers duplicate the contents of the OS page cache? If so,
> how do I know if 25% of RAM is the right value for me? Actually it would
> not seem to be true - the less redundancy the better.

When a block is requested for the first time (usually 8 kB from
Postgres, so in fact 2 blocks in the OS), you have 'duplicate' buffers.
But depending on your workload it is not so bad, because those 2
blocks should not be requested again for some time (they are in
PostgreSQL's shared buffers) and should be evicted by the OS in favor
of newly requested blocks.
Again it depends on your workload: if you have a case where you refresh
the shared buffers a lot, then you will have more blocks in the two
caches at the same time.

You can try the pgfincore extension to grab stats from the OS cache,
and/or patch PostgreSQL if you want real stats ;)
pg_buffercache is provided with PostgreSQL and delivers very useful information:
http://www.postgresql.org/docs/8.4/interactive/pgbuffercache.html
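
As one illustration, the sort of query that documentation suggests for
seeing which relations currently occupy the buffer cache:

    -- buffers held per relation in the current database
    SELECT c.relname, count(*) AS buffers
      FROM pg_buffercache b
      JOIN pg_class c
        ON b.relfilenode = c.relfilenode
       AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                                 WHERE datname = current_database()))
     GROUP BY c.relname
     ORDER BY buffers DESC
     LIMIT 10;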

>
> Another question - is there a tool or built-in statistic that tells
> when/how often/how much a table is read from disk? I mean a physical
> read, not a pull from the OS cache into shared_buffers.
>
> --
> Konrad Garus



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/5/26 Cédric Villemain <cedric.villemain.debian@gmail.com>:

> When a block is requested for the first time (usually 8 kB from
> Postgres, so in fact 2 blocks in the OS), you have 'duplicate' buffers.
> But depending on your workload it is not so bad, because those 2
> blocks should not be requested again for some time (they are in
> PostgreSQL's shared buffers) and should be evicted by the OS in favor
> of newly requested blocks.

Since the PG buffer cache is 4-8 times smaller, that would seem to be
extremely rare to me. And when PG requests a block, it also needs to
evict something from shared_buffers.

> You can try pgfincore extension to grab stats from OS cache and/or
> patch postgresql if you want real stats ;)

Thank you! It seems to be the tool I was looking for. It could help me
locate and troubleshoot the hogs in the page cache. I also find the
snapshot/restore function promising. Every morning our cache is cold
or filled with irrelevant data left by nightly batch jobs, severely
impacting performance. It seems to be exactly what this tool is for.

How does it work? How stable is it? Can we use it in production on a
daily basis?

> pg_buffercache is provided with PostgreSQL and delivers very useful information:
> http://www.postgresql.org/docs/8.4/interactive/pgbuffercache.html

Thank you. I already am using it. I've already found a few hogs with it.

--
Konrad Garus

Re: shared_buffers advice

From
Cédric Villemain
Date:
2010/5/27 Konrad Garus <konrad.garus@gmail.com>:
> 2010/5/26 Cédric Villemain <cedric.villemain.debian@gmail.com>:
>
>> When a block is requested for the first time (usually 8 kB from
>> Postgres, so in fact 2 blocks in the OS), you have 'duplicate' buffers.
>> But depending on your workload it is not so bad, because those 2
>> blocks should not be requested again for some time (they are in
>> PostgreSQL's shared buffers) and should be evicted by the OS in favor
>> of newly requested blocks.
>
> Since the PG buffer cache is 4-8 times smaller, that would seem to be
> extremely rare to me. And when PG requests a block, it also needs to
> evict something from shared_buffers.

Three very important things:
* PostgreSQL shared buffers are database-oriented.
* The OS page cache is *more* complex and will not evict the same
buffers as Postgres.
* The OS page cache can handle tens of GB, where Postgres usually sees
no performance gain above 10GB.

>
>> You can try the pgfincore extension to grab stats from the OS cache,
>> and/or patch PostgreSQL if you want real stats ;)
>
> Thank you! It seems to be the tool I was looking for. It could help me
> locate and troubleshoot the hogs in the page cache. I also find the
> snapshot/restore function promising. Every morning our cache is cold
> or filled with irrelevant data left by nightly batch jobs, severely
> impacting performance. It seems to be exactly what this tool is for.
>
> How does it work? How stable is it? Can we use it in production on a
> daily basis?

It works thanks to the mincore/posix_fadvise stuff: you need Linux.
It is stable enough in my own experiments. I did use it for debugging
purposes on production servers with success.
BUT:
* snapshot/restore is done via a flat file (one per segment or
table/index) and *it is not removed* when you drop a table.
* there may be corner cases not yet handled (like snapshotting a
database, changing things like dropping or truncating a table, then
restoring)

It needs some polish to be totally production-ready, but the job can be done.


>
>> pg_buffercache is provided with PostgreSQL and delivers very useful information:
>> http://www.postgresql.org/docs/8.4/interactive/pgbuffercache.html
>
> Thank you. I already am using it. I've already found a few hogs with it.
>
> --
> Konrad Garus
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:

> It works thanks to the mincore/posix_fadvise stuff: you need Linux.
> It is stable enough in my own experiments. I did use it for debugging
> purposes on production servers with success.

What impact does it have on performance?

Does it do anything (is there any interaction between it and PG/OS)
when it's not executing a command explicitly invoked by me?

--
Konrad Garus

Re: shared_buffers advice

From
Cédric Villemain
Date:
2010/5/27 Konrad Garus <konrad.garus@gmail.com>:
> 2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:
>
>> It works thanks to the mincore/posix_fadvise stuff: you need Linux.
>> It is stable enough in my own experiments. I did use it for debugging
>> purposes on production servers with success.
>
> What impact does it have on performance?

pgmincore() and pgmincore_snapshot() are both able to mmap up to 1GB.
I didn't measure a performance impact, but I don't have enough benchmarks/tests yet.

>
> Does it do anything (is there any interaction between it and PG/OS)
> when it's not executing a command explicitly invoked by me?

pgfincore does nothing until you call one of its functions.

Reducing the mmap window is feasible, and I had started something to use
effective_io_concurrency in order to improve prefetch (for restore),
but this part of the code is not yet finished.

>
> --
> Konrad Garus
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:

> pgmincore() and pgmincore_snapshot() are both able to mmap up to 1GB.

Does that mean they can occupy 1 GB of RAM? How does it relate to the
amount of page cache mapped by the OS?

--
Konrad Garus

Re: shared_buffers advice

From
Cédric Villemain
Date:
2010/5/27 Konrad Garus <konrad.garus@gmail.com>:
> 2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:
>
>> pgmincore() and pgmincore_snapshot() are both able to mmap up to 1GB.
>
> Does that mean they can occupy 1 GB of RAM? How does it relate to the
> amount of page cache mapped by the OS?

Well, that is the projection of the file into memory. Only a
projection, but the memory is still acquired. It is OK to rework this
part and project something like 128MB and loop. (In fact the code is
needed for 9.0 because a segment can be > 1GB; I didn't check what the
optimum projection size is yet.)
So yes to both of your questions :)

>
> --
> Konrad Garus
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:

> Well, that is the projection of the file into memory. Only a
> projection, but the memory is still acquired. It is OK to rework this
> part and project something like 128MB and loop. (In fact the code is
> needed for 9.0 because a segment can be > 1GB; I didn't check what the
> optimum projection size is yet.)
> So yes to both of your questions :)

So when I map 12 GB, this process will consume 1 GB of RAM, plus the
time needed to browse through the whole 12 GB buffer?

--
Konrad Garus

Re: shared_buffers advice

From
Cédric Villemain
Date:
2010/5/27 Konrad Garus <konrad.garus@gmail.com>:
> 2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:
>
>> Well, that is the projection of the file into memory. Only a
>> projection, but the memory is still acquired. It is OK to rework this
>> part and project something like 128MB and loop. (In fact the code is
>> needed for 9.0 because a segment can be > 1GB; I didn't check what the
>> optimum projection size is yet.)
>> So yes to both of your questions :)
>
> So when I map 12 GB, this process will consume 1 GB of RAM, plus the
> time needed to browse through the whole 12 GB buffer?

Exactly. And the time to browse depends on the number of blocks already
in core memory.
I am interested in test results and benchmarks if you are going to do some :)


>
> --
> Konrad Garus
>



--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: shared_buffers advice

From
Konrad Garus
Date:
2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:

> Exactly. And the time to browse depends on the number of blocks already
> in core memory.
> I am interested in test results and benchmarks if you are going to do some :)

I am still thinking about whether I want to do it on this prod machine.
Maybe on something less critical first (but still with a good amount
of memory mapped by page buffers).

What systems have you tested it on? Has it ever run on a few-gig system? :-)

--
Konrad Garus

Re: shared_buffers advice

From
Cédric Villemain
Date:
2010/5/28 Konrad Garus <konrad.garus@gmail.com>:
> 2010/5/27 Cédric Villemain <cedric.villemain.debian@gmail.com>:
>
>> Exactly. And the time to browse depends on the number of blocks already
>> in core memory.
>> I am interested in test results and benchmarks if you are going to do some :)
>
> I am still thinking about whether I want to do it on this prod machine.
> Maybe on something less critical first (but still with a good amount
> of memory mapped by page buffers).
>
> What systems have you tested it on? Has it ever run on a few-gig system? :-)

Databases up to 300GB, for the stats purpose.
The snapshot/restore was done for databases around 40-50GB, but with
only 16GB of RAM.

I really think some improvements are possible before using it in
production, even if it should work well as it is.
At least something to remove the orphan snapshot files (in case of a
dropped or truncated table), and probably improving the quality of the
code around the prefetch (better handling of
effective_io_concurrency... the prefetch is linear but block requests
are grouped).

If you are able to test/bench on a pre-production env, do it :)

--
Cédric Villemain               2ndQuadrant
http://2ndQuadrant.fr/     PostgreSQL : Expertise, Formation et Support

Re: shared_buffers advice

From
Greg Smith
Date:
Merlin Moncure wrote:
> I would prefer to see the annotated performance-oriented .conf
> settings written in terms of trade-offs (too low? X. too high? Y. what
> to set in order to get Z?).  For example, did you know that if you crank
> max_locks_per_transaction you also increase the duration of every
> query that hits pg_locks -- well, now you do :-).
>

You can't do this without providing more context and tools for people to
measure their systems.  At PGCon last week, I presented a talk
specifically about tuning shared_buffers and the checkpoint settings.
What's come out of my research there is that you can stare at the data
in pg_buffercache and pg_stat_bgwriter and classify systems, based on the
distribution of usage counts in their buffer cache, by how the background
writer copes with that.  The right style of tuning to apply depends on
whether someone has a high proportion of buffers with a usage count >=2.
A tuning guide that actually covered that in enough detail to be
an improvement over what is in "Tuning Your PostgreSQL Server" would
be overwhelmingly large, defeating the purpose of that document--providing
a fairly bite-sized guide.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: shared_buffers advice

From
Merlin Moncure
Date:
On Fri, May 28, 2010 at 2:57 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> I would prefer to see the annotated performance-oriented .conf
>> settings written in terms of trade-offs (too low? X. too high? Y. what
>> to set in order to get Z?).  For example, did you know that if you crank
>> max_locks_per_transaction you also increase the duration of every
>> query that hits pg_locks -- well, now you do :-).
>>
>
> You can't do this without providing more context and tools for people to
> measure their systems.  At PGCon last week, I presented a talk specifically
> about tuning shared_buffers and the checkpoint settings.  What's come out of
> my research there is that you can stare at the data in pg_buffercache and
> pg_stat_bgwriter and classify systems, based on the distribution of usage
> counts in their buffer cache, by how the background writer copes with that.
> The right style of tuning to apply depends on whether someone has a
> high proportion of buffers with a usage count >=2.  A tuning guide that
> actually covered that in enough detail to be an improvement over what is in
> "Tuning Your PostgreSQL Server" would be overwhelmingly large, defeating
> the purpose of that document--providing a fairly bite-sized guide.

Sure. IOW, a .conf guide should answer:
*) What are the symptoms of misconfigured shared_buffers (too low/too high)?
*) How do you confirm this?

Our documentation is completely unclear wrt these questions, other
than giving some vague advice on how to get 'good performance'.
I'm of the opinion (rightly or wrongly) that the prevailing opinions
on how to configure shared_buffers are based on special-case
benchmarking information or simply made up.  The dangers of setting it
too high are very real (especially on Linux) but this isn't mentioned;
contrast that with the care put into the fsync language.  This is in the
face of some prevailing myths (more shared_buffers = more cache =
faster) that have some grains of truth but aren't the whole story.  I
just helped out a friend who oversubscribed and blew up his Linux
box... OOM killer strikes again.

I'm not complaining here, mind you; I'd just like to filter out all the
anecdotal information and similar noise.  shared_buffers is a bit of a
bugaboo because it is fairly subtle in how it interacts with
production workloads and there is so little solid information out
there.  I would *love* to see some formal, verifiable tests showing >
20% improvements resulting from shared_buffers tweaks that could be
repeated on other hardware/OS installations.  Got any notes for your
talk? :-)

merlin

Re: shared_buffers advice

From
Greg Smith
Date:
Merlin Moncure wrote:
> I'm of the opinion (rightly or wrongly) that the prevailing opinions
> on how to configure shared_buffers are based on special-case
> benchmarking information or simply made up.

Well, you're wrong, but it's OK; we'll forgive you this time.  It's true
that a lot of the earlier advice here wasn't very well tested across
multiple systems.  I have a stack of data showing the anecdotal
guidelines are in the right ballpark now though, most of which is
protected by NDA.  If you look at the spreadsheet at
http://www.pgcon.org/2010/schedule/events/218.en.html you'll see three
examples I was able to liberate for public consumption, due to some
contributions by list regulars here (I'm working on a fourth right
now).  The first one has shared_buffers set at 8GB on a 96GB server, at
the upper limit of where it's useful, and the database is using every
bit of that more effectively than had it been given to the OS to
manage.  (I'm starting to get access to test hardware to investigate why
there's an upper-limit wall around 10GB for shared_buffers too.)  The
other two are smaller systems, and they don't benefit nearly as much
from giving the database memory given their workload.  Basically it
comes down to two things (a query sketch follows the list):

1) Are you getting a lot of buffers where the usage count is >=3?  If
not, you can probably reduce shared_buffers and see better performance.
This is not the case with the first system shown, but is true on the
second and third.

2) Is the average size of the checkpoints too large?  If so, you might
have to reduce shared_buffers in order to pull that down.  Might even
need to pull down the checkpoint parameters too.
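
A rough SQL sketch of both checks, using pg_buffercache and
pg_stat_bgwriter; the interpretation thresholds come from the
discussion above, not from any official guidance:

    -- distribution of buffer usage counts; many buffers at >=3
    -- suggests the database is using the memory it was given well
    SELECT usagecount, count(*) AS buffers
      FROM pg_buffercache
     GROUP BY usagecount
     ORDER BY usagecount;

    -- average buffers written per checkpoint since the stats reset
    SELECT buffers_checkpoint /
           nullif(checkpoints_timed + checkpoints_req, 0) AS per_checkpoint
      FROM pg_stat_bgwriter;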

Workloads that don't like the database to have RAM certainly exist, but
there are just as many that appreciate every bit of memory you dedicate
to it.

With all the tuning work I've been doing the last few months, the only
place I've found the standard guidelines (as embodied by pgtune and
the wiki pages) to be wrong is in regard to work_mem.  You have to be
much more careful with that than what pgtune in particular suggests.
The rest of the rules of thumb pgtune is based on and "Tuning Your
PostgreSQL Server" suggests are not bad.

Accomplishing a major advance over the current state of things really
needs some captures of real application load from production systems of
both major types that we can play back, to give something more realistic
than one of the boring benchmark loads.  Dimitri Fontaine is working on
some neat tools in that area, and now that we're working together more
closely I'm hoping we can push that work forward further.  That's the
real limiting factor here now: assembling repeatable load testing that
looks like an application rather than a benchmark.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us


Re: shared_buffers advice

From
Scott Marlowe
Date:
On Mon, May 24, 2010 at 12:25 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> *) shared_buffers is one of the _least_ important performance settings
> in postgresql.conf

Yes, and no.  It's usually REALLY helpful to make sure it's more than
8 or 24MB.  But it doesn't generally need to be huge to make a
difference.

Re: shared_buffers advice

From
Dave Crooke
Date:
If, like me, you came from the Oracle world, you may be tempted to throw a ton of RAM at this. Don't. PG does not like it.

On Fri, May 28, 2010 at 4:11 PM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
On Mon, May 24, 2010 at 12:25 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> *) shared_buffers is one of the _least_ important performance settings
> in postgresql.conf

Yes, and no.  It's usually REALLY helpful to make sure it's more than
8 or 24MB.  But it doesn't generally need to be huge to make a
difference.


Re: shared_buffers advice

From
Merlin Moncure
Date:
On Fri, May 28, 2010 at 5:02 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> I'm of the opinion (rightly or wrongly) that the prevailing opinions
>> on how to configure shared_buffers are based on special-case
>> benchmarking information or simply made up.
>
> Well, you're wrong, but it's OK; we'll forgive you this time.  It's true
> that a lot of the earlier advice here wasn't very well tested across
> multiple systems.  I have a stack of data showing the anecdotal
> guidelines are in the right ballpark now though, most of which is protected
> by NDA.  If you look at the spreadsheet at
> http://www.pgcon.org/2010/schedule/events/218.en.html you'll see three
> examples I was able to liberate for public consumption, due to some
> contributions by list regulars here (I'm working on a fourth right now).
> The first one has shared_buffers set at 8GB on a 96GB server, at the upper
> limit of where it's useful, and the database is using every bit of that more
> effectively than had it been given to the OS to manage.  (I'm starting to
> get access to test hardware to investigate why there's an upper-limit wall
> around 10GB for shared_buffers too.)  The other two are smaller systems, and
> they don't benefit nearly as much from giving the database memory given
> their workload.  Basically it comes down to two things:
>
> 1) Are you getting a lot of buffers where the usage count is >=3?  If not,
> you can probably reduce shared_buffers and see better performance.  This is
> not the case with the first system shown, but is true on the second and
> third.
>
> 2) Is the average size of the checkpoints too large?  If so, you might have
> to reduce shared_buffers in order to pull that down.  Might even need to
> pull down the checkpoint parameters too.
>
> Workloads that don't like the database to have RAM certainly exist, but
> there are just as many that appreciate every bit of memory you dedicate to
> it.
>
> With all the tuning work I've been doing the last few months, the only place
> I've found the standard guidelines (as embodied by pgtune and the wiki
> pages) to be wrong is in regard to work_mem.  You have to be much more
> careful with that than what pgtune in particular suggests.  The rest of the
> rules of thumb pgtune is based on and "Tuning Your PostgreSQL Server"
> suggests are not bad.
>
> Accomplishing a major advance over the current state of things really needs
> some captures of real application load from production systems of both
> major types that we can play back, to give something more realistic than one
> of the boring benchmark loads.  Dimitri Fontaine is working on some neat
> tools in that area, and now that we're working together more closely I'm
> hoping we can push that work forward further.  That's the real limiting
> factor here now: assembling repeatable load testing that looks like an
> application rather than a benchmark.

This is great information -- exactly the kind of research I'm talking
about.  btw I like being proved wrong! :-)  I need some time to
process this.

merlin