Thread: The shared buffers challenge

The shared buffers challenge

From
Merlin Moncure
Date:
Hello performers, I've long been unhappy with the standard advice
given for setting shared buffers.  This includes the stupendously
vague comments in the standard documentation, which suggest certain
settings in order to get 'good performance'.  Performance of what?
Connection negotiation speed?  Not that it's wrong necessarily, but
ISTM it's too much based on speculative or anecdotal information.  I'd like
to see the lore around this setting clarified, especially so we can
refine advice to: 'if you are seeing symptoms x,y,z set shared_buffers
from a to b to get symptom reduction of k'.  I've never seen a
database blow up from setting them too low, but over the years I've
helped several people with bad i/o situations or outright OOM
conditions from setting them too high.

My general understanding of shared_buffers is that they are a little
bit faster than filesystem buffering (everything these days is
ultimately based on mmap AIUI, so there's no reason to suspect
anything else).  Where they are most helpful is in masking i/o when a
page gets dirtied more than once before it's written out to the heap,
but seeing any benefit from that at all is going to be very workload
dependent.  There are also downsides to using database-managed buffers
instead of the OS cache, and the number of buffers you have influences
checkpoint behavior.  So things are complex.
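
For anyone who wants to poke at this on a live system, here's a rough
sketch using the pg_buffercache contrib module (an assumption on my
part that it's installed -- CREATE EXTENSION pg_buffercache on 9.1, or
the contrib SQL script on older releases) showing which relations are
sitting in shared_buffers and how many of those pages are currently
dirty:

-- which relations occupy shared_buffers, and how many of those
-- buffers are currently dirty (modified in memory, not yet written)
SELECT c.relname,
       count(*) AS buffers,
       sum(CASE WHEN b.isdirty THEN 1 ELSE 0 END) AS dirty
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase IN (0, (SELECT oid FROM pg_database
                            WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;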

So, the challenge is this: I'd like to see repeatable test cases that
demonstrate regular performance gains > 20%.  Double bonus points for
cases that show gains > 50%.  No points given for anecdotal or
unverifiable data. Not only will this help raise the body of knowledge
regarding the setting, but it will help produce benchmarking metrics
against which we can measure multiple interesting buffer related
patches in the pipeline.  Anybody up for it?

merlin

Re: The shared buffers challenge

From
"Kevin Grittner"
Date:
Merlin Moncure <mmoncure@gmail.com> wrote:

> So, the challenge is this: I'd like to see repeatable test cases
> that demonstrate regular performance gains > 20%.  Double bonus
> points for cases that show gains > 50%.

Are you talking throughput, maximum latency, or some other metric?

In our shop the metric we tuned for in reducing shared_buffers was
getting the number of "fast" queries (which normally run in under a
millisecond) which would occasionally, in clusters, take over 20
seconds (and thus be canceled by our web app and present as errors
to the public) down to zero.  While I know there are those who care
primarily about throughput numbers, that's worthless to me without
maximum latency information under prolonged load.  I'm not talking
90th percentile latency numbers, either -- if 10% of our web
requests were timing out the villagers would be coming after us with
pitchforks and torches.
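
A rough way to capture that kind of worst-case number outside of
production is pgbench's per-transaction log -- a sketch, assuming the
third field of the pgbench_log.* files is per-transaction latency in
microseconds (worth double-checking against your pgbench version):

pgbench -T 300 -c 4 -n -l pgbench       # -l writes pgbench_log.<pid>
sort -nrk3 pgbench_log.* | head -5      # five slowest transactions
awk '$3 > max { max = $3 } END { print max/1000 " ms worst case" }' pgbench_log.*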

-Kevin

Re: The shared buffers challenge

From
Merlin Moncure
Date:
On Thu, May 26, 2011 at 10:10 AM, Kevin Grittner
<Kevin.Grittner@wicourts.gov> wrote:
> Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> So, the challenge is this: I'd like to see repeatable test cases
>> that demonstrate regular performance gains > 20%.  Double bonus
>> points for cases that show gains > 50%.
>
> Are you talking throughput, maximum latency, or some other metric?

I am talking about *any* metric -- you've got something, let's see it.
But it's got to be verifiable, so no points scored yet.

See my note above about symptoms -- if your symptom of note happens to
be unpredictable spikes in fast query times under load, then I'd like
to scribble that advice directly into the docs along with (hopefully)
some reasoning of exactly why more database managed buffers are
helping.   As noted, I'm particularly interested in things we can test
outside of production environments, since I'm pretty skeptical the
Wisconsin Court System is going to allow the internet to log in and
repeat and verify test methodologies.  Point being: cranking buffers
may have been the bee's knees with, say, the 8.2 buffer manager, but
present and future improvements may have rendered that change moot or
even counterproductive.  I doubt it's really changed much, but we
really need to do better on this -- all else being equal, the lowest
shared_buffers setting possible without sacrificing performance is
best, because it releases more memory to the o/s to be used for other
things -- so "everything's bigger in Texas" type approaches to
postgresql.conf manipulation (not that I see that here of course) are
not necessarily better :-).

merlin

Re: The shared buffers challenge

From
Claudio Freire
Date:
On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> Point being: cranking buffers
> may have been the bee's knees with, say, the 8.2 buffer manager, but
> present and future improvements may have rendered that change moot or
> even counterproductive.

I suggest you read the docs on how shared buffers work, because,
reasonably, it would be the other way around.

Recent improvements in how postgres manages its shared buffer pool
make them better than the OS cache, so there should be more incentive
to increase them, rather than decrease them.

Workload conditions may make those improvements worthless, hinting
that you should decrease them.

But you have to know your workload and you have to know how the shared
buffers work.

Re: The shared buffers challenge

From
Merlin Moncure
Date:
On Thu, May 26, 2011 at 10:45 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, May 26, 2011 at 5:36 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> Point being: cranking buffers
>> may have been the bee's knees with, say, the 8.2 buffer manager, but
>> present and future improvements may have rendered that change moot or
>> even counterproductive.
>
> I suggest you read the docs on how shared buffers work, because,
> reasonably, it would be the other way around.
>
> Recent improvements in how postgres manages its shared buffer pool
> make them better than the OS cache, so there should be more incentive
> to increase them, rather than decrease them.
>
> Workload conditions may make those improvements worthless, hinting
> that you should decrease them.
>
> But you have to know your workload and you have to know how the shared
> buffers work.

I am not denying that any of those things are the case, although your
assumption that I haven't read the documentation was obviously not
grounded upon research.  What you and I know/don't know is not the
point.  The point is what we can prove, because going through the
motions of doing that is useful.  You are also totally missing my
other thrust, which is that future changes to how things work could
change the dynamics of .conf configuration -- btw not for the first
time in the history of the project.

merlin

Re: The shared buffers challenge

From
Claudio Freire
Date:
On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
> The point is what we can prove, because going through the
> motions of doing that is useful.

Exactly, and whatever you can "prove" will be workload-dependent.
So you can't prove anything "generally", since no single setting is
best for all.

> You are also totally missing my
> other thrust, which is that future changes to how things work could
> change the dynamics of .conf configuration

Nope, I'm not missing it, simply not commenting on it.

Re: The shared buffers challenge

From
Merlin Moncure
Date:
On Thu, May 26, 2011 at 11:37 AM, Claudio Freire <klaussfreire@gmail.com> wrote:
> On Thu, May 26, 2011 at 6:02 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
>> The point is what we can prove, because going through the
>> motions of doing that is useful.
>
> Exactly, and whatever you can "prove" will be workload-dependent.
> So you can't prove anything "generally", since no single setting is
> best for all.

Then we should stop telling people to adjust it unless we can match
the workload to the improvement.  There are some people here who can
do that as if by magic, but that's not the issue.  I'm trying to
understand why it works better for some than for others.  What's
frustrating is simply believing something is the case, without trying
to understand why.  How about, instead of arguing with me, coming up
with something for the challenge?

merlin

Re: The shared buffers challenge

From
"Kevin Grittner"
Date:
Merlin Moncure <mmoncure@gmail.com> wrote:
> Kevin Grittner <Kevin.Grittner@wicourts.gov> wrote:
>> Merlin Moncure <mmoncure@gmail.com> wrote:
>>
>>> So, the challenge is this: I'd like to see repeatable test cases
>>> that demonstrate regular performance gains > 20%.  Double bonus
>>> points for cases that show gains > 50%.
>>
>> Are you talking throughput, maximum latency, or some other
>> metric?
>
> I am talking about *any* metric..you've got something, let's see
> it.  But it's got to be verifiable, so no points scored.

Oh, that wasn't to score points; just advocating for more than a
one-dimensional view of performance.  I'm adding to your demands,
not attempting to satisfy them.  :-)

> See my note above about symptoms -- if your symptom of note
> happens to be unpredictable spikes in fast query times under load,
> then I'd like to scribble that advice directly into the docs along
> with (hopefully) some reasoning of exactly why more database
> managed buffers are helping.

In our case it was *fewer* shared_buffers which helped.

> As noted, I'm particularly interested in things we can test
> outside of production environments, since I'm pretty skeptical the
> Wisconsin Court System is going to allow the internet to log in
> and repeat and verify test methodologies.

Right, while it was a fairly scientific and methodical test, it was
against a live production environment.  We adjusted parameters
incrementally, a little each day, from where they had been toward
values which were calculated in advance to be better based on our
theory of the problem (aided in no small part by analysis and advice
from Greg Smith), and saw a small improvement each day with the
problem disappearing entirely right at the target values we had
calculated in advance.  :-)

> Point being: cranking buffers may have been the bee's knees with,
> say, the 8.2 buffer manager, but present and future improvements
> may have rendered that change moot or even counterproductive.

We did find that in 8.3 and later we can support a larger
shared_buffer setting without the problem than in 8.2 and earlier.
We still need to stay on the low side of what is often advocated to
keep the failure rate from this issue at zero.

> all else being equal, the lowest shared_buffers setting possible
> without sacrificing performance is best because it releases more
> memory to the o/s to be used for other things

I absolutely agree with this.

I think the problem is that it is very tedious and time-consuming to
construct artificial tests for these things.  Greg Smith has spent a
lot of time and done a lot of research investigating the dynamics of
these issues, and recommends a process of incremental adjustments
for tuning the relevant settings which, in my opinion, is going to
be better than any generalized advice on settings.

Don't get me wrong, I would love to see numbers which "earned
points" under the criteria you outline.  I would especially love it
if they could be part of the suite of tests in our performance farm.
I just think that the wealth of anecdotal evidence and the dearth of
repeatable benchmarks in this area is due to the relatively low-cost
techniques available to tune production systems to solve pressing
needs versus the relatively high cost of creating repeatable test
cases (without, by the way, solving an immediate need).

-Kevin

Re: The shared buffers challenge

From
Greg Smith
Date:
Merlin Moncure wrote:
> So, the challenge is this: I'd like to see repeatable test cases that
> demonstrate regular performance gains > 20%.  Double bonus points for
> cases that show gains > 50%.

Do I run around challenging your suggestions and giving you homework?
You have no idea how much eye rolling this whole message provoked from me.

OK, so the key thing to do is create a table such that shared_buffers is
smaller than the primary key index on a table, then UPDATE that table
furiously.  This will page constantly out of the buffer cache to the OS
one, doing work that could be avoided.  Increase shared_buffers to where
it fits instead, and all the index writes are buffered to write only
once per checkpoint.  Server settings to exaggerate the effect:

shared_buffers = 32MB
checkpoint_segments = 256
log_checkpoints = on
autovacuum = off

Test case:

createdb pgbench
pgbench -i -s 20 pgbench
psql -d pgbench -c "select
pg_size_pretty(pg_relation_size('public.pgbench_accounts_pkey'))"
psql -c "select pg_stat_reset_shared('bgwriter')"
pgbench -T 120 -c 4 -n pgbench
psql -x -c "SELECT * FROM pg_stat_bgwriter"

This gives the following size for the primary key and results:

 pg_size_pretty
----------------
 34 MB

transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 13236
tps = 109.524954 (including connections establishing)
tps = 109.548498 (excluding connections establishing)

-[ RECORD 1 ]---------+------------------------------
checkpoints_timed     | 0
checkpoints_req       | 0
buffers_checkpoint    | 0
buffers_clean         | 16156
maxwritten_clean      | 131
buffers_backend       | 5701
buffers_backend_fsync | 0
buffers_alloc         | 25276
stats_reset           | 2011-05-26 18:39:57.292777-04

Now, change so the whole index fits instead:

shared_buffers = 512MB

...which follows the good old "25% of RAM" guidelines given this system
has 2GB of RAM.  Restart the server, repeat the test case.  New results:

transaction type: TPC-B (sort of)
scaling factor: 20
query mode: simple
number of clients: 4
number of threads: 1
duration: 120 s
number of transactions actually processed: 103440
tps = 861.834090 (including connections establishing)
tps = 862.041716 (excluding connections establishing)

gsmith@meddle:~/personal/scripts$ psql -x -c "SELECT * FROM
pg_stat_bgwriter"
-[ RECORD 1 ]---------+------------------------------
checkpoints_timed     | 0
checkpoints_req       | 0
buffers_checkpoint    | 0
buffers_clean         | 0
maxwritten_clean      | 0
buffers_backend       | 1160
buffers_backend_fsync | 0
buffers_alloc         | 34071
stats_reset           | 2011-05-26 18:43:40.887229-04

Rather than writing 16156+5701=21857 buffers out during the test to
support all the index churn, instead only 1160 buffers go out,
consisting mostly of the data blocks for pgbench_accounts that are being
updated irregularly.  With less than 1/18th as much I/O to do, the system
executes nearly 8X as many UPDATE statements during the test run.
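
To put those buffer counts in more familiar units (each buffer is 8kB),
a quick back-of-the-envelope conversion:

-- 8kB per buffer: non-checkpoint writes during the two-minute run
SELECT pg_size_pretty((16156 + 5701) * 8192::bigint) AS at_32mb_buffers,
       pg_size_pretty(1160 * 8192::bigint)           AS at_512mb_buffers;
-- roughly 171 MB at shared_buffers = 32MB versus a bit over 9 MB at 512MB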

As for figuring out how this impacts more complicated cases, I hear
somebody wrote a book or something that went into pages and pages of
detail about all this.  You might want to check it out.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: The shared buffers challenge

From
Samuel Gendler
Date:


On Thu, May 26, 2011 at 4:10 PM, Greg Smith <greg@2ndquadrant.com> wrote:

> As for figuring out how this impacts more complicated cases, I hear
> somebody wrote a book or something that went into pages and pages of
> detail about all this.  You might want to check it out.


I was just going to suggest that there was significant and detailed documentation of this stuff in a certain book, a well-thumbed copy of which should be sitting on the desk of anyone attempting any kind of postgres performance tuning. 

Re: The shared buffers challenge

From
Merlin Moncure
Date:
On Thu, May 26, 2011 at 6:10 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> So, the challenge is this: I'd like to see repeatable test cases that
>> demonstrate regular performance gains > 20%.  Double bonus points for
>> cases that show gains > 50%.
>
> Do I run around challenging your suggestions and giving you homework?  You
> have no idea how much eye rolling this whole message provoked from me.

That's just plain unfair: I didn't challenge your suggestion nor give
you homework. In particular, I'm not suggesting the 25%-ish default is
wrong -- but trying to help people understand why it's there and what
it's doing.  I bet 19 people out of 20 could not explain what the
primary effects of shared_buffers are with any degree of accuracy.  That
group of people in fact would have included me until recently, when I
started studying bufmgr.c for mostly unrelated reasons.  Understand my
basic points:

*) the documentation should really explain this better (in particular,
it should debunk the myth 'more buffers = more caching')
*) the 'what' is not nearly so important as the 'why'
*) the popular understanding of what buffers do is totally, completely, wrong
*) I'd like to gather cases to benchmark changes that interact with
these settings
*) I think you are fighting the 'good fight'. I'm trying to help

> OK, so the key thing to do is create a table such that shared_buffers is
> smaller than the primary key index on a table, then UPDATE that table
> furiously.  This will page constantly out of the buffer cache to the OS one,
> doing work that could be avoided.  Increase shared_buffers to where it fits
> instead, and all the index writes are buffered to write only once per
> checkpoint.  Server settings to exaggerate the effect:

This is exactly what I'm looking for...that's really quite striking.
I knew that getting a buffer 'hit' before the page goes out the door
was the thing to gun for.

> As for figuring out how this impacts more complicated cases, I hear somebody
> wrote a book or something that went into pages and pages of detail about all
> this.  You might want to check it out.

so i've heard: http://imgur.com/lGOqx (and yes: I 100% endorse the
book: everyone who is serious about postgres should own a copy).
Anyways, double points to you ;-).

merlin

Re: The shared buffers challenge

From
Scott Carey
Date:
So how far do you go?  128MB?  32MB?  4MB?

Anecdotal and an assumption, but I'm pretty confident that on any server
with at least 1GB of dedicated RAM, setting it any lower than 200MB is not
even going to help latency (assuming checkpoint and log configuration is
in the realm of sane, and connections*work_mem is sane).

The defaults have been so small for so long on most platforms that any
increase over the default generally helps performance -- and in many cases
dramatically.  So if more is better, then most users assume that even more
should be better.

But it's not so simple: there are drawbacks to a larger buffer and
diminishing returns with larger size.  I think listing the drawbacks of a
larger buffer and the symptoms that can result would be a big win.

And there is an OS component to it too.  You can actually get away with
shared_buffers at 90% of RAM on Solaris.  Linux will explode if you try
that (unless recent kernels have fixed its shared memory accounting).
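
As a concrete illustration of that OS component (a rough sketch,
Linux-specific, and the sizes are just an assumed example for a box
where you want several GB of shared_buffers): the buffers live in a
single SysV shared memory segment, so the kernel limits usually have
to be raised before the server will even start:

# kernel.shmmax = largest single segment, in bytes
# kernel.shmall = total shared memory allowed, in pages (typically 4kB)
sysctl -w kernel.shmmax=17179869184   # 16GB
sysctl -w kernel.shmall=4194304       # 16GB / 4kB pages
# add the same settings to /etc/sysctl.conf to survive a reboot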


On 5/26/11 8:10 AM, "Kevin Grittner" <Kevin.Grittner@wicourts.gov> wrote:

>Merlin Moncure <mmoncure@gmail.com> wrote:
>
>> So, the challenge is this: I'd like to see repeatable test cases
>> that demonstrate regular performance gains > 20%.  Double bonus
>> points for cases that show gains > 50%.
>
>Are you talking throughput, maximum latency, or some other metric?
>
>In our shop the metric we tuned for in reducing shared_buffers was
>getting the number of "fast" queries (which normally run in under a
>millisecond) which would occasionally, in clusters, take over 20
>seconds (and thus be canceled by our web app and present as errors
>to the public) down to zero.  While I know there are those who care
>primarily about throughput numbers, that's worthless to me without
>maximum latency information under prolonged load.  I'm not talking
>90th percentile latency numbers, either -- if 10% of our web
>requests were timing out the villagers would be coming after us with
>pitchforks and torches.
>
>-Kevin


Re: The shared buffers challenge

From
"Kevin Grittner"
Date:
Scott Carey <scott@richrelevance.com> wrote:

> So how far do you go?  128MB?  32MB?  4MB?

Under 8.2 we had to keep shared_buffers less than the RAM on our BBU
RAID controller, which had 256MB -- so it worked best with
shared_buffers in the 160MB to 200MB range.  With 8.3 we found that
anywhere from 512MB to 1GB performed better without creating
clusters of stalls.  In both cases we also had to significantly
boost the aggressiveness of the background writer.

Since the "sweet spot" is so dependent on such things as your RAID
controller and your workload, I *highly* recommend Greg's
incremental tuning approach.  The rough guidelines which get tossed
about make reasonable starting points, but you really need to make
relatively small changes with the actual load you're trying to
optimize and monitor the metrics which matter to you.  On a big data
warehouse you might not care if the database becomes unresponsive
for a couple minutes every now and then if it means better overall
throughput.  On a web server, you may not have much problem keeping
up with the overall load, but want to ensure reasonable response
time.

> Anecdotal and an assumption, but I'm pretty confident that on any
> server with at least 1GB of dedicated RAM, setting it any lower
> than 200MB is not even going to help latency (assuming checkpoint
> and log configuration is in the realm of sane, and
> connections*work_mem is sane).

I would add the assumption that you've got at least 256MB BBU cache
on your RAID controller.

> The defaults have been so small for so long on most platforms,
> that any increase over the default generally helps performance --
> and in many cases dramatically.

Agreed.

> So if more is better, then most users assume that even more should
> be better.

That does seem like a real risk.

-Kevin

Re: The shared buffers challenge

From
Greg Smith
Date:
Scott Carey wrote:
> And there is an OS component to it too.  You can actually get away with
> shared_buffers at 90% of RAM on Solaris.  Linux will explode if you try
> that (unless recent kernels have fixed its shared memory accounting).
>

You can use much larger values for shared_buffers on Solaris with UFS as
the filesystem than almost anywhere else, as you say.  UFS caches an
extremely small amount of memory by default, so getting PostgreSQL to
buffer everything leads to minimal double-caching and little of the OS
write caching that creates checkpoint spikes; 90% is not impossible there.

If you're using ZFS instead, it defaults to caching as aggressively as
Linux.  You may even have to turn that down if you want the database to
have a large amount of memory for its own use, even with normal levels of
sizing; just the space for shared_buffers and work_mem can end up being
too large a pair of competitors for caching RAM.  ZFS is really not tuned
all that differently from how Linux approaches caching in that regard.
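
If you do go down that road, the usual knob is the ARC size cap -- a
sketch (Solaris; the 4GB figure is just an assumed example, added to
/etc/system and applied at the next reboot):

* cap the ZFS ARC at 4GB so the rest of RAM stays available to PostgreSQL
set zfs:zfs_arc_max = 4294967296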

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: The shared buffers challenge

From
Greg Smith
Date:
Merlin Moncure wrote:
> That's just plain unfair: I didn't challenge your suggestion nor give
> you homework.

I was stuck either responding to your challenge, or leaving the
impression I hadn't done the research to back the suggestions I make if
I didn't.  That made it a mandatory homework assignment for me, and I
didn't appreciate that.


> *) the documentation should really explain this better (in particular,
> it should debunk the myth 'more buffers = more caching'

Any attempt to make a serious change to the documentation around
performance turns into a bikeshedding epic, where the burden of proof to
make a change is too large to be worth the trouble to me anymore.  I
first started publishing tuning papers outside of the main docs because
it was the path of least resistance to actually getting something useful
in front of people.  After failing to get even basic good
recommendations for checkpoint_segments into the docs, I completely gave
up on focusing there as my primary way to spread this sort of information.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: The shared buffers challenge

From
Merlin Moncure
Date:
On Fri, May 27, 2011 at 1:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Merlin Moncure wrote:
>>
>> That's just plain unfair: I didn't challenge your suggestion nor give
>> you homework.
>
> I was stuck either responding to your challenge, or leaving the impression I
> hadn't done the research to back the suggestions I make if I didn't.  That
> made it a mandatory homework assignment for me, and I didn't appreciate
> that.

really -- that wasn't my intent.  in any event, i apologize.

merlin

Re: The shared buffers challenge

From
Robert Haas
Date:
On Fri, May 27, 2011 at 2:47 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Any attempt to make a serious change to the documentation around performance
> turns into a bikeshedding epic, where the burden of proof to make a change
> is too large to be worth the trouble to me anymore.  I first started
> publishing tuning papers outside of the main docs because it was the path of
> least resistance to actually getting something useful in front of people.
>  After failing to get even basic good recommendations for
> checkpoint_segments into the docs, I completely gave up on focusing there as
> my primary way to spread this sort of information.

Hmm.  That's rather unfortunate.  +1 for revisiting that topic, if you
have the energy for it.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: The shared buffers challenge

From
Maciek Sakrejda
Date:
>>  After failing to get even basic good recommendations for
>> checkpoint_segments into the docs, I completely gave up on focusing there as
>> my primary way to spread this sort of information.
>
> Hmm.  That's rather unfortunate.  +1 for revisiting that topic, if you
> have the energy for it.

Another +1. While I understand that this is not simple, many users
will not look outside of standard docs, especially when first
evaluating PostgreSQL. Merlin is right that the current wording does
not really mention a down side to cranking shared_buffers on a system
with plenty of RAM.

---
Maciek Sakrejda | System Architect | Truviso

1065 E. Hillsdale Blvd., Suite 215
Foster City, CA 94404
(650) 242-3500 Main
www.truviso.com

Re: The shared buffers challenge

From
Claudio Freire
Date:
On Fri, May 27, 2011 at 9:24 PM, Maciek Sakrejda <msakrejda@truviso.com> wrote:
> Another +1. While I understand that this is not simple, many users
> will not look outside of standard docs, especially when first
> evaluating PostgreSQL. Merlin is right that the current wording does
> not really mention a down side to cranking shared_buffers on a system
> with plenty of RAM.

If you read the whole of the docs, it does.

If you read caching, checkpoints, WAL, all of it, you can connect the dots.
It isn't easy, but database management isn't easy.

Re: The shared buffers challenge

From
Mark Kirkwood
Date:
On 27/05/11 11:10, Greg Smith wrote:
>
> OK, so the key thing to do is create a table such that shared_buffers
> is smaller than the primary key index on a table, then UPDATE that
> table furiously.  This will page constantly out of the buffer cache to
> the OS one, doing work that could be avoided.  Increase shared_buffers
> to where it fits instead, and all the index writes are buffered to
> write only once per checkpoint.  Server settings to exaggerate the
> effect:
>
> shared_buffers = 32MB
> checkpoint_segments = 256
> log_checkpoints = on
> autovacuum = off
>
> Test case:
>
> createdb pgbench
> pgbench -i -s 20 pgbench
> psql -d pgbench -c "select
> pg_size_pretty(pg_relation_size('public.pgbench_accounts_pkey'))"
> psql -c "select pg_stat_reset_shared('bgwriter')"
> pgbench -T 120 -c 4 -n pgbench
> psql -x -c "SELECT * FROM pg_stat_bgwriter"
>
> This gives the following size for the primary key and results:
>
> pg_size_pretty
> ----------------
> 34 MB
>
> transaction type: TPC-B (sort of)
> scaling factor: 20
> query mode: simple
> number of clients: 4
> number of threads: 1
> duration: 120 s
> number of transactions actually processed: 13236
> tps = 109.524954 (including connections establishing)
> tps = 109.548498 (excluding connections establishing)
>
> -[ RECORD 1 ]---------+------------------------------
> checkpoints_timed     | 0
> checkpoints_req       | 0
> buffers_checkpoint    | 0
> buffers_clean         | 16156
> maxwritten_clean      | 131
> buffers_backend       | 5701
> buffers_backend_fsync | 0
> buffers_alloc         | 25276
> stats_reset           | 2011-05-26 18:39:57.292777-04
>
> Now, change so the whole index fits instead:
>
> shared_buffers = 512MB
>
> ...which follows the good old "25% of RAM" guidelines given this
> system has 2GB of RAM.  Restart the server, repeat the test case.  New
> results:
>
> transaction type: TPC-B (sort of)
> scaling factor: 20
> query mode: simple
> number of clients: 4
> number of threads: 1
> duration: 120 s
> number of transactions actually processed: 103440
> tps = 861.834090 (including connections establishing)
> tps = 862.041716 (excluding connections establishing)
>
> gsmith@meddle:~/personal/scripts$ psql -x -c "SELECT * FROM
> pg_stat_bgwriter"
> -[ RECORD 1 ]---------+------------------------------
> checkpoints_timed     | 0
> checkpoints_req       | 0
> buffers_checkpoint    | 0
> buffers_clean         | 0
> maxwritten_clean      | 0
> buffers_backend       | 1160
> buffers_backend_fsync | 0
> buffers_alloc         | 34071
> stats_reset           | 2011-05-26 18:43:40.887229-04
>
> Rather than writing 16156+5701=21857 buffers out during the test to
> support all the index churn, instead only 1160 buffers go out,
> consisting mostly of the data blocks for pgbench_accounts that are
> being updated irregularly.  With less than 1 / 18th as I/O to do, the
> system executes nearly 8X as many UPDATE statements during the test run.
>
> As for figuring out how this impacts more complicated cases, I hear
> somebody wrote a book or something that went into pages and pages of
> detail about all this.  You might want to check it out.
>

Greg, having an example with some discussion like this in the docs would
probably be helpful.  If you want to add it that would be great; however,
that sounds dangerously like giving you homework :-)  I'm happy to put
something together for the docs if you'd prefer that I do my own
assignments.

Cheers

Mark


Re: The shared buffers challenge

From
Jeff Davis
Date:
On Thu, 2011-05-26 at 09:31 -0500, Merlin Moncure wrote:
> Where they are most helpful is for masking of i/o if
> a page gets dirtied >1 times before it's written out to the heap

Another possible benefit of higher shared_buffers is that it may reduce
WAL flushes. A page cannot be evicted from shared_buffers until the WAL
has been flushed up to the page's LSN ("WAL before data"); so if there
is memory pressure to evict dirty buffers, it may cause extra WAL
flushes.

I'm not sure what the practical effects of this are, but it might be an
interesting thing to test.
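
One rough way to observe it, at least on Linux (an assumption on my
part -- this simply counts the fsync/fdatasync calls one backend
issues, most of which will be WAL flushes during a write-heavy pgbench
run, so compare the counts at small vs. large shared_buffers):

# attach to one pgbench backend; Ctrl-C prints the call-count summary
strace -c -f -e trace=fsync,fdatasync -p <backend_pid>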

Regards,
    Jeff Davis


Re: The shared buffers challenge

From
Greg Smith
Date:
On 05/27/2011 07:30 PM, Mark Kirkwood wrote:
> Greg, having an example with some discussion like this in the docs
> would probably be helpful.

If we put that example into the docs, two years from now there will be
people showing up here saying "I used the recommended configuration from
the docs" that cut and paste it into their postgresql.conf, turning
autovacuum off and everything.  Periodically people used to publish
"recommended postgresql.conf" settings on random web pages, sometimes
with poor suggestions, and those things kept showing up in people's
configurations posted to the lists here for long after they were no
longer applicable.  I've resisted publishing specific configuration
examples in favor of working on pgtune specifically because of having
observed that.

There are several new small features in 9.1 that make it easier to
instrument checkpoint behavior and how it overlaps with shared_buffers
increases:  summary of sync times, ability to reset pg_stat_bgwriter,
and a timestamp on when it was last reset.  It's not obvious at all how
those all stitch together into some new tuning approaches, but they do.
Best example I've given so far is at
http://archives.postgresql.org/pgsql-hackers/2011-02/msg00209.php ; look
at how I can turn the mysterious buffers_backend field into something
measured in MB/s using these new features.
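
A sketch of that sort of calculation (my own rough version, assuming
8kB buffers and the 9.1 stats_reset column -- see the linked post for
the full treatment):

-- average rate at which backends have had to write buffers themselves
-- since the statistics were last reset
SELECT round((buffers_backend * 8192 /
              extract(epoch from now() - stats_reset) /
              (1024 * 1024))::numeric, 2) AS backend_writes_mb_per_sec
FROM pg_stat_bgwriter;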

That's the direction all this is moving toward.  If it's easy for people
to turn the various buffer statistics into human-readable form, the way
tuning changes impact the server operation becomes much easier to see.
Documenting the much harder to execute methodology you can apply to
earlier versions isn't real exciting to me at this point.  I've done
that enough that people who want the info can find it, even if it's not
all in the manual.  The ways you can do it in 9.1 are so much easier,
and more accurate with regard to the sync issues, that I'm more
interested in beefing up the manual on using them at this point.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books


Re: The shared buffers challenge

From
Merlin Moncure
Date:
On Fri, May 27, 2011 at 7:19 PM, Jeff Davis <pgsql@j-davis.com> wrote:
> On Thu, 2011-05-26 at 09:31 -0500, Merlin Moncure wrote:
>> Where they are most helpful is for masking of i/o if
>> a page gets dirtied >1 times before it's written out to the heap
>
> Another possible benefit of higher shared_buffers is that it may reduce
> WAL flushes. A page cannot be evicted from shared_buffers until the WAL
> has been flushed up to the page's LSN ("WAL before data"); so if there
> is memory pressure to evict dirty buffers, it may cause extra WAL
> flushes.
>
> I'm not sure what the practical effects of this are, however, but it
> might be an interesting thing to test.

Hm, I bet it could make a fairly big difference if wal data is not on
a separate volume.

merlin