Thread: Wrong docs on wal_buffers?
All,

One of my coworkers just pointed this out:

"The amount of memory used in shared memory for WAL data. The default is 64 kilobytes (64kB). The setting need only be large enough to hold the amount of WAL data generated by one typical transaction, since the data is written out to disk at every transaction commit. This parameter can only be set at server start."
http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html

That's quite incorrect. The wal_buffers are shared by all concurrent transactions, so it needs to be sized appropriately for all *simultaneous* uncommitted transactions, otherwise you'll get unnecessary flushing. Certainly performance testing data posted on this list and -hackers bears that out. My suggestion instead:

"The amount of shared memory dedicated to buffering writes to the WAL. The default is 64 kilobytes (64kB), which is low for a busy production server. Users who have high write concurrency, or transactions which commit individual large data writes, will want to increase it to between 1MB and 16MB. This parameter can only be set at server start."

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Wed, Jan 5, 2011 at 12:43 PM, Josh Berkus <josh@agliodbs.com> wrote:
> All,
>
> One of my coworkers just pointed this out:
>
> "The amount of memory used in shared memory for WAL data. The default is
> 64 kilobytes (64kB). The setting need only be large enough to hold the
> amount of WAL data generated by one typical transaction, since the data
> is written out to disk at every transaction commit. This parameter can
> only be set at server start."
> http://www.postgresql.org/docs/9.0/static/runtime-config-wal.html
>
> That's quite incorrect. The wal_buffers are shared by all concurrent
> transactions, so it needs to be sized appropriately for all
> *simultaneous* uncommitted transactions, otherwise you'll get
> unnecessary flushing.

I'd thought the same thing in the past, but on further reflection I decided otherwise. On a highly concurrent system, transaction commits are constantly and unavoidably writing and flushing other transactions' WAL. If the transactions are well spread out, each of N concurrent homogeneous transactions has only 1/N of its total WAL in shared buffers at any one time, so the total does come out to about 1/N * N = 1 typical transaction size. Throw in stochastic departures from a uniform distribution and it would be somewhat higher, but not N. Only if all the transactions move through the system in lock-step would you need N times the typical size of one transaction. pgbench can create this condition, but I don't know how likely it is for real-world workloads to do so. Maybe it is common there as well?

But my bigger objection to the original wording is that it is very hard to know how much WAL a typical transaction generates, especially under full_page_writes. And the risks are rather asymmetric: I don't know of any problem from too large a buffer until it starts crowding out shared_buffers, while under-sizing leads to the rather drastic performance consequence of AdvanceXLInsertBuffer having to wait on the WALWriteLock while holding the WALInsertLock.

> Certainly performance testing data posted on this list and -hackers
> bears that out. My suggestion instead:
>
> "The amount of shared memory dedicated to buffering writes to the WAL.
> The default is 64 kilobytes (64kB), which is low for a busy production
> server. Users who have high write concurrency, or transactions which
> commit individual large data writes, will want to increase it to between
> 1MB and 16MB. This parameter can only be set at server start."

I like this wording. But I wonder if initdb.c, when selecting the default shared_buffers, shouldn't test with wal_buffers = shared_buffers/64 or shared_buffers/128, with a lower limit of 8 blocks, and set that as the default.

Cheers, Jeff
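To make the arithmetic of that suggested default concrete, here is a small sketch (assuming the standard 8kB WAL block size, so the 8-block floor is the current 64kB default; this is only an illustration of the proposed formula, not initdb.c code):

    # Sketch of the suggested initdb default: shared_buffers/64, never
    # below 8 WAL blocks.  Assumes XLOG_BLCKSZ = 8kB (the standard build).
    WAL_BLOCK = 8 * 1024                # bytes
    MIN_WAL_BUFFERS = 8 * WAL_BLOCK     # 8 blocks = 64kB, today's default

    def default_wal_buffers(shared_buffers, divisor=64):
        """wal_buffers = shared_buffers/divisor, with an 8-block floor."""
        return max(shared_buffers // divisor, MIN_WAL_BUFFERS)

    for sb_mb in (8, 24, 32, 256):
        sb = sb_mb * 1024 * 1024
        print(sb_mb, "MB ->", default_wal_buffers(sb) // 1024, "kB")
    # 8 MB -> 128 kB, 24 MB -> 384 kB, 32 MB -> 512 kB, 256 MB -> 4096 kB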
> And the risks are rather asymmetric. I don't know of any problem from
> too large a buffer until it starts crowding out shared_buffers, while
> under-sizing leads to the rather drastic performance consequences of
> AdvanceXLInsertBuffer having to wait on the WALWriteLock while holding
> the WALInsertLock.

Suppose you have a large update that generates lots of WAL: some WAL segment switching will take place, and therefore some fsync()s. If wal_buffers is small enough that it fills up during the time it takes to fsync() the previous WAL segment, isn't there a risk that all WAL writes are stopped, waiting for the end of this fsync()?
> And the risks are rather asymmetric. I don't know of any problem from
> too large a buffer until it starts crowding out shared_buffers, while
> under-sizing leads to the rather drastic performance consequences of
> AdvanceXLInsertBuffer having to wait on the WALWriteLock while holding
> the WALInsertLock.

Yes, performance testing has borne that out. Increasing wal_buffers to between 1MB and 16MB has benefited most test cases (DBT2, pgBench, user databases) substantially, while an increase has never been shown to be a penalty. Increases above 16MB didn't seem to help, which is unsurprising given the size of a WAL segment.

> But I wonder if initdb.c, when selecting the default shared_buffers,
> shouldn't test with wal_buffers = shared_buffers/64 or
> shared_buffers/128, with a lower limit of 8 blocks, and set that as
> the default.

We talked about bumping it to 512kB or 1MB for 9.1. Did that get in? Do I need to write that patch?

It would be nice to have it default to 16MB out of the gate, but there we're up against the Linux/FreeBSD SysV memory limits again. When are those OSes going to modernize?

--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
On Jan 6, 2011, at 10:58 AM, Josh Berkus wrote:
>
>> But I wonder if initdb.c, when selecting the default shared_buffers,
>> shouldn't test with wal_buffers = shared_buffers/64 or
>> shared_buffers/128, with a lower limit of 8 blocks, and set that as
>> the default.
>
> We talked about bumping it to 512kB or 1MB for 9.1. Did that get in?
> Do I need to write that patch?
>
> It would be nice to have it default to 16MB out of the gate, but there
> we're up against the Linux/FreeBSD SysV memory limits again. When are
> those OSes going to modernize?

Why wait? Just set it to 1MB, and individual distributions can set it lower if need be (for example, Mac OS X with its 4MB default shared memory limit). Bowing to lowest-common-denominator OS settings causes more problems than it solves, IMO.
On Thu, Jan 6, 2011 at 10:58 AM, Josh Berkus <josh@agliodbs.com> wrote:
>
>> And the risks are rather asymmetric. I don't know of any problem from
>> too large a buffer until it starts crowding out shared_buffers, while
>> under-sizing leads to the rather drastic performance consequences of
>> AdvanceXLInsertBuffer having to wait on the WALWriteLock while holding
>> the WALInsertLock.
>
> Yes, performance testing has borne that out. Increasing wal_buffers to
> between 1MB and 16MB has benefited most test cases (DBT2, pgBench, user
> databases) substantially, while an increase has never been shown to be a
> penalty. Increases above 16MB didn't seem to help, which is
> unsurprising given the size of a WAL segment.
>
>> But I wonder if initdb.c, when selecting the default shared_buffers,
>> shouldn't test with wal_buffers = shared_buffers/64 or
>> shared_buffers/128, with a lower limit of 8 blocks, and set that as
>> the default.
>
> We talked about bumping it to 512kB or 1MB for 9.1. Did that get in?

Doesn't look like it, not yet anyway.

> Do I need to write that patch?
>
> It would be nice to have it default to 16MB out of the gate,

Would that be a good default even when shared_buffers is only 32MB (the maximum that initdb will ever pick as the default)?

> but there
> we're up against the Linux/FreeBSD SysV memory limits again. When are
> those OSes going to modernize?

I don't think that we should let that limit us. For one thing, some Linux distributions already do have large defaults for SHMMAX. SUSE, for example, defaults to 4GB on 32-bit and much, much larger on 64-bit, and I think they have for years.

For another thing, initdb already does a climb-down on shared_buffers until it finds something that works. All we would have to do is make wal_buffers participate in that climb-down. If I manually set SHMMAX to 32MB, then initdb currently climbs down to 28MB for shared_buffers on my 32-bit machine. At that point, I can increase wal_buffers to 896kB before shmget fails, so I think 512kB would be a good default in that situation. Maybe initdb should test larger values for shared_buffers as well, rather than starting at only 32MB.

Cheers, Jeff
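A sketch of what letting wal_buffers participate in that climb-down could look like; the shared_memory_fits() probe is a hypothetical stand-in for initdb's real trial allocation, and the toy example ignores the rest of the shared memory segment (lock tables and so on):

    # Hypothetical climb-down that sizes shared_buffers and wal_buffers
    # together.  shared_memory_fits() is not a real PostgreSQL API; it
    # stands in for initdb's test of whether the allocation succeeds.
    MB = 1024 * 1024
    KB = 1024

    def pick_defaults(shared_memory_fits):
        candidates = [32 * MB, 28 * MB, 24 * MB, 16 * MB, 8 * MB, 4 * MB]
        for shared_buffers in candidates:
            wal_buffers = max(shared_buffers // 64, 64 * KB)  # 8-block floor
            if shared_memory_fits(shared_buffers + wal_buffers):
                return shared_buffers, wal_buffers
        return 4 * MB, 64 * KB      # last-ditch fallback

    # Example: pretend the kernel allows ~32MB of shared memory in total,
    # roughly the situation described above with SHMMAX set to 32MB.
    sb, wb = pick_defaults(lambda total: total <= 32 * MB)
    print(sb // MB, "MB shared_buffers,", wb // KB, "kB wal_buffers")
    # -> 28 MB shared_buffers, 448 kB wal_buffers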
Josh Berkus wrote:
> We talked about bumping it to 512kB or 1MB for 9.1. Did that get in?
> Do I need to write that patch?

If it defaulted to 3% of shared_buffers, min 64kB & max 16MB for the auto setting, it would for the most part become an autotuned parameter. That would make it 0.75 to 1MB at the standard anemic Linux default kernel parameters. Maybe more than some would like, but dropping shared_buffers from 24MB to 23MB to keep this from being ridiculously undersized is probably a win. That percentage would reach 16MB by the time shared_buffers was increased to 533MB, which also seems about right to me. On a really bad setup (brief pause to flip off Apple) with only 4MB to work with total, you'd end up with wal_buffers between 64kB and 128kB, so very close to the status quo.

Code that up, and we could probably even remove the parameter as a tunable altogether. Very few would see a downside relative to any sensible configuration under the current situation, and many people would notice better automagic performance with one less parameter to tweak. Given the recent investigations about the serious downsides of tiny wal_buffers values on new Linux kernels when using open_datasync, a touch more aggression about this setting seems particularly appropriate to consider now. That's been swapped out as the default, but it's still possible people will switch to it.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books
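To put numbers on that rule, a small sketch of the proposed auto setting (the 3% figure and the 64kB/16MB clamps come from the paragraph above; this is an illustration of the proposal, not a patch):

    # Proposed auto setting: 3% of shared_buffers, clamped to [64kB, 16MB].
    KB = 1024
    MB = 1024 * KB

    def auto_wal_buffers(shared_buffers):
        return min(max(int(shared_buffers * 0.03), 64 * KB), 16 * MB)

    for sb_mb in (2, 24, 32, 533, 1024):
        wb = auto_wal_buffers(sb_mb * MB)
        print(f"{sb_mb:>5} MB shared_buffers -> {wb / MB:.2f} MB wal_buffers")
    # 2 MB -> 0.06 MB (floored at 64kB), 24 MB -> 0.72 MB, 32 MB -> 0.96 MB,
    # 533 MB -> 15.99 MB (right at the cap), 1024 MB -> 16.00 MB (capped)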
On Thu, Jan 6, 2011 at 8:37 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> Josh Berkus wrote:
>> We talked about bumping it to 512kB or 1MB for 9.1. Did that get in?
>> Do I need to write that patch?
>
> If it defaulted to 3% of shared_buffers, min 64kB & max 16MB for the auto setting, it would for the most part become an autotuned parameter. That would make it 0.75 to 1MB at the standard anemic Linux default kernel parameters. Maybe more than some would like, but dropping shared_buffers from 24MB to 23MB to keep this from being ridiculously undersized is probably a win. That percentage would reach 16MB by the time shared_buffers was increased to 533MB, which also seems about right to me. On a really bad setup (brief pause to flip off Apple) with only 4MB to work with total, you'd end up with wal_buffers between 64kB and 128kB, so very close to the status quo.
>
> Code that up, and we could probably even remove the parameter as a tunable altogether. Very few would see a downside relative to any sensible configuration under the current situation, and many people would notice better automagic performance with one less parameter to tweak. Given the recent investigations about the serious downsides of tiny wal_buffers values on new Linux kernels when using open_datasync, a touch more aggression about this setting seems particularly appropriate to consider now. That's been swapped out as the default, but it's still possible people will switch to it.
Does it not seem that this insistence on shipping a default config that works out of the box on every system incurs a dramatic penalty when it comes to getting a useful postgres config for a production system? It seems like postgres is forcing users to learn all of the fairly specialized and intricate details of how shared memory is used by the write-ahead log, rather than asking them to modify the shared memory settings as part of the installation procedure on a handful of systems - changes which are relatively common and easily documented on the affected platforms. Most sysadmins will be familiar with modifying shared memory settings, while none without postgres expertise will have a clue about configuring the postgres WAL, shared buffers, and checkpoint segments.

If we're trying to provide an easy first-use experience for inexperienced users, doesn't it actually make more sense to require a reasonable amount of shared memory, rather than constraining the install to function with only a tiny amount at a time when it is common even for laptops to have 4 or more gigabytes of RAM?
I'm sure this argument has probably been done to death on this list (I'm a relatively recent subscriber), but issues with configuration options left at nearly useless values because of shared memory constraints in the default config sure seem to crop up a lot. Wouldn't many of these issues be resolved if postgres shipped with useful defaults for a modern hardware config, along with instructions for how to adjust shared memory constraints so that the config will function on each system?
--sam
Samuel Gendler <sgendler@ideasculptor.com> writes:
> Does it not seem that this insistence on shipping a default config that
> works out of the box on every system incurs a dramatic penalty when it comes
> to getting a useful postgres config for a production system?

> I'm sure this argument has probably been done to death on this list (I'm a
> relatively recent subscriber),

No kidding. Please review the archives.

The short answer is that even though modern machines tend to have plenty of RAM, they don't tend to have correspondingly large default settings of SHMMAX etc. If we crank up the default shared-memory-usage settings to the point where PG won't start in a couple of MB, we won't accomplish a thing in terms of "making it work out of the box"; we'll just put another roadblock in front of newbies getting to try it at all.

regards, tom lane
On Fri, Jan 7, 2011 at 7:07 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Samuel Gendler <sgendler@ideasculptor.com> writes:
>> Does it not seem that this insistence on shipping a default config that
>> works out of the box on every system incurs a dramatic penalty when it comes
>> to getting a useful postgres config for a production system?
>
>> I'm sure this argument has probably been done to death on this list (I'm a
>> relatively recent subscriber),
>
> No kidding. Please review the archives.
>
> The short answer is that even though modern machines tend to have plenty
> of RAM, they don't tend to have correspondingly large default settings
> of SHMMAX etc. If we crank up the default shared-memory-usage settings
> to the point where PG won't start in a couple of MB, we won't accomplish
> a thing in terms of "making it work out of the box"; we'll just put
> another roadblock in front of newbies getting to try it at all.
Yes, I understand that. I was trying to make the point that, in an attempt to make things very easy for novice users, we are actually making them quite a bit more complex for novice users who want to do anything besides start the server. But no need to have the debate again.
--sam
Samuel Gendler wrote:
> I was trying to make the point that, in an attempt to make things very
> easy for novice users, we are actually making them quite a bit more
> complex for novice users who want to do anything besides start the server.

People who can't start the server often abandon PostgreSQL and never come back. And in the worst case, they popularize their frustration via blogs etc. That contributes to the perception that the program is hard to use far more than people who run into problems only with performance--people expect database tuning to be non-trivial, but they have minimal tolerance for trouble when first trying to use a program. From an advocacy perspective, there is no higher priority than making sure things work as smoothly as feasible for people who have never run PostgreSQL before. Changing the software so it doesn't work out of the box on a system with minimal shared memory defaults, as seen on Linux and other operating systems, would be one of the worst possible changes to the database anyone could make.

About once a month now I come across someone who used my pgtune program: https://github.com/gregs1104/pgtune to get a reasonable starting configuration. That program is still rough and has limitations, but the general idea turns out to work just fine. I'm bored with discussions about making this easier for users unless they involve people volunteering to help with the coding needed to turn that, or something better than it, into a release-quality tool.

The path to sort this out is mapped out in excruciating detail from my perspective. The only thing missing is code contributors with time to walk down it. So far it's me, occasional code refinement from Matt Harrison, some packager help, and periodic review from people like Josh Berkus. And that's just not good enough to make progress on this particular front quickly enough.

--
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services and Support        www.2ndQuadrant.us
"PostgreSQL 9.0 High Performance": http://www.2ndQuadrant.com/books