Re: increasing the default WAL segment size - Mailing list pgsql-hackers

From Magnus Hagander
Subject Re: increasing the default WAL segment size
Msg-id CABUevEyMb9yc4KW6xWzewk2awO5KekA_gr3X=gbn3ijci=CPuA@mail.gmail.com
In response to Re: increasing the default WAL segment size  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: increasing the default WAL segment size  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers


On Thu, Aug 25, 2016 at 6:59 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Thu, Aug 25, 2016 at 11:21 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
> On 25 August 2016 at 02:31, Robert Haas <robertmhaas@gmail.com> wrote:
>> Furthermore, there is an enforced, synchronous fsync at the end of
>> every segment, which actually does hurt performance on write-heavy
>> workloads.[2] Of course, if that were the only reason to consider
>> increasing the segment size, it would probably make more sense to just
>> try to push that extra fsync into the background, but that's not
>> really the case.  From what I hear, the gigantic number of files is a
>> bigger pain point.
>
> I think we should fully describe the problem before finding a solution.

Sure, that's usually a good idea.  I attempted to outline all of the
possible issues of which I am aware in my original email, but of
course you may know of considerations which I overlooked.

> This is too big a change to just tweak a value without discussing the
> actual issue.

Again, I tried to make sure I was discussing the actual issues in my
original email.  In brief: having to run archive_command multiple
times per second imposes very tight latency requirements on it;
directories with hundreds of thousands or millions of files are hard
to manage; enforced synchronous fsyncs at the end of each segment hurt
performance.
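
To put rough numbers on the archive_command and file-count points, here is a minimal sketch. The 48 MB/s WAL rate is an assumed, purely illustrative figure (it does not come from this thread), and `segment_stats` is a hypothetical helper, not part of PostgreSQL.

```python
MB = 1024 * 1024

def segment_stats(wal_bytes_per_sec, segment_bytes):
    """Return (segments/sec, seconds available per archive call, segments/day)."""
    segs_per_sec = wal_bytes_per_sec / segment_bytes
    # archive_command must finish, on average, within this many seconds
    # per segment or archiving falls behind.
    latency_budget = 1.0 / segs_per_sec
    segs_per_day = segs_per_sec * 86400
    return segs_per_sec, latency_budget, segs_per_day

# Assumed workload: 48 MB/s of WAL (hypothetical figure).
for size_mb in (16, 64):
    rate, budget, per_day = segment_stats(48 * MB, size_mb * MB)
    print(f"{size_mb}MB segments: {rate:.2f}/s, "
          f"{budget:.2f}s per archive call, {per_day:.0f} files/day")
```

Under that assumption, 16MB segments mean three archive_command invocations per second, while 64MB segments cut both the invocation rate and the number of archived files by a factor of four.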

> And if the problem is as described, how can a change of x4 be enough
> to make it worth the pain of change? I think you're already admitting
> it can't be worth it by discussing initdb configuration.

I guess it depends on how much pain of change you think there will be.
I would expect a change from 16MB -> 64MB to be fairly painless, but
(1) it might break tools that aren't designed to cope with differing
segment sizes and (2) it will increase disk utilization for people who
have such low velocity systems that they never end up with more than 2
WAL segments, and now those segments are bigger.  If you know of other
impacts or have reason to believe those problems will be serious,
please fill in the details.
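
The disk-utilization cost in (2) is simple arithmetic; a minimal sketch, assuming a system that never holds more than 2 WAL segments (`wal_footprint` is a hypothetical helper used only for illustration):

```python
MB = 1024 * 1024

def wal_footprint(segments_kept, segment_bytes):
    """Disk space used by WAL when exactly `segments_kept` segments exist."""
    return segments_kept * segment_bytes

old = wal_footprint(2, 16 * MB)   # current default segment size
new = wal_footprint(2, 64 * MB)   # proposed default segment size
print(f"16MB segments: {old // MB}MB, 64MB segments: {new // MB}MB, "
      f"extra: {(new - old) // MB}MB")
```

So the worst case for such a low-velocity instance is 128MB of WAL instead of 32MB.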

Despite the fact that initdb configuration has dominated this thread,
I mentioned it only in the very last sentence of my email and only as
a possibility.  I believe that a 4x change will be good enough for the
majority of people for whom this is currently a pain point.  However,
yes, I do believe that there are some people for whom it won't be
sufficient.  And I believe that as we continue to enhance PostgreSQL
to support higher and higher transaction rates, the number of people
who need an extra-large WAL segment size will increase.  As I see it,
there are four options here:

1. Do nothing.  So far, I don't see anybody arguing for that.

2. Change the default to 64MB and call it good.  This idea seems to
have considerable support.

3. Allow initdb-time configurability but keep the default at 16MB.  I
don't see any support for this.  There is clearly support for
configurability, but I don't see anyone arguing that the current
default is preferable, unless that is what you are arguing.

4. Change the default to 64MB and also allow initdb-time
configurability.  This option also appears to enjoy substantial
support, perhaps more than #2.  Magnus seemed to be arguing that this
is preferable to #2, because then it's easier for people to change the
setting back if someone discovers a case where the higher default is a
problem; Tom, on the other hand, seems to think this is overkill. 

Personally, I believe option #4 is for the best.  I believe that the
great majority of users will be better off with 64MB than with 16MB,
but I like the idea of allowing for smaller values (for people with
really low-velocity instances) and larger ones (for people with really
high-velocity instances).

I was not arguing for #4 over #2, at least not strongly. I think #2 is fine, and I think #4 is fine. #4 allows a way out, but it's not *that* important unless we go *beyond* 64MB.

I was mainly arguing that we can't claim "it has a configure switch so it's kinda configurable" as a way out. If we want it configurable *at all*, it should be an initdb switch. If we are confident in our defaults, it doesn't have to be.

I agree that #4 is best. I'm not sure it's worth the cost. I'm not worried at all about the risk of the master/slave sync thing, per my previous statement. But if it does have performance implications, per Andres' suggestion, then making it configurable at initdb time probably comes with a cost that's not worth paying.

