Re: fallocate / posix_fallocate for new WAL file creation (etc...) - Mailing list pgsql-hackers

From Jon Nelson
Subject Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Date
Msg-id CAKuK5J3UPsP64G1f3GSiQyx7W2229yffjPZ4_ctGqvLh2khL+w@mail.gmail.com
Whole thread Raw
In response to Re: fallocate / posix_fallocate for new WAL file creation (etc...)  (Greg Smith <greg@2ndQuadrant.com>)
Responses Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Re: fallocate / posix_fallocate for new WAL file creation (etc...)
List pgsql-hackers
On Thu, May 16, 2013 at 7:05 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> On 5/16/13 9:16 AM, Jon Nelson wrote:
>>
>> Am I doing this the right way? Should I be posting the full patch each
>> time, or incremental patches?
>
>
> There are guidelines for getting your patch in the right format at
> https://wiki.postgresql.org/wiki/Working_with_Git#Context_diffs_with_Git
> that would improve this one.  You have some formatting issues with tab
> spacing at lines 120 through 133 in your v3 patch.  And it looks like there
> was a formatting change on line 146 that is making the diff larger than it
> needs to be.

I've corrected the formatting change (end-of-line whitespace was
stripped) on line 146.
The other whitespace changes are - I think - due to newly-indented
code due to a new code block.
Included please find a v4 patch which uses context diffs per the above url.

> The biggest thing missing from this submission is information about what
> performance testing you did.  Ideally performance patches are submitted with
> enough information for a reviewer to duplicate the same test the author did,
> as well as hard before/after performance numbers from your test system.  It
> often turns tricky to duplicate a performance gain, and being able to run
> the same test used for initial development eliminates a lot of the problems.

This has been a bit of a struggle. While it's true that WAL file
creation doesn't happen with great frequency, and while it's also true
that - with strace and other tests - it can be proven that
fallocate(16MB) is much quicker than writing it zeroes by hand,
proving that in the larger context of a running install has been
challenging.

Attached you'll find a small test script (t.sh) which creates a new
cluster in 'foo', changes some config values, starts the cluster, and
then times how long it takes pgbench to prepare a database. I've used
"wal_level = hot_standby" in the hopes that this generates the largest
number of WAL files (and I set the number of such files to 1024). The
hardware is an AMD 9150e with a 2-disk software RAID1 (SATA disks) on
kernel 3.9.2 and ext4 (x86_64, openSUSE 12.3). The test results are
not that surprising. The longer the test (the larger the scale factor)
the less of a difference using posix_fallocate makes. With a scale
factor of 100, I see an average of 10-11% reduction in the time taken
to initialize the database. With 300, it's about 5.5% and with 900,
it's between 0 and 1.2%. I will be doing more testing but this is what
I started with. I'm very open to suggestions.

> Second bit of nitpicking.  There are already some GUC values that appear or
> disappear based on compile time options.  They're all debugging related
> things though.  I would prefer not to see this one go away when it's
> implementation isn't available.  That's going to break any scripts that SHOW
> the setting to see if it's turned on or not as a first problem.  I think the
> right model to follow here is the IFDEF setup used for
> effective_io_concurrency.  I wouldn't worry about this too much though.
> Having a wal_use_fallocate GUC is good for testing.  But if it works out
> well, when it's ready for commit I don't see why anyone would want it turned
> off on platforms where it works.  There are already too many performance
> tweaking GUCs.  Something has to be very likely to be changed from the
> default before its worth adding one for it.

Ack.  I've revised the patch to always have the GUC (for now), default
to false, and if configure can't find posix_fallocate (or the user
disables it by way of pg_config_manual.h) then it remains a GUC that
simply can't be changed.

I'll also be re-running the tests.

--
Jon

Attachment

pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: Planning incompatibilities for Postgres 10.0
Next
From: Bruce Momjian
Date:
Subject: Re: Planning incompatibilities for Postgres 10.0