Re: patch to allow disable of WAL recycling - Mailing list pgsql-hackers

From Jerry Jelinek
Subject Re: patch to allow disable of WAL recycling
Msg-id CACPQ5Fr7W19BHV+0Qn1RLHE1UZe1T5HzbAvxRRU8+J3BmCSGEg@mail.gmail.com
In response to patch to allow disable of WAL recycling  (Jerry Jelinek <jerry.jelinek@joyent.com>)
Responses Re: patch to allow disable of WAL recycling  ("Joshua D. Drake" <jd@commandprompt.com>)
Re: patch to allow disable of WAL recycling  (Alvaro Herrera <alvherre@2ndquadrant.com>)
Re: patch to allow disable of WAL recycling  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Thanks to everyone who took the time to look at the patch and send me feedback.  I'm happy to work on improving the documentation of this new tunable to clarify when it should be used and the implications. I'm trying to understand more specifically what else needs to be done next. To summarize, I think the following general concerns were brought up.

1) Disabling WAL recycling could have a negative performance impact on a COW filesystem if all WAL files could be kept in the filesystem cache.
2) Disabling WAL recycling reduces reliability, even on COW filesystems.
3) Using something like posix_fadvise to reload recycled WAL files into the filesystem cache is better even for a COW filesystem.
4) There are "several" other purposes for WAL recycling which this tunable would impact.
5) A WAL recycling tunable is too specific and a more general solution is needed.
6) Need more performance data.
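
For readers unfamiliar with the call mentioned in concern 3, here is a minimal sketch of what hinting the kernel to load a file into the filesystem cache could look like. It is only an illustration, not code from the patch or from PostgreSQL: the helper name `prefetch_file` is hypothetical, and whether the hint is honored depends on the operating system and the filesystem.

```python
import os


def prefetch_file(path):
    """Ask the kernel to read `path` into the page cache ahead of use.

    Hypothetical helper for illustration only. Returns True if the hint
    was issued, False if this platform has no posix_fadvise.
    """
    if not hasattr(os, "posix_fadvise"):
        return False
    fd = os.open(path, os.O_RDONLY)
    try:
        # offset 0 with length 0 means "from the start through end of file"
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_WILLNEED)
    finally:
        os.close(fd)
    return True
```

The hint is advisory, so a caller cannot assume the data is resident when the call returns.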

For #1, #2 and #3, I don't understand these concerns. It would be helpful if these could be more specific.

For #4, can anybody enumerate these other purposes for WAL recycling?

For #5, perhaps I am making an incorrect assumption about what the original response was requesting, but I understand that WAL recycling is just one aspect of WAL file creation/allocation. However, the creation of a new WAL file is not a problem we've ever observed. In general, any modern filesystem should do a good job of caching recently accessed files. We've never observed a problem with the allocation of a new WAL file slightly before it is needed. The problem we have observed is specifically around WAL file recycling when we have to access old files that are long gone from the filesystem cache. The semantics around recycling seem pretty crisp as compared to some other tunable which would completely change how WAL files are created. Given that a change like that is also much more intrusive, it seems better to provide a tunable to disable WAL recycling vs. some other kind of tunable for which we can't articulate any improvement except in the recycling scenario.

For #6, there is no feasible way for us to recreate our workload on other operating systems or filesystems. Can anyone expand on what performance data is needed?

I'd like to restate the original problem we observed.

When PostgreSQL decides to reuse an old WAL file whose contents have been evicted from the cache (because they haven't been used in hours), this turns what should be a workload bottlenecked by synchronous write performance (that can be well-optimized with an SSD log device) into a random read workload (that's much more expensive for any system). What's significantly worse is that we saw this on synchronous standbys. When that happens, the WAL receiver is blocked on a random read from disk, and since it's single-threaded, all write queries on the primary stop until the random read finishes. This is particularly bad for us when the sync standby is doing other I/O (e.g., for an autovacuum or a database backup) that causes disk reads to take hundreds of milliseconds.
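
To make the two behaviors concrete, here is a simplified sketch of the difference between recycling an old segment and allocating a fresh one. This is not PostgreSQL's code and not the patch itself: the function names, the `recycle` flag, and the tiny `SEGMENT_SIZE` are hypothetical stand-ins (real WAL segments default to 16 MB). The point is only that a recycled file still holds its old data, so overwriting it later can force the old blocks to be read back in when they are no longer cached, while a newly created file has no old contents to read.

```python
import os
import tempfile

# Small stand-in for illustration; real WAL segments default to 16 MB.
SEGMENT_SIZE = 16 * 1024


def recycle_segment(wal_dir, old_name, new_name):
    """Reuse an old segment by renaming it; its old contents stay in place."""
    os.rename(os.path.join(wal_dir, old_name), os.path.join(wal_dir, new_name))


def create_segment(wal_dir, old_name, new_name):
    """Remove the old segment and write a fresh zero-filled one instead."""
    os.unlink(os.path.join(wal_dir, old_name))
    with open(os.path.join(wal_dir, new_name), "wb") as f:
        f.write(b"\0" * SEGMENT_SIZE)
        f.flush()
        os.fsync(f.fileno())


def prepare_next_segment(wal_dir, old_name, new_name, recycle=True):
    """Hypothetical tunable: recycle=False skips reuse of old segments."""
    if recycle:
        recycle_segment(wal_dir, old_name, new_name)
    else:
        create_segment(wal_dir, old_name, new_name)
    return os.path.join(wal_dir, new_name)
```

With `recycle=True` the next segment is the old file under a new name, old data included; with `recycle=False` the old file is gone and the next segment starts out zero-filled.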

To summarize, recycling old WAL files seems like an optimization designed for certain filesystems that allocate disk blocks up front. Given that the existing behavior is already filesystem specific, are there specific reasons why we can't provide a tunable to disable this behavior for filesystems which don't behave that way?

Thanks again,
Jerry


On Tue, Jun 26, 2018 at 7:35 AM, Jerry Jelinek <jerry.jelinek@joyent.com> wrote:
Hello All,

Attached is a patch to provide an option to disable WAL recycling. We have found that this can help performance by eliminating read-modify-write behavior on old WAL files that are no longer resident in the filesystem cache. There is a lot more detail on the background of the motivation for this in the following thread.


A similar change has been tested against our 9.6 branch that we're currently running, but the attached patch is against master.

Thanks,
Jerry

