Re: patch to allow disable of WAL recycling - Mailing list pgsql-hackers

From Tom Lane
Subject Re: patch to allow disable of WAL recycling
Date
Msg-id 19196.1531750368@sss.pgh.pa.us
Whole thread Raw
In response to Re: patch to allow disable of WAL recycling  (Andres Freund <andres@anarazel.de>)
Responses Re: patch to allow disable of WAL recycling  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Andres Freund <andres@anarazel.de> writes:
> On 2018-07-15 20:55:38 -0400, Tom Lane wrote:
>> That's not the way to think about it.  On a COW file system, we don't
>> want to "create 16MB files" at all --- we should just fill WAL files
>> on-the-fly, because the pre-fill activity isn't actually serving the
>> intended purpose of reserving disk space.  It's just completely useless
>> overhead :-(.  So we can't really make a direct comparison between the
>> two approaches; there's no good way to net out the cost of constructing
>> the WAL data we need to write.

> We probably should still allocate them in 16MB segments. We rely on the
> size being fixed in a number of places.

Reasonable point.  I was supposing that it'd be okay if a partially
written segment were shorter than 16MB, but you're right that that
would require vetting a lot of code to be sure about it.

> But it's probably worthwhile to
> just do a posix_fadvise or such. Also, if we continually increase the
> size with each write we end up doing a lot more metadata transactions,
> which'll essentially serve to increase journalling overhead further.

Hm.  What you're claiming is that on these FSen, extending a file involves
more/different metadata activity than allocating new space for a COW
overwrite of an existing area within the file.  Is that really true?
The former case would be far more common in typical usage, and somehow
I doubt the FS authors would have been too stupid to optimize things so
that the same journal entry can record both the space allocation and the
logical-EOF change.

But anyway, this means we have two nearly independent issues to
investigate: whether recycling/renaming old files is cheaper than
constantly creating and deleting them, and whether to use physical
file zeroing versus some "just set the EOF please" filesystem call
when first creating a file.  The former does seem like it's purely
a performance question, but the latter involves a tradeoff of
performance against an ENOSPC-panic protection feature that in
reality only works on some filesystems.

            regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Usage of epoch in txid_current
Next
From: Robert Haas
Date:
Subject: Re: cursors with prepared statements