Hi,
On 2023-07-11 09:09:43 +0200, Jakub Wartak wrote:
> On Mon, Jul 10, 2023 at 6:24 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2023-07-03 11:53:56 +0200, Jakub Wartak wrote:
> > > Out of curiosity I've tried and it is reproducible as you have stated : XFS
> > > @ 4.18.0-425.10.1.el8_7.x86_64:
> > >...
> > > According to iostat and blktrace -d /dev/sda -o - | blkparse -i - output ,
> > > the XFS issues sync writes while ext4 does not, xfs looks like constant
> > > loop of sync writes (D) by kworker/2:1H-kblockd:
> >
> > That clearly won't go well. It's not reproducible on newer systems,
> > unfortunately :(. Or well, fortunately maybe.
> >
> >
> > I wonder if a trick to avoid this could be to memorialize the fact that we
> > bulk extended before and extend by that much going forward? That'd avoid the
> > swapping back and forth.
>
> I haven't seen this thread [1] "Question on slow fallocate", from XFS
> mailing list being mentioned here (it was started by Masahiko), but I
> do feel it contains very important hints even challenging the whole
> idea of zeroing out files (or posix_fallocate()). Please especially
> see Dave's reply.
I think that's just due to the reproducer being a bit too minimal and the
actual problem being addressed not being explained.
> He also argues that posix_fallocate() != fallocate(). What's interesting is
> that it's by design and newer kernel versions should not prevent such
> behaviour, see my testing result below.
I think the problem there was that I was not targetting a different file
between the different runs, somehow assuming the test program would be taking
care of that.
I don't think the test program actually tests things in a particularly useful
way - it does fallocate()s in 8k chunks - which postgres never does.
> All I can add is that this those kernel versions (4.18.0) seem to very
> popular across customers (RHEL, Rocky) right now and that I've tested
> on most recent available one (4.18.0-477.15.1.el8_8.x86_64) using
> Masahiko test.c and still got 6-7x slower time when using XFS on that
> kernel. After installing kernel-ml (6.4.2) the test.c result seems to
> be the same (note it it occurs only when 1st allocating space, but of
> course it doesnt if the same file is rewritten/"reallocated"):
test.c really doesn't reproduce postgres behaviour in any meaningful way,
using it as a benchmark is a bad idea.
Greetings,
Andres Freund