Re: FileFallocate misbehaving on XFS - Mailing list pgsql-hackers

From Andres Freund
Subject Re: FileFallocate misbehaving on XFS
Msg-id ys2ygnyp7paydnd5qzusogzooxmq6cjczxept3mobg4hwztxv6@hblfznjt4fbk
In response to Re: FileFallocate misbehaving on XFS  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: FileFallocate misbehaving on XFS
List pgsql-hackers
Hi,

On 2024-12-10 12:36:40 -0500, Robert Haas wrote:
> On Mon, Dec 9, 2024 at 7:31 PM Andres Freund <andres@anarazel.de> wrote:
> > Pretty unexcited about all of these - XFS is fairly widely used for PG, but
> > this problem doesn't seem very common. It seems to me that we're missing
> > something that causes this to only happen in a small subset of cases.
>
> I wonder if this is actually pretty common on XFS. I mean, we've
> already hit this with at least one EDB customer, and Michael's report
> is, as far as I know, independent of that; and he points to a
> pgsql-general thread which, AFAIK, is also independent. We don't get
> three (or more?) independent reports of that many bugs, so I think
> it's not crazy to think that the problem is actually pretty common.

Maybe. I think we would have gotten a lot more reports if it were common. I
know of quite a few very busy installs using XFS.

I think there must be some as-yet-unknown condition gating it. E.g. that the
filesystem was created a while ago and has some now-on-by-default options
disabled.


> > I think the source of this needs to be debugged further before we try to apply
> > workarounds in postgres.
>
> Why? It seems to me that this has to be a filesystem bug,

Adding workarounds for half-understood problems tends to lead to code that we
can't evolve in the future, as we a) don't understand and b) can't reproduce
the problem.

Workarounds could also mask some bigger / worse issues.  We e.g. have blamed
ext4 for a bunch of bugs that then turned out to be ours in the past. But we
didn't look for a long time, because it was convenient to just blame ext4.


> and we should almost certainly adopt one of these ideas from Michael Harris:
>
>  - Providing a way to configure PG not to use posix_fallocate at runtime

I'm not strongly opposed to that. That's testable without access to an
affected system.  I wouldn't want to automatically do that when detecting an
affected system though; that'll make behaviour way less predictable.


>  - In the case of posix_fallocate failing with ENOSPC, fall back to
> FileZero (worst case that will fail as well, in which case we will
> know that we really are out of space)

I doubt that that's a good idea. What if fallocate failing is an indicator of
a problem? What if you turn on AIO + DIO and suddenly get a much more
fragmented file?

Greetings,

Andres Freund


