Re: FileFallocate misbehaving on XFS - Mailing list pgsql-hackers

From Andres Freund
Subject Re: FileFallocate misbehaving on XFS
Msg-id jq6lozj36wseov4tbg5ziduvy7bfj7r3oxmbyifi6yn24dmsyp@4cj5oivz22mj
In response to Re: FileFallocate misbehaving on XFS  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
Hi,

On 2024-12-19 17:47:13 +1100, Michael Harris wrote:
> I finally managed to get the patched version installed in a production
> database where the error is occurring very regularly.

Thanks!


> Here is a sample of the output:
> 
> 2024-12-19 01:08:50 CET [2533222]:  LOG:  mdzeroextend FileFallocate
> failing with ENOSPC: free space for filesystem containing
> "pg_tblspc/107724/PG_16_202307071/465960/2591590762.15" f_blocks:
> 2683831808, f_bfree: 205006167, f_bavail: 205006167 f_files:
> 1073741376, f_ffree: 1069933796

That's ~700 GB of free space...
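For reference, a rough sketch of how the free-space figure can be derived from the statvfs fields in that log line. The log does not print the block size, so the 4096-byte value below is an assumption, and the helper names are made up for illustration. Under that assumption the result is about 782 GiB, the same order of magnitude as the estimate above, and under 8% of the filesystem.

```python
# Fields copied from the statvfs output quoted in the log line above.
F_BLOCKS = 2_683_831_808   # total blocks in the filesystem
F_BAVAIL = 205_006_167     # blocks available to unprivileged users

# ASSUMPTION: the log line does not include the block size (f_frsize);
# 4096 bytes is a guess, not something the log confirms.
ASSUMED_BLOCK_SIZE = 4096


def free_bytes(free_blocks: int, block_size: int = ASSUMED_BLOCK_SIZE) -> int:
    """Convert a statvfs block count into bytes (hypothetical helper)."""
    return free_blocks * block_size


def free_fraction(free_blocks: int, total_blocks: int) -> float:
    """Fraction of the filesystem that is still free (hypothetical helper)."""
    return free_blocks / total_blocks


if __name__ == "__main__":
    gib = free_bytes(F_BAVAIL) / 2**30
    pct = free_fraction(F_BAVAIL, F_BLOCKS)
    print(f"{gib:.0f} GiB free, {pct:.1%} of the filesystem")
```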

It'd be interesting to see filefrag -v for that segment.


> This is a different system to those I previously provided logs from.
> It is also running RHEL8 with a similar configuration to the other
> system.

Given it's a RHEL system, have you raised this as an issue with RH? They
probably have somebody with actual XFS hacking experience on staff.

RH's kernels are *heavily* patched, so it's possible the issue is actually RH
specific.


> I have so far not installed the bpftrace that Jakub suggested before -
> as I say this is a production machine and I am wary of triggering a
> kernel panic or worse (even though it seems like the risk for that
> would be low?). While a kernel stack trace would no doubt be helpful
> to the XFS developers, from a postgres point of view, would that be
> likely to help us decide what to do about this?

Well, I'm personally wary of installing workarounds for a problem I don't
understand and can't reproduce, which might be specific to old filesystems
and/or heavily patched kernels.  This clearly is an FS bug.

That said, if we learn that somehow this is a fundamental XFS issue that can
be triggered on every XFS filesystem, with current kernels, it becomes more
reasonable to implement a workaround in PG.


Another thing I've been wondering about is whether we could reduce how often
we hit this by rounding the number of blocks we extend by up to the next power
of two. That would probably reduce fragmentation, and the extra space would be
used quickly anyway in workloads where we extend by a bunch of blocks at once.
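A minimal sketch of the rounding idea, in Python rather than PostgreSQL's C. The function name is hypothetical and this is not the actual mdzeroextend code; it only illustrates what "round the extension size up to a power of two" would compute.

```python
def round_up_to_power_of_two(nblocks: int) -> int:
    """Return the smallest power of two that is >= nblocks.

    Hypothetical helper for illustration; nblocks is the number of
    blocks a relation would be extended by in one call.
    """
    if nblocks < 1:
        raise ValueError("nblocks must be at least 1")
    # (nblocks - 1).bit_length() is the number of bits needed to represent
    # nblocks - 1, so shifting 1 by that amount gives the next power of two
    # (or nblocks itself when it already is one).
    return 1 << (nblocks - 1).bit_length()


if __name__ == "__main__":
    for requested in (1, 3, 8, 17, 65):
        print(requested, "->", round_up_to_power_of_two(requested))
```

With this rounding, a request for 17 blocks would extend by 32, so at most just under twice the requested space is allocated in any single extension.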

Greetings,

Andres Freund


