Re: FileFallocate misbehaving on XFS - Mailing list pgsql-hackers

From	Andres Freund
Subject	Re: FileFallocate misbehaving on XFS
Date	December 9, 2024 19:15:46
Msg-id	qhy5z65zhfui5b7vmwkqclbu7aksdvdkohxnb3bgzflvrnhugv@vy3pyzwpm3uv
In response to	Re: FileFallocate misbehaving on XFS (Tomas Vondra <tomas@vondra.me>)
List	pgsql-hackers

Hi,

On 2024-12-09 15:47:55 +0100, Tomas Vondra wrote:
> On 12/9/24 11:27, Jakub Wartak wrote:
> > On Mon, Dec 9, 2024 at 10:19 AM Michael Harris <harmic@gmail.com> wrote:
> > 
> > Hi Michael,
> > 
> >     We found this thread describing similar issues:
> > 
> >     https://www.postgresql.org/message-id/flat/AS1PR05MB91059AC8B525910A5FCD6E699F9A2%40AS1PR05MB9105.eurprd05.prod.outlook.com
> > 
> > 
> > We've got some case in the past here in EDB, where an OS vendor has
> > blamed XFS AG fragmentation (too many AGs, and if one AG is not having
> > enough space -> error). Could You perhaps show us output of on that LUN:
> > 1. xfs_info
> > 2. run that script from https://www.suse.com/support/kb/doc/?id=000018219
> > for Your AG range
> > 
> 
> But this can be reproduced on a brand new filesystem - I just tried
> creating a 1GB image, create XFS on it, mount it, and fallocate a 600MB
> file twice. Which that fails, and there can't be any real fragmentation.
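
A rough sketch of that reproduction, for anyone who wants to script it. It assumes posix_fallocate() (via Python's os.posix_fallocate) is a fair stand-in for whatever was used in the test above, and the helper name fallocate_twice is made up for illustration. Creating and mounting the 1GB XFS image is left out since it needs root.

```python
import os


def fallocate_twice(path, size):
    """Call posix_fallocate() twice over the same range [0, size).

    Returns a list with one entry per call: 0 on success, otherwise
    the errno reported by the failed call (e.g. errno.ENOSPC).
    """
    results = []
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        for _ in range(2):
            try:
                # Same offset and length both times, so the second call
                # covers a range that is already fully allocated.
                os.posix_fallocate(fd, 0, size)
                results.append(0)
            except OSError as exc:
                results.append(exc.errno)
    finally:
        os.close(fd)
    return results
```

Pointed at a file on the small XFS mount with size = 600 * 1024 * 1024, the second entry would be expected to be ENOSPC if the behaviour described above holds; on a filesystem with plenty of free space both entries should be 0.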

If I understand correctly, XFS checks whether there is enough free space for
the fallocate() to succeed before even looking at the file's current layout.
Here's an explanation of why:
https://www.spinics.net/lists/linux-xfs/msg55429.html

  The real problem with preallocation failing part way through due to
  overcommit of space is that we can't go back an undo the
  allocation(s) made by fallocate because when we get ENOSPC we have
  lost all the state of the previous allocations made. If fallocate is
  filling holes between unwritten extents already in the file, then we
  have no way of knowing where the holes we filled were and hence
  cannot reliably free the space we've allocated before ENOSPC was
  hit.

I.e. reserving space as you go would leave you open to ending up with some,
but not all, of those allocations having been made. Whereas pre-reserving the
worst case space needed, ahead of time, ensures that you have enough space to
go through it all.
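
To put numbers on that, here is a toy model using the figures from Tomas's test. It is only arithmetic under simplifying assumptions (no metadata overhead, the whole requested range is reserved up front), not XFS code, and both function names are invented for illustration.

```python
MB = 1024 * 1024


def worst_case_reservation_ok(free_bytes, range_len):
    """Up-front check: assume every byte of the range needs new space."""
    return range_len <= free_bytes


def space_actually_needed(range_len, already_allocated):
    """What a layout-aware pass would need: only the holes in the range."""
    return max(range_len - already_allocated, 0)


fs_size = 1024 * MB      # 1GB image, ignoring filesystem metadata
file_size = 600 * MB     # size of each fallocate() request

# First call: file is empty, 600MB fits into ~1024MB free.
first_ok = worst_case_reservation_ok(fs_size, file_size)

# Second call over the same range: only ~424MB is free, so the
# worst-case reservation fails although no new space is required.
free_after_first = fs_size - file_size
second_ok = worst_case_reservation_ok(free_after_first, file_size)
needed_second = space_actually_needed(file_size, file_size)
```

In this model first_ok is True, second_ok is False, and needed_second is 0, which matches the ENOSPC on a brand new filesystem without any fragmentation being involved.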

You can't just go through the file [range], compute how much free space you
will need to allocate, and then do a second pass through the file, because the
file layout might have changed concurrently...

This issue seems independent of the issue Michael is having though. Postgres,
afaik, won't fallocate huge ranges with already allocated space.

Greetings,

Andres Freund
