RE: Error:could not extend file " with FileFallocate(): No space left on device - Mailing list pgsql-general

From Pecsök Ján
Subject RE: Error:could not extend file " with FileFallocate(): No space left on device
Date
Msg-id AS1PR05MB9105E3395555A70408D027309F642@AS1PR05MB9105.eurprd05.prod.outlook.com
In response to Re: Error:could not extend file " with FileFallocate(): No space left on device  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: Error:could not extend file " with FileFallocate(): No space left on device
List pgsql-general
The link you provided mentions that data is not being compressed on a PostgreSQL 16 server. Does that mean
PostgreSQL 16 uses much more space while computing queries?
If so, that could be our problem: our queries sometimes use several TB of disk space for computation,
and if disk usage during queries has increased considerably, 27TB may sometimes not be enough.

I have two questions:

Is there any workaround so that Postgres won't use FileFallocate? Maybe something can be set in Linux to prevent
Postgres from using it?
The change was introduced in Postgres 16; does that mean Postgres 15.8 should have the old behaviour?

We don't use COPY in our queries.

-----Original Message-----
From: Thomas Munro <thomas.munro@gmail.com> 
Sent: Wednesday, September 11, 2024 11:37 PM
To: Alvaro Herrera <alvherre@alvh.no-ip.org>
Cc: Pecsök Ján <jan.pecsok@profinit.eu>; pgsql-general@lists.postgresql.org; Andres Freund <andres@anarazel.de>
Subject: Re: Error:could not extend file " with FileFallocate(): No space left on device

On Thu, Sep 12, 2024 at 12:39 AM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
> On 2024-Sep-11, Pecsök Ján wrote:
> > In our case:
> > Kernel: Linux version 4.18.0-513.18.1.el8_9.ppc64le 
> > (mockbuild@ppc-hv-13.build.eng.rdu2.redhat.com) (gcc version 8.5.0 
> > 20210514 (Red Hat 8.5.0-20) (GCC)) #1 SMP Thu Feb 1 02:52:53 EST 2024
> > File system type: xfs
>
> Can you please share the output of xfs_info for the filesystem(s) used?
>
> Apparently, it's possible for allocation groups to be suboptimally 
> laid out in a way that leads to ENOSPC with space still available.

Hmm, I have no clues about that, though I do remember reports of spurious ENOSPC errors from xfs many years ago on some
other database I was around, maybe in the era of that kernel or a bit older.

Actually I was already wondering if we need to add a tunable to control the heuristic that redirects to
posix_fallocate():


https://www.postgresql.org/message-id/flat/CAMazQQfp%2B3f8tD_Q23rCR%3DO%2BJj4jouSRVigbD8OmrTOfHV%2B8gA%40mail.gmail.com

There's no confirmation that writing zeros would be a useful workaround here, though.  Two things changed in 16: the
fallocate() path was invented, but also we started extending by more than one block at a time, which might take the
pwritev() path or the fallocate() path, for bulk insertion via COPY.  That btrfs user would prefer pwritev() always
IIRC, but if some version of xfs is allergic to this pattern I don't know if it's the size or the system call that's
triggering it...
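
For readers unfamiliar with the two extension strategies mentioned above, here is a rough sketch of the difference. This is not PostgreSQL's actual code: the function names, the `FALLOCATE_THRESHOLD` cutoff and the `MAX_IOV` batch size are assumptions made up for illustration. Only `os.posix_fallocate()` and `os.pwritev()` are real calls, and both need a Unix-like system.

```python
import os

BLOCK_SIZE = 8192          # PostgreSQL's default block size
FALLOCATE_THRESHOLD = 8    # hypothetical cutoff between the two paths
MAX_IOV = 16               # hypothetical cap on buffers per pwritev() call


def extend_with_fallocate(fd: int, offset: int, nblocks: int) -> None:
    """Reserve space without writing data; raises OSError (e.g. ENOSPC) on failure."""
    os.posix_fallocate(fd, offset, nblocks * BLOCK_SIZE)


def extend_with_zeros(fd: int, offset: int, nblocks: int) -> None:
    """Extend the file by writing zero-filled blocks with vectored writes."""
    zero_block = bytes(BLOCK_SIZE)
    end = offset + nblocks * BLOCK_SIZE
    pos = offset
    while pos < end:
        # Build at most MAX_IOV buffers, with a short tail after a partial write.
        chunk = min(end - pos, MAX_IOV * BLOCK_SIZE)
        full, tail = divmod(chunk, BLOCK_SIZE)
        buffers = [zero_block] * full
        if tail:
            buffers.append(zero_block[:tail])
        pos += os.pwritev(fd, buffers, pos)


def extend_file(fd: int, nblocks: int) -> str:
    """Append nblocks to the file and report which path was taken."""
    offset = os.fstat(fd).st_size
    if nblocks > FALLOCATE_THRESHOLD:
        extend_with_fallocate(fd, offset, nblocks)
        return "fallocate"
    extend_with_zeros(fd, offset, nblocks)
    return "zeros"
```

With a sketch like this, a small extension goes through the zero-writing path and a large one through the reservation path, so an ENOSPC reported by the file system can surface from either system call depending on the request size.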
 

Is COPY used here?

And just for curiosity (I don't see any particular connection or what to do about it either way in the short term), are
we talking about really big tables with lots of 1GB files named N.1, N.2, N.3, ..., or millions of smaller tables?  I
kinda wonder if xfs (and any file system really) would really prefer us to use large files instead (patches exist for
this), and when many-terabyte clusters start working with huge numbers of segments, we reach fun new kinds of internal
resource exhaustion, or something like that....
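
One way to answer the segment question is to count the files per relation in a database directory. The sketch below is only an illustration: `count_segments` is a made-up helper, and it assumes it is pointed at a directory such as `base/<database oid>` under the data directory, where main-fork files are named N, N.1, N.2, and so on. Other files (for example N_fsm or N_vm) are ignored.

```python
import os
import re
from collections import Counter

# Matches main-fork relation files: "16384", "16384.1", "16384.2", ...
SEGMENT_RE = re.compile(r"^(\d+)(?:\.(\d+))?$")


def count_segments(db_dir: str) -> Counter:
    """Return a Counter mapping relfilenode -> number of segment files."""
    counts = Counter()
    for name in os.listdir(db_dir):
        match = SEGMENT_RE.match(name)
        if match:
            counts[match.group(1)] += 1
    return counts


def summarize(counts: Counter, top: int = 5) -> str:
    """Describe how many relations exist and which have the most segments."""
    lines = [f"{len(counts)} relations, {sum(counts.values())} segment files"]
    for relfilenode, n in counts.most_common(top):
        lines.append(f"  {relfilenode}: {n} segment(s)")
    return "\n".join(lines)
```

A few relations with thousands of segments each would point to the "really big tables" case; a very large number of relations with one segment each would point to the "millions of smaller tables" case.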
 

. o O { I particularly dislike our habit of synthesising fake ENOSPC errors in a few code paths... grumble }
