Re: EINTR in ftruncate() - Mailing list pgsql-hackers

From Andres Freund
Subject Re: EINTR in ftruncate()
Date
Msg-id 20220711174518.yldckniicknsxgzl@awork3.anarazel.de
Whole thread Raw
In response to Re: EINTR in ftruncate()  (Thomas Munro <thomas.munro@gmail.com>)
Responses Re: EINTR in ftruncate()
List pgsql-hackers
Hi,

On 2022-07-07 17:58:10 +1200, Thomas Munro wrote:
> Even if we go with this approach now, I think it's plausible that we
> might want to reconsider this yet again one day, perhaps allocating
> memory with some future asynchronous infrastructure while still
> processing interrupts.  It's not very nice to hold up recovery or
> ProcSignalBarrier for long operations.

I think the next improvement would be to do the fallocate in smaller chunks,
and accept interrupts inbetween.


> I'm a little unclear about ftruncate() here.  I don't expect it to
> report EINTR in other places where we use it (ie to make a local file
> on a non-"slow device" smaller), because I expect that to be like
> read(), write() etc which we don't wrap in EINTR loops.  Here you've
> observed EINTR when messing around with a debugger*.  It seems
> inconsistent to put posix_fallocate() in an EINTR retry loop for the
> benefit of debuggers, but not ftruncate().  But perhaps that's good
> enough, on the theory that posix_fallocate(1GB) is a very large target
> and you have a decent chance of hitting it.

> *It's funny that ftruncate() apparently doesn't automatically restart
> for ptrace SIGCONT on Linux according to your report, while poll()
> does according to my experiments, even though the latter is one of the
> never-restart functions (it doesn't on other OSes I hack on, and you
> feel the difference when debugging missing wakeup type bugs...).
> Random implementation details...

My test was basically while (true); {if (!ftruncate()) bleat(); if
(!fallocate()) bleat();} with a SIGALRM triggering regularly in the
background. The ftruncate failed, rarely, when attaching / detaching strace
-p. Rarely enough that I had already written you an IM saying that I couldn't
make it fail...  So it's hard to be confident this can't otherwise be
hit. With that caveat: I didn't hit it with a "real" file on a "real"
filesystem in a few minutes of trying. But unsurprisingly it triggers when
putting the file on a tmpfs.


> @@ -362,6 +355,14 @@ dsm_impl_posix_resize(int fd, off_t size)
>  {
>      int            rc;
>  
> +    /*
> +     * Block all blockable signals, except SIGQUIT.  posix_fallocate() can run
> +     * for quite a long time, and is an all-or-nothing operation.  If we
> +     * allowed SIGUSR1 to interrupt us repeatedly (for example, due to recovery
> +     * conflicts), the retry loop might never succeed.
> +     */
> +    PG_SETMASK(&BlockSig);
> +
>      /* Truncate (or extend) the file to the requested size. */
>      rc = ftruncate(fd, size);
>

Hm - given that we've observed ftruncate failing with strace, and that
stracing to find problems isn't insane, shouldn't we retry the ftruncate too?
It's kind of obsoleted by your next patch, but not really, because it's not
unconceivable that other OSs behave similarly? And IIUC you're planning to not
backpatch 0002?


> +    pgstat_report_wait_start(WAIT_EVENT_DSM_FILL_ZERO_WRITE);

(not new in this patch, just moved around) - FILL_ZERO_WRITE is imo an odd
description of what's happening... In my understanding this isn't doing any
writing, just allocating. But whatever...


Greetings,

Andres Freund



pgsql-hackers by date:

Previous
From: Justin Pryzby
Date:
Subject: Re: First draft of the PG 15 release notes
Next
From: Robert Haas
Date:
Subject: Re: DropRelFileLocatorBuffers